The TMC-ETR supports routing CoreSight trace data to system memory, which can be used in two different modes:

1) Contiguous memory - the memory is assumed to be physically contiguous.

2) Scatter-Gather list - the memory can be chunks of 4K pages, specified in a table of pointers, which itself could span multiple 4K pages.

To hide the complications of managing the buffer, this series adds a layer for managing the ETR buffer, which makes the best possible choice based on what is available. The allocation can be tuned by passing in flags, existing pages (e.g., the perf ring buffer), etc.; a rough sketch of the idea follows at the end of this introduction.

Towards supporting the ETR Scatter-Gather mode, we introduce a generic TMC scatter-gather table which can be used to manage the data and table pages. The table can be filled in the format expected by the Scatter-Gather mode.

The TMC ETR-SG mechanism doesn't allow starting the trace at a non-zero offset (required by perf). So we make some tricky changes to the table at run time to allow starting at any page-aligned offset and then wrapping around to the beginning of the buffer, with very little overhead. See the patches for more details.

The series also improves the way the ETR is controlled by different modes (sysfs vs. perf) by keeping mode-specific data. This allows access to the trace data collected in sysfs mode even when the ETR is operated in perf mode. Also, with the transparent management of the buffer and the scatter-gather mechanism, we can allow the user to request larger trace buffers for sysfs mode. This is supported by providing a sysfs file, "buffer_size", which accepts a page-aligned size that the ETR will use when allocating a buffer.

Finally, it cleans up the etm perf sink callbacks a little and then adds support for the ETR sink. For the ETR, we try our best to use the perf ring buffer as the target hardware buffer, provided:

1) The ETR is DMA coherent (since the pages will be shared with the userspace perf tool).
2) perf is used in snapshot mode (the ETR cannot be stopped based on the size of the data written, hence we could easily overwrite the buffer; we may be able to fix this in the future).
3) The ETR supports the Scatter-Gather mode.

If we can't use the perf buffers directly, we fall back to software buffering, where we have to copy the trace data back to the perf ring buffer.
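For illustration, the shape of such a buffer-management layer might look like the sketch below (the function names, the etr_buf structure and the fallback logic here are hypothetical, not the exact interface added by the series):

/*
 * Hypothetical sketch: prefer the ETR Scatter-Gather mode when the
 * hardware supports it or when caller-supplied pages (e.g. the perf
 * ring buffer) must be mapped; otherwise fall back to a physically
 * contiguous (flat) DMA buffer.
 */
static struct etr_buf *etr_buf_alloc(struct tmc_drvdata *drvdata,
				     ssize_t size, int node, void **pages)
{
	struct etr_buf *etr_buf = ERR_PTR(-ENOMEM);

	/* TMC_ETR_SG is assumed to be an advertised ETR capability */
	if (tmc_etr_has_cap(drvdata, TMC_ETR_SG))
		etr_buf = etr_alloc_sg_buf(drvdata, size, node, pages);
	/* A flat buffer cannot reuse caller-supplied pages */
	if (IS_ERR(etr_buf) && !pages)
		etr_buf = etr_alloc_flat_buf(drvdata, size, node);
	return etr_buf;
}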
Suzuki K Poulose (17):
  coresight etr: Disallow perf mode temporarily
  coresight tmc: Hide trace buffer handling for file read
  coresight: Add helper for inserting synchronization packets
  coresight: Add generic TMC sg table framework
  coresight: Add support for TMC ETR SG unit
  coresight: tmc: Make ETR SG table circular
  coresight: tmc etr: Add transparent buffer management
  coresight: tmc: Add configuration support for trace buffer size
  coresight: Convert driver messages to dev_dbg
  coresight: etr: Track if the device is coherent
  coresight etr: Handle driver mode specific ETR buffers
  coresight etr: Relax collection of trace from sysfs mode
  coresight etr: Do not clean ETR trace buffer
  coresight: etr: Add support for save restore buffers
  coresight: etr_buf: Add helper for padding an area of trace data
  coresight: perf: Remove reset_buffer call back for sinks
  coresight perf: Add ETR backend support for etm-perf
 .../ABI/testing/sysfs-bus-coresight-devices-tmc     |    8 +
 .../coresight/coresight-dynamic-replicator.c        |    4 +-
 drivers/hwtracing/coresight/coresight-etb10.c       |   72 +-
 drivers/hwtracing/coresight/coresight-etm-perf.c    |    9 +-
 drivers/hwtracing/coresight/coresight-etm3x.c       |    4 +-
 drivers/hwtracing/coresight/coresight-etm4x.c       |    4 +-
 drivers/hwtracing/coresight/coresight-funnel.c      |    4 +-
 drivers/hwtracing/coresight/coresight-priv.h        |    8 +
 drivers/hwtracing/coresight/coresight-replicator.c  |    4 +-
 drivers/hwtracing/coresight/coresight-stm.c         |    4 +-
 drivers/hwtracing/coresight/coresight-tmc-etf.c     |  109 +-
 drivers/hwtracing/coresight/coresight-tmc-etr.c     | 1665 ++++++++++++++++++--
 drivers/hwtracing/coresight/coresight-tmc.c         |   75 +-
 drivers/hwtracing/coresight/coresight-tmc.h         |  128 +-
 drivers/hwtracing/coresight/coresight-tpiu.c        |    4 +-
 include/linux/coresight.h                           |    5 +-
 16 files changed, 1837 insertions(+), 270 deletions(-)
We don't support ETR in perf mode yet. Temporarily fail the operation until we add proper support.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 28 ++-----------------------
 1 file changed, 2 insertions(+), 26 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 68fbc8f7450e..d0208f01afd9 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -192,32 +192,8 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 
 static int tmc_enable_etr_sink_perf(struct coresight_device *csdev)
 {
-	int ret = 0;
-	unsigned long flags;
-	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
-
-	spin_lock_irqsave(&drvdata->spinlock, flags);
-	if (drvdata->reading) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	/*
-	 * In Perf mode there can be only one writer per sink. There
-	 * is also no need to continue if the ETR is already operated
-	 * from sysFS.
-	 */
-	if (drvdata->mode != CS_MODE_DISABLED) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	drvdata->mode = CS_MODE_PERF;
-	tmc_etr_enable_hw(drvdata);
-out:
-	spin_unlock_irqrestore(&drvdata->spinlock, flags);
-
-	return ret;
+	/* We don't support perf mode yet ! */
+	return -EINVAL;
 }
static int tmc_enable_etr_sink(struct coresight_device *csdev, u32 mode)
At the moment we adjust the buffer pointers for reading the trace data via the misc device in the common code for ETF/ETB and ETR. Since we are going to change how we manage the buffer for the ETR, let us move the buffer manipulation to the respective driver files, hiding it from the common code. We do so by adding type-specific helpers that find the length of the available data and the pointer to the buffer for a given file position.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etf.c | 16 ++++++++++++
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 33 ++++++++++++++++++++++++
 drivers/hwtracing/coresight/coresight-tmc.c     | 34 ++++++++++++++-----------
 drivers/hwtracing/coresight/coresight-tmc.h     |  4 +++
 4 files changed, 72 insertions(+), 15 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index e2513b786242..0b6f1eb746de 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -120,6 +120,22 @@ static void tmc_etf_disable_hw(struct tmc_drvdata *drvdata)
 	CS_LOCK(drvdata->base);
 }
 
+/*
+ * Return the available trace data in the buffer from @pos, with
+ * a maximum limit of @len, updating the @bufpp on where to
+ * find it.
+ */
+ssize_t tmc_etb_get_sysfs_trace(struct tmc_drvdata *drvdata,
+				loff_t pos, size_t len, char **bufpp)
+{
+	/* Adjust the len to available size @pos */
+	if (pos + len > drvdata->len)
+		len = drvdata->len - pos;
+	if (len > 0)
+		*bufpp = drvdata->buf + pos;
+	return len;
+}
+
 static int tmc_enable_etf_sink_sysfs(struct coresight_device *csdev)
 {
 	int ret = 0;
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index d0208f01afd9..063f253f1c99 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -69,6 +69,39 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 	CS_LOCK(drvdata->base);
 }
 
+/*
+ * Return the available trace data in the buffer @pos, with a maximum
+ * limit of @len, also updating the @bufpp on where to find it.
+ */
+ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
+				loff_t pos, size_t len, char **bufpp)
+{
+	char *bufp = drvdata->buf + pos;
+	char *bufend = (char *)(drvdata->vaddr + drvdata->size);
+
+	/* Adjust the len to available size @pos */
+	if (pos + len > drvdata->len)
+		len = drvdata->len - pos;
+
+	if (len <= 0)
+		return len;
+
+	/*
+	 * Since we use a circular buffer, with trace data starting
+	 * @drvdata->buf, possibly anywhere in the buffer @drvdata->vaddr,
+	 * wrap the current @pos to within the buffer.
+	 */
+	if (bufp >= bufend)
+		bufp -= drvdata->size;
+	/*
+	 * For simplicity, avoid copying over a wrapped around buffer.
+	 */
+	if ((bufp + len) > bufend)
+		len = bufend - bufp;
+	*bufpp = bufp;
+	return len;
+}
+
 static void tmc_etr_dump_hw(struct tmc_drvdata *drvdata)
 {
 	const u32 *barrier;
diff --git a/drivers/hwtracing/coresight/coresight-tmc.c b/drivers/hwtracing/coresight/coresight-tmc.c
index 2ff4a66a3caa..c7201e40d737 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.c
+++ b/drivers/hwtracing/coresight/coresight-tmc.c
@@ -131,24 +131,29 @@ static int tmc_open(struct inode *inode, struct file *file)
 	return 0;
 }
 
+static inline ssize_t tmc_get_sysfs_trace(struct tmc_drvdata *drvdata,
+					  loff_t pos, size_t len, char **bufpp)
+{
+	switch (drvdata->config_type) {
+	case TMC_CONFIG_TYPE_ETB:
+	case TMC_CONFIG_TYPE_ETF:
+		return tmc_etb_get_sysfs_trace(drvdata, pos, len, bufpp);
+	case TMC_CONFIG_TYPE_ETR:
+		return tmc_etr_get_sysfs_trace(drvdata, pos, len, bufpp);
+	}
+
+	return -EINVAL;
+}
+
 static ssize_t tmc_read(struct file *file, char __user *data, size_t len,
 			loff_t *ppos)
 {
+	char *bufp;
 	struct tmc_drvdata *drvdata = container_of(file->private_data,
 						   struct tmc_drvdata, miscdev);
-	char *bufp = drvdata->buf + *ppos;
-
-	if (*ppos + len > drvdata->len)
-		len = drvdata->len - *ppos;
-
-	if (drvdata->config_type == TMC_CONFIG_TYPE_ETR) {
-		if (bufp == (char *)(drvdata->vaddr + drvdata->size))
-			bufp = drvdata->vaddr;
-		else if (bufp > (char *)(drvdata->vaddr + drvdata->size))
-			bufp -= drvdata->size;
-		if ((bufp + len) > (char *)(drvdata->vaddr + drvdata->size))
-			len = (char *)(drvdata->vaddr + drvdata->size) - bufp;
-	}
+	len = tmc_get_sysfs_trace(drvdata, *ppos, len, &bufp);
+	if (len <= 0)
+		return 0;
 
 	if (copy_to_user(data, bufp, len)) {
 		dev_dbg(drvdata->dev, "%s: copy_to_user failed\n", __func__);
@@ -156,9 +161,8 @@ static ssize_t tmc_read(struct file *file, char __user *data, size_t len,
 	}
 
 	*ppos += len;
+	dev_dbg(drvdata->dev, "%zu bytes copied\n", len);
 
-	dev_dbg(drvdata->dev, "%s: %zu bytes copied, %d bytes left\n",
-		__func__, len, (int)(drvdata->len - *ppos));
 	return len;
 }
 
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 8df7a813f537..6deb3afe9db8 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -183,10 +183,14 @@ int tmc_read_unprepare_etb(struct tmc_drvdata *drvdata);
 extern const struct coresight_ops tmc_etb_cs_ops;
 extern const struct coresight_ops tmc_etf_cs_ops;
 
+ssize_t tmc_etb_get_sysfs_trace(struct tmc_drvdata *drvdata,
+				loff_t pos, size_t len, char **bufpp);
+
 /* ETR functions */
 int tmc_read_prepare_etr(struct tmc_drvdata *drvdata);
 int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata);
 extern const struct coresight_ops tmc_etr_cs_ops;
+ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
+				loff_t pos, size_t len, char **bufpp);
#define TMC_REG_PAIR(name, lo_off, hi_off) \
Hi Suzuki,
On 19/10/17 18:15, Suzuki K Poulose wrote:
> At the moment we adjust the buffer pointers for reading the trace data
> via misc device in the common code for ETF/ETB and ETR. Since we are
> going to change how we manage the buffer for ETR, let us move the
> buffer manipulation to the respective driver files, hiding it from
> the common code.

[...]

> +ssize_t tmc_etb_get_sysfs_trace(struct tmc_drvdata *drvdata,
> +				loff_t pos, size_t len, char **bufpp)
> +{
> +	/* Adjust the len to available size @pos */
> +	if (pos + len > drvdata->len)
> +		len = drvdata->len - pos;
> +	if (len > 0)
> +		*bufpp = drvdata->buf + pos;
> +	return len;
> +}

Do we have some guarantee that "pos <= drvdata->len"? Because, since len
is unsigned, this check only covers the case where len is 0.

Maybe it would be better to use a signed variable to store the result of
the difference.

[...]

> +ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
> +				loff_t pos, size_t len, char **bufpp)
> +{
> +	char *bufp = drvdata->buf + pos;
> +	char *bufend = (char *)(drvdata->vaddr + drvdata->size);
> +
> +	/* Adjust the len to available size @pos */
> +	if (pos + len > drvdata->len)
> +		len = drvdata->len - pos;
> +
> +	if (len <= 0)
> +		return len;
Similar issue here.
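(For illustration, a minimal sketch of the signed-length check being suggested here; this is hypothetical and not part of the posted series:)

/*
 * Compute the remaining bytes in a signed type, so that
 * pos > drvdata->len yields "no data" instead of a huge
 * unsigned length. The same pattern applies to the ETR variant.
 */
ssize_t tmc_etb_get_sysfs_trace(struct tmc_drvdata *drvdata,
				loff_t pos, size_t len, char **bufpp)
{
	ssize_t actual = len;

	/* Clamp to the data available at @pos; may become negative */
	if (pos + actual > drvdata->len)
		actual = drvdata->len - pos;
	if (actual <= 0)
		return 0;

	*bufpp = drvdata->buf + pos;
	return actual;
}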
Cheers,
On 20/10/17 13:34, Julien Thierry wrote:
> Hi Suzuki,
>
> On 19/10/17 18:15, Suzuki K Poulose wrote:
>> At the moment we adjust the buffer pointers for reading the trace data
>> via misc device in the common code for ETF/ETB and ETR.
>
> [...]
>
>> +ssize_t tmc_etb_get_sysfs_trace(struct tmc_drvdata *drvdata,
>> +				loff_t pos, size_t len, char **bufpp)
>> +{
>> +	/* Adjust the len to available size @pos */
>> +	if (pos + len > drvdata->len)
>> +		len = drvdata->len - pos;
>> +	if (len > 0)
>
> Do we have some guarantee that "pos <= drvdata->len"? Because, since len
> is unsigned, this check only covers the case where len is 0.
>
> Maybe it would be better to use a signed variable to store the result
> of the difference.

[...]

> Similar issue here.
Thanks for spotting. I will fix it.
Cheers
Right now we open code filling the trace buffer with synchronization packets when the circular buffer wraps around in different drivers. Move this to a common place.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-etb10.c   | 10 +++------
 drivers/hwtracing/coresight/coresight-priv.h    |  8 ++++++++
 drivers/hwtracing/coresight/coresight-tmc-etf.c | 27 ++++++++-----------------
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 13 +-----------
 4 files changed, 20 insertions(+), 38 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
index 56ecd7aff5eb..d7164ab8e229 100644
--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -203,7 +203,6 @@ static void etb_dump_hw(struct etb_drvdata *drvdata)
 	bool lost = false;
 	int i;
 	u8 *buf_ptr;
-	const u32 *barrier;
 	u32 read_data, depth;
 	u32 read_ptr, write_ptr;
 	u32 frame_off, frame_endoff;
@@ -234,19 +233,16 @@ static void etb_dump_hw(struct etb_drvdata *drvdata)
 
 	depth = drvdata->buffer_depth;
 	buf_ptr = drvdata->buf;
-	barrier = barrier_pkt;
 	for (i = 0; i < depth; i++) {
 		read_data = readl_relaxed(drvdata->base +
 					  ETB_RAM_READ_DATA_REG);
-		if (lost && *barrier) {
-			read_data = *barrier;
-			barrier++;
-		}
-
 		*(u32 *)buf_ptr = read_data;
 		buf_ptr += 4;
 	}
 
+	if (lost)
+		coresight_insert_barrier_packet(drvdata->buf);
+
 	if (frame_off) {
 		buf_ptr -= (frame_endoff * 4);
 		for (i = 0; i < frame_endoff; i++) {
diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
index f1d0e21d8cab..d12f64928c00 100644
--- a/drivers/hwtracing/coresight/coresight-priv.h
+++ b/drivers/hwtracing/coresight/coresight-priv.h
@@ -65,6 +65,7 @@ static DEVICE_ATTR_RO(name)
 	__coresight_simple_func(type, NULL, name, lo_off, hi_off)
 
 extern const u32 barrier_pkt[5];
+#define CORESIGHT_BARRIER_PKT_SIZE (sizeof(barrier_pkt) - sizeof(u32))
 
 enum etm_addr_type {
 	ETM_ADDR_TYPE_NONE,
@@ -98,6 +99,13 @@ struct cs_buffers {
 	void **data_pages;
 };
 
+static inline void coresight_insert_barrier_packet(void *buf)
+{
+	if (buf)
+		memcpy(buf, barrier_pkt, CORESIGHT_BARRIER_PKT_SIZE);
+}
+
+
 static inline void CS_LOCK(void __iomem *addr)
 {
 	do {
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 0b6f1eb746de..d89bfb3042a2 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -43,39 +43,28 @@ static void tmc_etb_enable_hw(struct tmc_drvdata *drvdata)
 
 static void tmc_etb_dump_hw(struct tmc_drvdata *drvdata)
 {
-	bool lost = false;
 	char *bufp;
-	const u32 *barrier;
-	u32 read_data, status;
+	u32 read_data, lost;
 	int i;
 
-	/*
-	 * Get a hold of the status register and see if a wrap around
-	 * has occurred.
-	 */
-	status = readl_relaxed(drvdata->base + TMC_STS);
-	if (status & TMC_STS_FULL)
-		lost = true;
-
+	/* Check if the buffer was wrapped around. */
+	lost = readl_relaxed(drvdata->base + TMC_STS) & TMC_STS_FULL;
 	bufp = drvdata->buf;
 	drvdata->len = 0;
-	barrier = barrier_pkt;
 	while (1) {
 		for (i = 0; i < drvdata->memwidth; i++) {
 			read_data = readl_relaxed(drvdata->base + TMC_RRD);
 			if (read_data == 0xFFFFFFFF)
-				return;
-
-			if (lost && *barrier) {
-				read_data = *barrier;
-				barrier++;
-			}
-
+				goto done;
 			memcpy(bufp, &read_data, 4);
 			bufp += 4;
 			drvdata->len += 4;
 		}
 	}
+done:
+	if (lost)
+		coresight_insert_barrier_packet(drvdata->buf);
+	return;
 }
 
 static void tmc_etb_disable_hw(struct tmc_drvdata *drvdata)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 063f253f1c99..41535fa6b6cf 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -104,9 +104,7 @@ ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
 
 static void tmc_etr_dump_hw(struct tmc_drvdata *drvdata)
 {
-	const u32 *barrier;
 	u32 val;
-	u32 *temp;
 	u64 rwp;
 
 	rwp = tmc_read_rwp(drvdata);
@@ -119,16 +117,7 @@ static void tmc_etr_dump_hw(struct tmc_drvdata *drvdata)
 	if (val & TMC_STS_FULL) {
 		drvdata->buf = drvdata->vaddr + rwp - drvdata->paddr;
 		drvdata->len = drvdata->size;
-
-		barrier = barrier_pkt;
-		temp = (u32 *)drvdata->buf;
-
-		while (*barrier) {
-			*temp = *barrier;
-			temp++;
-			barrier++;
-		}
-
+		coresight_insert_barrier_packet(drvdata->buf);
 	} else {
 		drvdata->buf = drvdata->vaddr;
 		drvdata->len = rwp - drvdata->paddr;
On Thu, Oct 19, 2017 at 06:15:39PM +0100, Suzuki K Poulose wrote:
> Right now we open code filling the trace buffer with synchronization
> packets when the circular buffer wraps around in different drivers.
> Move this to a common place.

[...]

> diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
> index f1d0e21d8cab..d12f64928c00 100644
> --- a/drivers/hwtracing/coresight/coresight-priv.h
> +++ b/drivers/hwtracing/coresight/coresight-priv.h
> @@ -65,6 +65,7 @@ static DEVICE_ATTR_RO(name)
>  	__coresight_simple_func(type, NULL, name, lo_off, hi_off)
>  
>  extern const u32 barrier_pkt[5];
> +#define CORESIGHT_BARRIER_PKT_SIZE (sizeof(barrier_pkt) - sizeof(u32))
When using a memcpy() there is no need to have a 0x0 at the end of the barrier_pkt array. As such, I suggest you remove that and simply use sizeof() in coresight_insert_barrier_packet().
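(For illustration, a sketch of the suggested cleanup; hypothetical, not part of the posted series:)

/* Drop the 0x0 terminator, so the array holds exactly one barrier packet */
extern const u32 barrier_pkt[4];

static inline void coresight_insert_barrier_packet(void *buf)
{
	if (buf)
		/* sizeof() now covers the whole packet; no "- sizeof(u32)" */
		memcpy(buf, barrier_pkt, sizeof(barrier_pkt));
}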
I'll review the rest of your patches tomorrow.
On 30/10/17 21:44, Mathieu Poirier wrote:
> On Thu, Oct 19, 2017 at 06:15:39PM +0100, Suzuki K Poulose wrote:
>> Right now we open code filling the trace buffer with synchronization
>> packets when the circular buffer wraps around in different drivers.
>> Move this to a common place.
>
> [...]
>
>>  extern const u32 barrier_pkt[5];
>> +#define CORESIGHT_BARRIER_PKT_SIZE (sizeof(barrier_pkt) - sizeof(u32))
>
> When using a memcpy() there is no need to have a 0x0 at the end of the
> barrier_pkt array. As such, I suggest you remove that and simply use
> sizeof() in coresight_insert_barrier_packet().
There is one place where we can't simply do a memcpy(): in tmc_update_etf_buffer(), where we could potentially move over to the next PAGE while filling the barrier packets. This is why I didn't trim it off. However, I believe this shouldn't trigger, as the trace data should always be aligned to the frame size of the TMC and the perf buffer size is page aligned. So we should be able to use memcpy() in that case too. I will fix it in the next version.
Thanks Suzuki
This patch introduces a generic sg table data structure and associated operations. An SG table can be used to map a set of data pages where the trace data could be stored by the TMC ETR. The information about the data pages could be stored in different formats, depending on the type of the underlying SG mechanism (e.g., TMC ETR SG vs. CoreSight CATU). The generic structure provides bookkeeping of the pages used for the data as well as the table contents. The table should be filled by the user of the infrastructure.

A table can be created by specifying the number of data pages as well as the number of table pages required to hold the pointers, where the latter could be different for different types of tables. The pages are mapped in the appropriate DMA data direction (i.e., DMA_TO_DEVICE for table pages and DMA_FROM_DEVICE for data pages). The framework can optionally accept a set of allocated data pages (e.g., the perf ring buffer) and map them accordingly. The table and data pages are vmap'ed to allow easier access by the drivers. The framework also provides helpers to sync the data written to the pages with the appropriate directions.
This will be later used by the TMC ETR SG unit.
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 289 +++++++++++++++++++++++-
 drivers/hwtracing/coresight/coresight-tmc.h     |  44 ++++
 2 files changed, 332 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 41535fa6b6cf..4b9e2b276122 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -16,10 +16,297 @@
 */
 
 #include <linux/coresight.h>
-#include <linux/dma-mapping.h>
+#include <linux/slab.h>
 #include "coresight-priv.h"
 #include "coresight-tmc.h"
 
+/*
+ * tmc_pages_get_offset: Go through all the pages in the tmc_pages
+ * and map @phys_addr to an offset within the buffer.
+ */
+static long
+tmc_pages_get_offset(struct tmc_pages *tmc_pages, dma_addr_t addr)
+{
+	int i;
+	dma_addr_t page_start;
+
+	for (i = 0; i < tmc_pages->nr_pages; i++) {
+		page_start = tmc_pages->daddrs[i];
+		if (addr >= page_start && addr < (page_start + PAGE_SIZE))
+			return i * PAGE_SIZE + (addr - page_start);
+	}
+
+	return -EINVAL;
+}
+
+/*
+ * tmc_pages_free : Unmap and free the pages used by tmc_pages.
+ */
+static void tmc_pages_free(struct tmc_pages *tmc_pages,
+			   struct device *dev, enum dma_data_direction dir)
+{
+	int i;
+
+	for (i = 0; i < tmc_pages->nr_pages; i++) {
+		if (tmc_pages->daddrs && tmc_pages->daddrs[i])
+			dma_unmap_page(dev, tmc_pages->daddrs[i],
+				       PAGE_SIZE, dir);
+		if (tmc_pages->pages && tmc_pages->pages[i])
+			__free_page(tmc_pages->pages[i]);
+	}
+
+	kfree(tmc_pages->pages);
+	kfree(tmc_pages->daddrs);
+	tmc_pages->pages = NULL;
+	tmc_pages->daddrs = NULL;
+	tmc_pages->nr_pages = 0;
+}
+
+/*
+ * tmc_pages_alloc : Allocate and map pages for a given @tmc_pages.
+ * If @pages is not NULL, the list of page virtual addresses are
+ * used as the data pages. The pages are then dma_map'ed for @dev
+ * with dma_direction @dir.
+ *
+ * Returns 0 upon success, else the error number.
+ */
+static int tmc_pages_alloc(struct tmc_pages *tmc_pages,
+			   struct device *dev, int node,
+			   enum dma_data_direction dir, void **pages)
+{
+	int i, nr_pages;
+	dma_addr_t paddr;
+	struct page *page;
+
+	nr_pages = tmc_pages->nr_pages;
+	tmc_pages->daddrs = kcalloc(nr_pages, sizeof(*tmc_pages->daddrs),
+				    GFP_KERNEL);
+	if (!tmc_pages->daddrs)
+		return -ENOMEM;
+	tmc_pages->pages = kcalloc(nr_pages, sizeof(*tmc_pages->pages),
+				   GFP_KERNEL);
+	if (!tmc_pages->pages) {
+		kfree(tmc_pages->daddrs);
+		tmc_pages->daddrs = NULL;
+		return -ENOMEM;
+	}
+
+	for (i = 0; i < nr_pages; i++) {
+		if (pages && pages[i]) {
+			page = virt_to_page(pages[i]);
+			get_page(page);
+		} else {
+			page = alloc_pages_node(node,
+						GFP_KERNEL | __GFP_ZERO, 0);
+		}
+		paddr = dma_map_page(dev, page, 0, PAGE_SIZE, dir);
+		if (dma_mapping_error(dev, paddr))
+			goto err;
+		tmc_pages->daddrs[i] = paddr;
+		tmc_pages->pages[i] = page;
+	}
+	return 0;
+err:
+	tmc_pages_free(tmc_pages, dev, dir);
+	return -ENOMEM;
+}
+
+static inline dma_addr_t tmc_sg_table_base_paddr(struct tmc_sg_table *sg_table)
+{
+	if (WARN_ON(!sg_table->data_pages.pages[0]))
+		return 0;
+	return sg_table->table_daddr;
+}
+
+static inline void *tmc_sg_table_base_vaddr(struct tmc_sg_table *sg_table)
+{
+	if (WARN_ON(!sg_table->data_pages.pages[0]))
+		return NULL;
+	return sg_table->table_vaddr;
+}
+
+static inline void *
+tmc_sg_table_data_vaddr(struct tmc_sg_table *sg_table)
+{
+	if (WARN_ON(!sg_table->data_pages.nr_pages))
+		return 0;
+	return sg_table->data_vaddr;
+}
+
+static inline unsigned long
+tmc_sg_table_buf_size(struct tmc_sg_table *sg_table)
+{
+	return sg_table->data_pages.nr_pages << PAGE_SHIFT;
+}
+
+static inline long
+tmc_sg_get_data_page_offset(struct tmc_sg_table *sg_table, dma_addr_t addr)
+{
+	return tmc_pages_get_offset(&sg_table->data_pages, addr);
+}
+
+static inline void tmc_free_table_pages(struct tmc_sg_table *sg_table)
+{
+	if (sg_table->table_vaddr)
+		vunmap(sg_table->table_vaddr);
+	tmc_pages_free(&sg_table->table_pages, sg_table->dev, DMA_TO_DEVICE);
+}
+
+static void tmc_free_data_pages(struct tmc_sg_table *sg_table)
+{
+	if (sg_table->data_vaddr)
+		vunmap(sg_table->data_vaddr);
+	tmc_pages_free(&sg_table->data_pages, sg_table->dev, DMA_FROM_DEVICE);
+}
+
+void tmc_free_sg_table(struct tmc_sg_table *sg_table)
+{
+	tmc_free_table_pages(sg_table);
+	tmc_free_data_pages(sg_table);
+}
+
+/*
+ * Alloc pages for the table. Since this will be used by the device,
+ * allocate the pages closer to the device (i.e, dev_to_node(dev)
+ * rather than the CPU nod).
+ */
+static int tmc_alloc_table_pages(struct tmc_sg_table *sg_table)
+{
+	int rc;
+	struct tmc_pages *table_pages = &sg_table->table_pages;
+
+	rc = tmc_pages_alloc(table_pages, sg_table->dev,
+			     dev_to_node(sg_table->dev),
+			     DMA_TO_DEVICE, NULL);
+	if (rc)
+		return rc;
+	sg_table->table_vaddr = vmap(table_pages->pages,
+				     table_pages->nr_pages,
+				     VM_MAP,
+				     PAGE_KERNEL);
+	if (!sg_table->table_vaddr)
+		rc = -ENOMEM;
+	else
+		sg_table->table_daddr = table_pages->daddrs[0];
+	return rc;
+}
+
+static int tmc_alloc_data_pages(struct tmc_sg_table *sg_table, void **pages)
+{
+	int rc;
+
+	rc = tmc_pages_alloc(&sg_table->data_pages,
+			     sg_table->dev, sg_table->node,
+			     DMA_FROM_DEVICE, pages);
+	if (!rc) {
+		sg_table->data_vaddr = vmap(sg_table->data_pages.pages,
+					    sg_table->data_pages.nr_pages,
+					    VM_MAP,
+					    PAGE_KERNEL);
+		if (!sg_table->data_vaddr)
+			rc = -ENOMEM;
+	}
+	return rc;
+}
+
+/*
+ * tmc_alloc_sg_table: Allocate and setup dma pages for the TMC SG table
+ * and data buffers. TMC writes to the data buffers and reads from the SG
+ * Table pages.
+ *
+ * @dev		- Device to which page should be DMA mapped.
+ * @node	- Numa node for mem allocations
+ * @nr_tpages	- Number of pages for the table entries.
+ * @nr_dpages	- Number of pages for Data buffer.
+ * @pages	- Optional list of virtual address of pages.
+ */
+struct tmc_sg_table *tmc_alloc_sg_table(struct device *dev,
+					int node,
+					int nr_tpages,
+					int nr_dpages,
+					void **pages)
+{
+	long rc;
+	struct tmc_sg_table *sg_table;
+
+	sg_table = kzalloc(sizeof(*sg_table), GFP_KERNEL);
+	if (!sg_table)
+		return ERR_PTR(-ENOMEM);
+	sg_table->data_pages.nr_pages = nr_dpages;
+	sg_table->table_pages.nr_pages = nr_tpages;
+	sg_table->node = node;
+	sg_table->dev = dev;
+
+	rc = tmc_alloc_data_pages(sg_table, pages);
+	if (!rc)
+		rc = tmc_alloc_table_pages(sg_table);
+	if (rc) {
+		tmc_free_sg_table(sg_table);
+		kfree(sg_table);
+		return ERR_PTR(rc);
+	}
+
+	return sg_table;
+}
+
+/*
+ * tmc_sg_table_sync_data_range: Sync the data buffer written
+ * by the device from @offset upto a @size bytes.
+ */
+void tmc_sg_table_sync_data_range(struct tmc_sg_table *table,
+				  u64 offset, u64 size)
+{
+	int i, index, start;
+	int npages = DIV_ROUND_UP(size, PAGE_SIZE);
+	struct device *dev = table->dev;
+	struct tmc_pages *data = &table->data_pages;
+
+	start = offset >> PAGE_SHIFT;
+	for (i = start; i < (start + npages); i++) {
+		index = i % data->nr_pages;
+		dma_sync_single_for_cpu(dev, data->daddrs[index],
+					PAGE_SIZE, DMA_FROM_DEVICE);
+	}
+}
+
+/* tmc_sg_sync_table: Sync the page table */
+void tmc_sg_table_sync_table(struct tmc_sg_table *sg_table)
+{
+	int i;
+	struct device *dev = sg_table->dev;
+	struct tmc_pages *table_pages = &sg_table->table_pages;
+
+	for (i = 0; i < table_pages->nr_pages; i++)
+		dma_sync_single_for_device(dev, table_pages->daddrs[i],
+					   PAGE_SIZE, DMA_TO_DEVICE);
+}
+
+/*
+ * tmc_sg_table_get_data: Get the buffer pointer for data @offset
+ * in the SG buffer. The @bufpp is updated to point to the buffer.
+ * Returns :
+ *	the length of linear data available at @offset.
+ *	or
+ *	<= 0 if no data is available.
+ */
+ssize_t tmc_sg_table_get_data(struct tmc_sg_table *sg_table,
+			      u64 offset, size_t len, char **bufpp)
+{
+	size_t size;
+	int pg_idx = offset >> PAGE_SHIFT;
+	int pg_offset = offset & (PAGE_SIZE - 1);
+	struct tmc_pages *data_pages = &sg_table->data_pages;
+
+	size = tmc_sg_table_buf_size(sg_table);
+	if (offset >= size)
+		return -EINVAL;
+	len = (len < (size - offset)) ? len : size - offset;
+	len = (len < (PAGE_SIZE - pg_offset)) ? len : (PAGE_SIZE - pg_offset);
+	if (len > 0)
+		*bufpp = page_address(data_pages->pages[pg_idx]) + pg_offset;
+	return len;
+}
+
 static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 {
 	u32 axictl, sts;
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 6deb3afe9db8..5e49c035a1ac 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -19,6 +19,7 @@
 #define _CORESIGHT_TMC_H
 
 #include <linux/miscdevice.h>
+#include <linux/dma-mapping.h>
 
 #define TMC_RSZ			0x004
 #define TMC_STS			0x00c
@@ -171,6 +172,38 @@ struct tmc_drvdata {
 	u32 etr_caps;
 };
 
+/**
+ * struct tmc_pages - Collection of pages used for SG.
+ * @nr_pages:		Number of pages in the list.
+ * @daddr:		DMA'able page address returned by dma_map_page().
+ * @vaddr:		Virtual address returned by page_address().
+ */
+struct tmc_pages {
+	int nr_pages;
+	dma_addr_t *daddrs;
+	struct page **pages;
+};
+
+/*
+ * struct tmc_sg_table : Generic SG table for TMC
+ * @dev:		Device for DMA allocations
+ * @table_vaddr:	Contiguous Virtual address for PageTable
+ * @data_vaddr:		Contiguous Virtual address for Data Buffer
+ * @table_daddr:	DMA address of the PageTable base
+ * @node:		Node for Page allocations
+ * @table_pages:	List of pages & dma address for Table
+ * @data_pages:		List of pages & dma address for Data
+ */
+struct tmc_sg_table {
+	struct device *dev;
+	void *table_vaddr;
+	void *data_vaddr;
+	dma_addr_t table_daddr;
+	int node;
+	struct tmc_pages table_pages;
+	struct tmc_pages data_pages;
+};
+
 /* Generic functions */
 void tmc_wait_for_tmcready(struct tmc_drvdata *drvdata);
 void tmc_flush_and_stop(struct tmc_drvdata *drvdata);
@@ -226,4 +259,15 @@ static inline bool tmc_etr_has_cap(struct tmc_drvdata *drvdata, u32 cap)
 	return !!(drvdata->etr_caps & cap);
 }
 
+struct tmc_sg_table *tmc_alloc_sg_table(struct device *dev,
+					int node,
+					int nr_tpages,
+					int nr_dpages,
+					void **pages);
+void tmc_free_sg_table(struct tmc_sg_table *sg_table);
+void tmc_sg_table_sync_table(struct tmc_sg_table *sg_table);
+void tmc_sg_table_sync_data_range(struct tmc_sg_table *table,
+				  u64 offset, u64 size);
+ssize_t tmc_sg_table_get_data(struct tmc_sg_table *sg_table,
+			      u64 offset, size_t len, char **bufpp);
 #endif
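As a rough usage illustration of the interface added by this patch (a hedged sketch, not taken from the series; the buffer size, table-page count and the surrounding driver context are made up):

/*
 * Sketch: allocate a 1MB SG-backed buffer for @dev, let the device
 * fill it, then walk the data in linear chunks. With 4K pages, 256
 * data pages give 1MB; one table page is assumed to be enough for
 * the (format-specific) pointer entries.
 */
static void tmc_sg_table_demo(struct device *dev)
{
	struct tmc_sg_table *sg_table;
	char *buf;
	ssize_t len;
	u64 offset = 0;

	sg_table = tmc_alloc_sg_table(dev, dev_to_node(dev), 1, 256, NULL);
	if (IS_ERR(sg_table))
		return;

	/* ... fill the table in the device's format and run a session ... */

	/* Make the device-written data visible to the CPU */
	tmc_sg_table_sync_data_range(sg_table, 0, 256 * PAGE_SIZE);

	/* Consume the data, at most a page at a time */
	while ((len = tmc_sg_table_get_data(sg_table, offset,
					    PAGE_SIZE, &buf)) > 0) {
		/* process @len bytes at @buf */
		offset += len;
	}

	tmc_free_sg_table(sg_table);
	kfree(sg_table);	/* the struct itself is the caller's to free */
}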
On Thu, Oct 19, 2017 at 06:15:40PM +0100, Suzuki K Poulose wrote:
> This patch introduces a generic sg table data structure and associated
> operations. An SG table can be used to map a set of Data pages where
> the trace data could be stored by the TMC ETR.

[...]

> +/*
> + * tmc_pages_get_offset: Go through all the pages in the tmc_pages
> + * and map @phys_addr to an offset within the buffer.

Did you mean "... map @addr"? It might also be worth it to explicitly
mention that it maps a physical address to an offset in the contiguous
range.

> + */

[...]

> +static int tmc_alloc_data_pages(struct tmc_sg_table *sg_table, void **pages)
> +{
> +	int rc;
> +
> +	rc = tmc_pages_alloc(&sg_table->data_pages,
> +			     sg_table->dev, sg_table->node,

Am I missing something very subtle here or sg_table->node should be the
same as dev_to_node(sg_table->dev)? If the same, both
tmc_alloc_table_pages() and tmc_alloc_data_pages() should be using the
same construct. Otherwise please add a comment to justify the difference.

> +			     DMA_FROM_DEVICE, pages);

[...]

> +/**
> + * struct tmc_pages - Collection of pages used for SG.
> + * @nr_pages:	Number of pages in the list.
> + * @daddr:	DMA'able page address returned by dma_map_page().
> + * @vaddr:	Virtual address returned by page_address().

This isn't accurate.

> + */
> +struct tmc_pages {
> +	int nr_pages;
> +	dma_addr_t *daddrs;
> +	struct page **pages;
> +};
> +
> +/*
> + * struct tmc_sg_table : Generic SG table for TMC

Use a '-' as above or fix the above to be ':'. I don't mind which is
used as long as they are the same.

[...]

I like this implementation, much cleaner than what I previously had.
On 31/10/17 22:13, Mathieu Poirier wrote:
On Thu, Oct 19, 2017 at 06:15:40PM +0100, Suzuki K Poulose wrote:
This patch introduces a generic sg table data structure and associated operations. An SG table can be used to map a set of Data pages where the trace data could be stored by the TMC ETR. The information about the data pages could be stored in different formats, depending on the type of the underlying SG mechanism (e.g, TMC ETR SG vs Coresight CATU). The generic structure provides book keeping of the pages used for the data as well as the table contents. The table should be filled by the user of the infrastructure.
A table can be created by specifying the number of data pages as well as the number of table pages required to hold the pointers, where the latter could be different for different types of tables. The pages are mapped in the appropriate dma data direction mode (i.e, DMA_TO_DEVICE for table pages and DMA_FROM_DEVICE for data pages). The framework can optionally accept a set of allocated data pages (e.g, perf ring buffer) and map them accordingly. The table and data pages are vmap'ed to allow easier access by the drivers. The framework also provides helpers to sync the data written to the pages with appropriate directions.
This will be later used by the TMC ETR SG unit.
Cc: Mathieu Poirier matheiu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
drivers/hwtracing/coresight/coresight-tmc-etr.c | 289 +++++++++++++++++++++++- drivers/hwtracing/coresight/coresight-tmc.h | 44 ++++ 2 files changed, 332 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 41535fa6b6cf..4b9e2b276122 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -16,10 +16,297 @@ */ #include <linux/coresight.h> -#include <linux/dma-mapping.h> +#include <linux/slab.h> #include "coresight-priv.h" #include "coresight-tmc.h" +/*
- tmc_pages_get_offset: Go through all the pages in the tmc_pages
- and map @phys_addr to an offset within the buffer.
Did you mean "... map @addr"? It might also be worth it to explicitly mention that it maps a physical address to an offset in the contiguous range.
Yes, definitely. I will fix it.
...
+/*
- Alloc pages for the table. Since this will be used by the device,
- allocate the pages closer to the device (i.e, dev_to_node(dev)
- rather than the CPU nod).
- */
+static int tmc_alloc_table_pages(struct tmc_sg_table *sg_table) +{
- int rc;
- struct tmc_pages *table_pages = &sg_table->table_pages;
- rc = tmc_pages_alloc(table_pages, sg_table->dev,
dev_to_node(sg_table->dev),
DMA_TO_DEVICE, NULL);
- if (rc)
return rc;
- sg_table->table_vaddr = vmap(table_pages->pages,
table_pages->nr_pages,
VM_MAP,
PAGE_KERNEL);
- if (!sg_table->table_vaddr)
rc = -ENOMEM;
- else
sg_table->table_daddr = table_pages->daddrs[0];
- return rc;
+}
+static int tmc_alloc_data_pages(struct tmc_sg_table *sg_table, void **pages) +{
- int rc;
- rc = tmc_pages_alloc(&sg_table->data_pages,
sg_table->dev, sg_table->node,
Am I missing something very subtle here or sg_table->node should be the same as dev_to_node(sg_table->dev)? If the same both tmc_alloc_table_pages() and tmc_alloc_data_pages() should be using the same construct. Otherwise please add a comment to justify the difference.
Yes, it was a last minute change to switch the table to use dev_to_node(), while the data pages are allocated as requested by the user. Eventually the user would consume the data pages (even though the device produces it). However, the table pages are solely for the consumption of the device, hence the dev_to_node().
I will add a comment to make that explicit.
u32 axictl, sts; diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h index 6deb3afe9db8..5e49c035a1ac 100644 --- a/drivers/hwtracing/coresight/coresight-tmc.h +++ b/drivers/hwtracing/coresight/coresight-tmc.h @@ -19,6 +19,7 @@ #define _CORESIGHT_TMC_H #include <linux/miscdevice.h> +#include <linux/dma-mapping.h> #define TMC_RSZ 0x004 #define TMC_STS 0x00c @@ -171,6 +172,38 @@ struct tmc_drvdata { u32 etr_caps; }; +/**
+ * struct tmc_pages - Collection of pages used for SG.
+ * @nr_pages:	Number of pages in the list.
+ * @daddr:	DMA'able page address returned by dma_map_page().
+ * @vaddr:	Virtual address returned by page_address().
This isn't accurate.
Yes, I will clean that up. It kind of shows the number of revisions this series has changed before reaching here ;-)
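For reference, kerneldoc matching the actual fields would be something like (sketch):

 * @nr_pages:	Number of pages in the list.
 * @daddrs:	Array of DMA'able page addresses returned by dma_map_page().
 * @pages:	Array of struct page pointers for the buffer pages.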
+ */
+struct tmc_pages {
+	int nr_pages;
+	dma_addr_t *daddrs;
+	struct page **pages;
+};
+/*
+ * struct tmc_sg_table : Generic SG table for TMC
Use a '-' as above or fix the above to be ':'. I don't mind which is used as long as they are the same.
Ok.
+ * @dev:	Device for DMA allocations
+ * @table_vaddr: Contiguous Virtual address for PageTable
+ * @data_vaddr:	Contiguous Virtual address for Data Buffer
+ * @table_daddr: DMA address of the PageTable base
+ * @node:	Node for Page allocations
+ * @table_pages: List of pages & dma address for Table
+ * @data_pages:	List of pages & dma address for Data
+ */
+struct tmc_sg_table {
+	struct device *dev;
+	void *table_vaddr;
+	void *data_vaddr;
+	dma_addr_t table_daddr;
+	int node;
+	struct tmc_pages table_pages;
+	struct tmc_pages data_pages;
+};
 /* Generic functions */
 void tmc_wait_for_tmcready(struct tmc_drvdata *drvdata);
 void tmc_flush_and_stop(struct tmc_drvdata *drvdata);
@@ -226,4 +259,15 @@ static inline bool tmc_etr_has_cap(struct tmc_drvdata *drvdata, u32 cap)
 	return !!(drvdata->etr_caps & cap);
 }

+struct tmc_sg_table *tmc_alloc_sg_table(struct device *dev,
+					int node,
+					int nr_tpages,
+					int nr_dpages,
+					void **pages);
+void tmc_free_sg_table(struct tmc_sg_table *sg_table);
+void tmc_sg_table_sync_table(struct tmc_sg_table *sg_table);
+void tmc_sg_table_sync_data_range(struct tmc_sg_table *table,
+				  u64 offset, u64 size);
+ssize_t tmc_sg_table_get_data(struct tmc_sg_table *sg_table,
+			      u64 offset, size_t len, char **bufpp);
+
 #endif
I like this implementation, much cleaner than what I previously had.
Thanks for the review !
Cheers Suzuki
This patch adds support for setting up an SG table used by the TMC ETR inbuilt SG unit. The TMC ETR uses 4K-sized table pages to hold pointers to the 4K data pages; the last entry in a table points to the next table page, chaining the tables together. The 2 LSBs of an entry determine its type, one of:

Normal - Points to a 4KB data page.
Last - Points to a 4KB data page, but is the last entry in the page table.
Link - Points to another 4KB table page with pointers to data.

The code takes care of handling a system page size which could be different from 4K. So we could end up putting multiple ETR SG tables in a single system page, and, likewise, multiple ETR SG data buffers in a single system data page.
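To make the page-size handling concrete (illustrative numbers, assuming a 64KB system PAGE_SIZE): ETR_SG_PAGES_PER_SYSPAGE = 64KB / 4KB = 16, so a single system page supplies 16 ETR SG data buffers (or 16 table pages), while each 4KB table page holds ETR_SG_PTRS_PER_PAGE = 4096 / sizeof(u32) = 1024 entries, i.e. up to 1023 data pointers plus one Link to the next table page.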
Cc: Mathieu Poirier mathieu.poirier@linaro.org
Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 256 ++++++++++++++++++++++++
 1 file changed, 256 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 4b9e2b276122..4424eb67a54c 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -21,6 +21,89 @@
 #include "coresight-tmc.h"
 /*
+ * The TMC ETR SG has a page size of 4K. The SG table contains pointers
+ * to 4KB buffers. However, the OS may be use PAGE_SIZE different from
+ * 4K (i.e, 16KB or 64KB). This implies that a single OS page could
+ * contain more than one SG buffer and tables.
+ *
+ * A table entry has the following format:
+ *
+ * ---Bit31------------Bit4-------Bit1-----Bit0--
+ * |     Address[39:12]    | SBZ | Entry Type  |
+ * ----------------------------------------------
+ *
+ * Address: Bits [39:12] of a physical page address. Bits [11:0] are
+ *	    always zero.
+ *
+ * Entry type:
+ *	b00 - Reserved.
+ *	b01 - Last entry in the tables, points to 4K page buffer.
+ *	b10 - Normal entry, points to 4K page buffer.
+ *	b11 - Link. The address points to the base of next table.
+ */
+
+typedef u32 sgte_t;
+
+#define ETR_SG_PAGE_SHIFT		12
+#define ETR_SG_PAGE_SIZE		(1UL << ETR_SG_PAGE_SHIFT)
+#define ETR_SG_PAGES_PER_SYSPAGE	(1UL << \
+					 (PAGE_SHIFT - ETR_SG_PAGE_SHIFT))
+#define ETR_SG_PTRS_PER_PAGE		(ETR_SG_PAGE_SIZE / sizeof(sgte_t))
+#define ETR_SG_PTRS_PER_SYSPAGE		(PAGE_SIZE / sizeof(sgte_t))
+
+#define ETR_SG_ET_MASK			0x3
+#define ETR_SG_ET_LAST			0x1
+#define ETR_SG_ET_NORMAL		0x2
+#define ETR_SG_ET_LINK			0x3
+
+#define ETR_SG_ADDR_SHIFT		4
+
+#define ETR_SG_ENTRY(addr, type) \
+	(sgte_t)((((addr) >> ETR_SG_PAGE_SHIFT) << ETR_SG_ADDR_SHIFT) | \
+		 (type & ETR_SG_ET_MASK))
+
+#define ETR_SG_ADDR(entry) \
+	(((dma_addr_t)(entry) >> ETR_SG_ADDR_SHIFT) << ETR_SG_PAGE_SHIFT)
+#define ETR_SG_ET(entry)		((entry) & ETR_SG_ET_MASK)
+
+/*
+ * struct etr_sg_table : ETR SG Table
+ * @sg_table:	Generic SG Table holding the data/table pages.
+ * @hwaddr:	hwaddress used by the TMC, which is the base
+ *		address of the table.
+ */
+struct etr_sg_table {
+	struct tmc_sg_table *sg_table;
+	dma_addr_t hwaddr;
+};
+
+/*
+ * tmc_etr_sg_table_entries: Total number of table entries required to map
+ * @nr_pages system pages.
+ *
+ * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages.
+ * Each TMC page can map (ETR_SG_PTRS_PER_PAGE - 1) buffer pointers,
+ * with the last entry pointing to the page containing the table
+ * entries. If we spill over to a new page for mapping 1 entry,
+ * we could as well replace the link entry of the previous page
+ * with the last entry.
+ */
+static inline unsigned long __attribute_const__
+tmc_etr_sg_table_entries(int nr_pages)
+{
+	unsigned long nr_sgpages = nr_pages * ETR_SG_PAGES_PER_SYSPAGE;
+	unsigned long nr_sglinks = nr_sgpages / (ETR_SG_PTRS_PER_PAGE - 1);
+	/*
+	 * If we spill over to a new page for 1 entry, we could as well
+	 * make it the LAST entry in the previous page, skipping the Link
+	 * address.
+	 */
+	if (nr_sglinks && (nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1) < 2))
+		nr_sglinks--;
+	return nr_sgpages + nr_sglinks;
+}
+
+/*
  * tmc_pages_get_offset: Go through all the pages in the tmc_pages
  * and map @phys_addr to an offset within the buffer.
  */
@@ -307,6 +390,179 @@ ssize_t tmc_sg_table_get_data(struct tmc_sg_table *sg_table,
 	return len;
 }
+#ifdef ETR_SG_DEBUG
+/* Map a dma address to virtual address */
+static unsigned long
+tmc_sg_daddr_to_vaddr(struct tmc_sg_table *sg_table,
+		      dma_addr_t addr, bool table)
+{
+	long offset;
+	unsigned long base;
+	struct tmc_pages *tmc_pages;
+
+	if (table) {
+		tmc_pages = &sg_table->table_pages;
+		base = (unsigned long)sg_table->table_vaddr;
+	} else {
+		tmc_pages = &sg_table->data_pages;
+		base = (unsigned long)sg_table->data_vaddr;
+	}
+
+	offset = tmc_pages_get_offset(tmc_pages, addr);
+	if (offset < 0)
+		return 0;
+	return base + offset;
+}
+
+/* Dump the given sg_table */
+static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
+{
+	sgte_t *ptr;
+	int i = 0;
+	dma_addr_t addr;
+	struct tmc_sg_table *sg_table = etr_table->sg_table;
+
+	ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
+					      etr_table->hwaddr, true);
+	while (ptr) {
+		addr = ETR_SG_ADDR(*ptr);
+		switch (ETR_SG_ET(*ptr)) {
+		case ETR_SG_ET_NORMAL:
+			pr_debug("%05d: %p\t:[N] 0x%llx\n", i, ptr, addr);
+			ptr++;
+			break;
+		case ETR_SG_ET_LINK:
+			pr_debug("%05d: *** %p\t:{L} 0x%llx ***\n",
+				 i, ptr, addr);
+			ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
+							      addr, true);
+			break;
+		case ETR_SG_ET_LAST:
+			pr_debug("%05d: ### %p\t:[L] 0x%llx ###\n",
+				 i, ptr, addr);
+			return;
+		}
+		i++;
+	}
+	pr_debug("******* End of Table *****\n");
+}
+#endif
+
+/*
+ * Populate the SG Table page table entries from table/data
+ * pages allocated. Each Data page has ETR_SG_PAGES_PER_SYSPAGE SG pages.
+ * So does a Table page. So we keep track of indices of the tables
+ * in each system page and move the pointers accordingly.
+ */
+#define INC_IDX_ROUND(idx, size) (idx = (idx + 1) % size)
+static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
+{
+	dma_addr_t paddr;
+	int i, type, nr_entries;
+	int tpidx = 0;	/* index to the current system table_page */
+	int sgtidx = 0;	/* index to the sg_table within the current syspage */
+	int sgtoffset = 0; /* offset to the next entry within the sg_table */
+	int dpidx = 0;	/* index to the current system data_page */
+	int spidx = 0;	/* index to the SG page within the current data page */
+	sgte_t *ptr;	/* pointer to the table entry to fill */
+	struct tmc_sg_table *sg_table = etr_table->sg_table;
+	dma_addr_t *table_daddrs = sg_table->table_pages.daddrs;
+	dma_addr_t *data_daddrs = sg_table->data_pages.daddrs;
+
+	nr_entries = tmc_etr_sg_table_entries(sg_table->data_pages.nr_pages);
+	/*
+	 * Use the contiguous virtual address of the table to update entries.
+	 */
+	ptr = sg_table->table_vaddr;
+	/*
+	 * Fill all the entries, except the last entry to avoid special
+	 * checks within the loop.
+	 */
+	for (i = 0; i < nr_entries - 1; i++) {
+		if (sgtoffset == ETR_SG_PTRS_PER_PAGE - 1) {
+			/*
+			 * Last entry in a sg_table page is a link address to
+			 * the next table page. If this sg_table is the last
+			 * one in the system page, it links to the first
+			 * sg_table in the next system page. Otherwise, it
+			 * links to the next sg_table page within the system
+			 * page.
+			 */
+			if (sgtidx == ETR_SG_PAGES_PER_SYSPAGE - 1) {
+				paddr = table_daddrs[tpidx + 1];
+			} else {
+				paddr = table_daddrs[tpidx] +
+					(ETR_SG_PAGE_SIZE * (sgtidx + 1));
+			}
+			type = ETR_SG_ET_LINK;
+		} else {
+			/*
+			 * Update the idices to the data_pages to point to the
+			 * next sg_page in the data buffer.
+			 */
+			type = ETR_SG_ET_NORMAL;
+			paddr = data_daddrs[dpidx] + spidx * ETR_SG_PAGE_SIZE;
+			if (!INC_IDX_ROUND(spidx, ETR_SG_PAGES_PER_SYSPAGE))
+				dpidx++;
+		}
+		*ptr++ = ETR_SG_ENTRY(paddr, type);
+		/*
+		 * Move to the next table pointer, moving the table page index
+		 * if necessary
+		 */
+		if (!INC_IDX_ROUND(sgtoffset, ETR_SG_PTRS_PER_PAGE)) {
+			if (!INC_IDX_ROUND(sgtidx, ETR_SG_PAGES_PER_SYSPAGE))
+				tpidx++;
+		}
+	}
+
+	/* Set up the last entry, which is always a data pointer */
+	paddr = data_daddrs[dpidx] + spidx * ETR_SG_PAGE_SIZE;
+	*ptr++ = ETR_SG_ENTRY(paddr, ETR_SG_ET_LAST);
+}
+
+/*
+ * tmc_init_etr_sg_table: Allocate a TMC ETR SG table, data buffer of @size and
+ * populate the table.
+ *
+ * @dev		- Device pointer for the TMC
+ * @node	- NUMA node where the memory should be allocated
+ * @size	- Total size of the data buffer
+ * @pages	- Optional list of page virtual address
+ */
+static struct etr_sg_table __maybe_unused *
+tmc_init_etr_sg_table(struct device *dev, int node,
+		      unsigned long size, void **pages)
+{
+	int nr_entries, nr_tpages;
+	int nr_dpages = size >> PAGE_SHIFT;
+	struct tmc_sg_table *sg_table;
+	struct etr_sg_table *etr_table;
+
+	etr_table = kzalloc(sizeof(*etr_table), GFP_KERNEL);
+	if (!etr_table)
+		return ERR_PTR(-ENOMEM);
+	nr_entries = tmc_etr_sg_table_entries(nr_dpages);
+	nr_tpages = DIV_ROUND_UP(nr_entries, ETR_SG_PTRS_PER_SYSPAGE);
+
+	sg_table = tmc_alloc_sg_table(dev, node, nr_tpages, nr_dpages, pages);
+	if (IS_ERR(sg_table)) {
+		kfree(etr_table);
+		return ERR_PTR(PTR_ERR(sg_table));
+	}
+
+	etr_table->sg_table = sg_table;
+	/* TMC should use table base address for DBA */
+	etr_table->hwaddr = sg_table->table_daddr;
+	tmc_etr_sg_table_populate(etr_table);
+	/* Sync the table pages for the HW */
+	tmc_sg_table_sync_table(sg_table);
+#ifdef ETR_SG_DEBUG
+	tmc_etr_sg_table_dump(etr_table);
+#endif
+	return etr_table;
+}
+
 static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 {
 	u32 axictl, sts;
Hi Suzuki,
On 19/10/17 18:15, Suzuki K Poulose wrote:
This patch adds support for setting up an SG table used by the TMC ETR inbuilt SG unit. The TMC ETR uses 4K page sized tables to hold pointers to the 4K data pages with the last entry in a table pointing to the next table with the entries, by kind of chaining. The 2 LSBs determine the type of the table entry, to one of :
Normal - Points to a 4KB data page.
Last - Points to a 4KB data page, but is the last entry in the page table.
Link - Points to another 4KB table page with pointers to data.
The code takes care of handling the system page size which could be different than 4K. So we could end up putting multiple ETR SG tables in a single system page, vice versa for the data pages.
Cc: Mathieu Poirier mathieu.poirier@linaro.org
Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com

 drivers/hwtracing/coresight/coresight-tmc-etr.c | 256 ++++++++++++++++++++++++
 1 file changed, 256 insertions(+)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 4b9e2b276122..4424eb67a54c 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -21,6 +21,89 @@
 #include "coresight-tmc.h"

 /*
+ * The TMC ETR SG has a page size of 4K. The SG table contains pointers
+ * to 4KB buffers. However, the OS may be use PAGE_SIZE different from
nit: "the OS may use a PAGE_SIZE different from".
+ * 4K (i.e, 16KB or 64KB). This implies that a single OS page could
+ * contain more than one SG buffer and tables.
+ *
+ * A table entry has the following format:
+ *
+ * ---Bit31------------Bit4-------Bit1-----Bit0--
+ * |     Address[39:12]    | SBZ | Entry Type  |
+ * ----------------------------------------------
+ *
+ * Address: Bits [39:12] of a physical page address. Bits [11:0] are
+ *	    always zero.
+ *
+ * Entry type:
+ *	b00 - Reserved.
+ *	b01 - Last entry in the tables, points to 4K page buffer.
+ *	b10 - Normal entry, points to 4K page buffer.
+ *	b11 - Link. The address points to the base of next table.
+ */
+
+typedef u32 sgte_t;
+
+#define ETR_SG_PAGE_SHIFT		12
+#define ETR_SG_PAGE_SIZE		(1UL << ETR_SG_PAGE_SHIFT)
+#define ETR_SG_PAGES_PER_SYSPAGE	(1UL << \
+					 (PAGE_SHIFT - ETR_SG_PAGE_SHIFT))
I think this would be slightly easier to understand if defined as: "(PAGE_SIZE / ETR_SG_PAGE_SIZE)".
[...]
+/*
+ * tmc_etr_sg_table_entries: Total number of table entries required to map
+ * @nr_pages system pages.
+ *
+ * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages.
+ * Each TMC page can map (ETR_SG_PTRS_PER_PAGE - 1) buffer pointers,
+ * with the last entry pointing to the page containing the table
+ * entries. If we spill over to a new page for mapping 1 entry,
+ * we could as well replace the link entry of the previous page
+ * with the last entry.
+ */
+static inline unsigned long __attribute_const__
+tmc_etr_sg_table_entries(int nr_pages)
+{
+	unsigned long nr_sgpages = nr_pages * ETR_SG_PAGES_PER_SYSPAGE;
+	unsigned long nr_sglinks = nr_sgpages / (ETR_SG_PTRS_PER_PAGE - 1);
+	/*
+	 * If we spill over to a new page for 1 entry, we could as well
+	 * make it the LAST entry in the previous page, skipping the Link
+	 * address.
+	 */
+	if (nr_sglinks && (nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1) < 2))
+		nr_sglinks--;
+	return nr_sgpages + nr_sglinks;
+}
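As an illustrative sanity check of the formula (assuming 4K system pages, so ETR_SG_PAGES_PER_SYSPAGE == 1 and ETR_SG_PTRS_PER_PAGE == 1024): for nr_pages = 1024, nr_sgpages = 1024 and nr_sglinks = 1024 / 1023 = 1; since 1024 % 1023 == 1 (< 2), the link is dropped and the total is exactly 1024 entries: a single full table page whose final slot holds the LAST data pointer instead of a Link.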
[...]
+/* Dump the given sg_table */
+static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
+{
+	sgte_t *ptr;
+	int i = 0;
+	dma_addr_t addr;
+	struct tmc_sg_table *sg_table = etr_table->sg_table;
+
+	ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
+					      etr_table->hwaddr, true);
+	while (ptr) {
+		addr = ETR_SG_ADDR(*ptr);
+		switch (ETR_SG_ET(*ptr)) {
+		case ETR_SG_ET_NORMAL:
+			pr_debug("%05d: %p\t:[N] 0x%llx\n", i, ptr, addr);
+			ptr++;
+			break;
+		case ETR_SG_ET_LINK:
+			pr_debug("%05d: *** %p\t:{L} 0x%llx ***\n",
+				 i, ptr, addr);
+			ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
+							      addr, true);
+			break;
+		case ETR_SG_ET_LAST:
+			pr_debug("%05d: ### %p\t:[L] 0x%llx ###\n",
+				 i, ptr, addr);
+			return;
I get this is debug code, but it seems like if ETR_SG_ET(*ptr) is 0 we get stuck in an infinite loop. I guess it is something that supposedly doesn't happen, still I'd prefer having a default case saying the table might be corrupted and either incrementing ptr to try and get more info or breaking out of the loop.
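Something like the following default case would address that (a sketch of the suggested handling, not from the posted patch):

+		default:
+			pr_debug("%05d: %p\t:[?] Invalid entry 0x%llx, table corrupted?\n",
+				 i, ptr, addr);
+			return;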
+		}
+		i++;
+	}
+	pr_debug("******* End of Table *****\n");
+}
+#endif
+/*
+ * Populate the SG Table page table entries from table/data
+ * pages allocated. Each Data page has ETR_SG_PAGES_PER_SYSPAGE SG pages.
+ * So does a Table page. So we keep track of indices of the tables
+ * in each system page and move the pointers accordingly.
+ */
+#define INC_IDX_ROUND(idx, size) (idx = (idx + 1) % size)
Needs more parenthesis around idx and size.
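i.e., something like:

+#define INC_IDX_ROUND(idx, size) ((idx) = ((idx) + 1) % (size))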
+static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
+{
+	dma_addr_t paddr;
+	int i, type, nr_entries;
+	int tpidx = 0;	/* index to the current system table_page */
+	int sgtidx = 0;	/* index to the sg_table within the current syspage */
+	int sgtoffset = 0; /* offset to the next entry within the sg_table */
That's misleading, this seems to be the index of an entry within an ETR_SG_PAGE rather than an offset in bytes.
Maybe ptridx or entryidx would be a better name.
[...]
+		} else {
+			/*
+			 * Update the idices to the data_pages to point to the
nit: indices
Cheers,
[...]
+ * The TMC ETR SG has a page size of 4K. The SG table contains pointers
+ * to 4KB buffers. However, the OS may be use PAGE_SIZE different from
nit: "the OS may use a PAGE_SIZE different from".
+#define ETR_SG_PAGE_SHIFT		12
+#define ETR_SG_PAGE_SIZE		(1UL << ETR_SG_PAGE_SHIFT)
+#define ETR_SG_PAGES_PER_SYSPAGE	(1UL << \
+					 (PAGE_SHIFT - ETR_SG_PAGE_SHIFT))
I think this would be slightly easier to understand if defined as: "(PAGE_SIZE / ETR_SG_PAGE_SIZE)".
+/* Dump the given sg_table */
+static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
+{
[...]
+		case ETR_SG_ET_LAST:
+			pr_debug("%05d: ### %p\t:[L] 0x%llx ###\n",
+				 i, ptr, addr);
+			return;
I get this is debug code, but it seems like if ETR_SG_ET(*ptr) is 0 we get stuck in an infinite loop. I guess it is something that supposedly doesn't happen, still I'd prefer having a default case saying the table might be corrupted and either incrementing ptr to try and get more info or breaking out of the loop.
+		}
+		i++;
+	}
+	pr_debug("******* End of Table *****\n");
+}
+#endif
+/*
+ * Populate the SG Table page table entries from table/data
+ * pages allocated. Each Data page has ETR_SG_PAGES_PER_SYSPAGE SG pages.
+ * So does a Table page. So we keep track of indices of the tables
+ * in each system page and move the pointers accordingly.
+ */
+#define INC_IDX_ROUND(idx, size) (idx = (idx + 1) % size)
Needs more parenthesis around idx and size.
+static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
+{
+	dma_addr_t paddr;
+	int i, type, nr_entries;
+	int tpidx = 0;	/* index to the current system table_page */
+	int sgtidx = 0;	/* index to the sg_table within the current syspage */
+	int sgtoffset = 0; /* offset to the next entry within the sg_table */
That's misleading, this seems to be the index of an entry within an ETR_SG_PAGE rather than an offset in bytes.
Maybe ptridx or entryidx would be a better name.
You're right, I have chosen sgtentry for now.
Thanks for the detailed look, I will fix all of them.
Cheers Suzuki
On 19 October 2017 at 11:15, Suzuki K Poulose suzuki.poulose@arm.com wrote:
[...]
+ * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages.
+ * Each TMC page can map (ETR_SG_PTRS_PER_PAGE - 1) buffer pointers,
+ * with the last entry pointing to the page containing the table
... with the last entry pointing to another page of table entries. If we ...
+ * entries. If we spill over to a new page for mapping 1 entry,
+ * we could as well replace the link entry of the previous page
+ * with the last entry.
+ */
[...]
Make the ETR SG table a circular buffer so that we can start at any of the SG pages and use the entire buffer for tracing. This is achieved by:
1) Keeping an additional LINK pointer at the very end of the SG table, i.e, after the LAST buffer entry, pointing back to the beginning of the first table. This allows us to use the buffer normally when the trace starts at offset 0 of the buffer, as the LAST buffer entry hints the TMC-ETR to wrap automatically to offset 0.
2) If we want to start at any other ETR SG page aligned offset, we can (see the worked example below):
   a) Make the preceding page entry the LAST entry.
   b) Make the original LAST entry a normal entry.
   c) Use the table pointer to the "new" start offset as the base address of the table.
This works as the TMC doesn't mandate that the page table base address be 4K page aligned.
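For instance (illustrative numbers, not from the patch): with 4K ETR SG pages and 1024 pointers per table page (1023 data pointers plus one Link), rotating the buffer to start at offset 8MB means the new first data entry is for SG page 8MB / 4KB = 2048, which lives at table entry index 2048 + 2048/1023 = 2050. The entry for the preceding SG page (offset 8MB - 4KB, entry index 2049) becomes the new LAST entry, and the DBA is pointed at the address of entry 2050.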
Cc: Mathieu Poirier mathieu.poirier@linaro.org
Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 159 +++++++++++++++++++++---
 1 file changed, 139 insertions(+), 20 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 4424eb67a54c..c171b244e12a 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -71,36 +71,41 @@ typedef u32 sgte_t;
  * @sg_table:	Generic SG Table holding the data/table pages.
  * @hwaddr:	hwaddress used by the TMC, which is the base
  *		address of the table.
+ * @nr_entries:	Total number of pointers in the table.
+ * @first_entry: Index to the current "start" of the buffer.
+ * @last_entry:	Index to the last entry of the buffer.
  */
 struct etr_sg_table {
 	struct tmc_sg_table *sg_table;
 	dma_addr_t hwaddr;
+	u32 nr_entries;
+	u32 first_entry;
+	u32 last_entry;
 };

 /*
  * tmc_etr_sg_table_entries: Total number of table entries required to map
  * @nr_pages system pages.
  *
- * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages.
+ * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages and
+ * an additional Link pointer for making it a Circular buffer.
  * Each TMC page can map (ETR_SG_PTRS_PER_PAGE - 1) buffer pointers,
  * with the last entry pointing to the page containing the table
- * entries. If we spill over to a new page for mapping 1 entry,
- * we could as well replace the link entry of the previous page
- * with the last entry.
+ * entries. If we fill the last table in full with the pointers, (i.e,
+ * nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1) == 0, we don't have to allocate
+ * another table and hence skip the Link pointer. Also we could use the
+ * link entry of the last page to make it circular.
  */
 static inline unsigned long __attribute_const__
 tmc_etr_sg_table_entries(int nr_pages)
 {
 	unsigned long nr_sgpages = nr_pages * ETR_SG_PAGES_PER_SYSPAGE;
 	unsigned long nr_sglinks = nr_sgpages / (ETR_SG_PTRS_PER_PAGE - 1);
-	/*
-	 * If we spill over to a new page for 1 entry, we could as well
-	 * make it the LAST entry in the previous page, skipping the Link
-	 * address.
-	 */
-	if (nr_sglinks && (nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1) < 2))
+
+	if (nr_sglinks && !(nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1)))
 		nr_sglinks--;
-	return nr_sgpages + nr_sglinks;
+	/* Add an entry for the circular link */
+	return nr_sgpages + nr_sglinks + 1;
 }

 /*
@@ -417,14 +422,21 @@ tmc_sg_daddr_to_vaddr(struct tmc_sg_table *sg_table,
 /* Dump the given sg_table */
 static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
 {
-	sgte_t *ptr;
+	sgte_t *ptr, *start;
 	int i = 0;
 	dma_addr_t addr;
 	struct tmc_sg_table *sg_table = etr_table->sg_table;

-	ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
+	start = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
 					      etr_table->hwaddr, true);
-	while (ptr) {
+	if (!start) {
+		pr_debug("ERROR: Failed to translate table base: 0x%llx\n",
+			 etr_table->hwaddr);
+		return;
+	}
+
+	ptr = start;
+	do {
 		addr = ETR_SG_ADDR(*ptr);
 		switch (ETR_SG_ET(*ptr)) {
 		case ETR_SG_ET_NORMAL:
@@ -436,14 +448,17 @@ static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
 				 i, ptr, addr);
 			ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
 							      addr, true);
+			if (!ptr)
+				pr_debug("ERROR: Bad Link 0x%llx\n", addr);
 			break;
 		case ETR_SG_ET_LAST:
 			pr_debug("%05d: ### %p\t:[L] 0x%llx ###\n",
 				 i, ptr, addr);
-			return;
+			ptr++;
+			break;
 		}
 		i++;
-	}
+	} while (ptr && ptr != start);
 	pr_debug("******* End of Table *****\n");
 }
 #endif
@@ -458,7 +473,7 @@ static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
 static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
 {
 	dma_addr_t paddr;
-	int i, type, nr_entries;
+	int i, type;
 	int tpidx = 0;	/* index to the current system table_page */
 	int sgtidx = 0;	/* index to the sg_table within the current syspage */
 	int sgtoffset = 0; /* offset to the next entry within the sg_table */
@@ -469,16 +484,16 @@ static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
 	dma_addr_t *table_daddrs = sg_table->table_pages.daddrs;
 	dma_addr_t *data_daddrs = sg_table->data_pages.daddrs;

-	nr_entries = tmc_etr_sg_table_entries(sg_table->data_pages.nr_pages);
 	/*
 	 * Use the contiguous virtual address of the table to update entries.
 	 */
 	ptr = sg_table->table_vaddr;
 	/*
-	 * Fill all the entries, except the last entry to avoid special
+	 * Fill all the entries, except the last two entries (i.e, the last
+	 * buffer and the circular link back to the base) to avoid special
 	 * checks within the loop.
 	 */
-	for (i = 0; i < nr_entries - 1; i++) {
+	for (i = 0; i < etr_table->nr_entries - 2; i++) {
 		if (sgtoffset == ETR_SG_PTRS_PER_PAGE - 1) {
 			/*
 			 * Last entry in a sg_table page is a link address to
@@ -519,6 +534,107 @@ static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
 	/* Set up the last entry, which is always a data pointer */
 	paddr = data_daddrs[dpidx] + spidx * ETR_SG_PAGE_SIZE;
 	*ptr++ = ETR_SG_ENTRY(paddr, ETR_SG_ET_LAST);
+	/* followed by a circular link, back to the start of the table */
+	*ptr++ = ETR_SG_ENTRY(sg_table->table_daddr, ETR_SG_ET_LINK);
+}
+
+/*
+ * tmc_etr_sg_offset_to_table_index : Translate a given data @offset
+ * to the index of the page table "entry". Data pointers always have
+ * a fixed location, with ETR_SG_PTRS_PER_PAGE - 1 entries in an
+ * ETR_SG_PAGE and 1 link entry per (ETR_SG_PTRS_PER_PAGE - 1) entries.
+ */
+static inline u32
+tmc_etr_sg_offset_to_table_index(u64 offset)
+{
+	u64 sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;
+
+	return sgpage_idx + sgpage_idx / (ETR_SG_PTRS_PER_PAGE - 1);
+}
+
+/*
+ * tmc_etr_sg_update_type: Update the type of a given entry in the
+ * table to the requested entry. This is only used for data buffers
+ * to toggle the "NORMAL" vs "LAST" buffer entries.
+ */
+static inline void tmc_etr_sg_update_type(sgte_t *entry, u32 type)
+{
+	WARN_ON(ETR_SG_ET(*entry) == ETR_SG_ET_LINK);
+	WARN_ON(!ETR_SG_ET(*entry));
+	*entry &= ~ETR_SG_ET_MASK;
+	*entry |= type;
+}
+
+/*
+ * tmc_etr_sg_table_index_to_daddr: Return the hardware address to the table
+ * entry @index. Use this address to let the table begin @index.
+ */
+static inline dma_addr_t
+tmc_etr_sg_table_index_to_daddr(struct tmc_sg_table *sg_table, u32 index)
+{
+	u32 sys_page_idx = index / ETR_SG_PTRS_PER_SYSPAGE;
+	u32 sys_page_offset = index % ETR_SG_PTRS_PER_SYSPAGE;
+	sgte_t *ptr;
+
+	ptr = (sgte_t *)sg_table->table_pages.daddrs[sys_page_idx];
+	return (dma_addr_t)&ptr[sys_page_offset];
+}
+
+/*
+ * tmc_etr_sg_table_rotate : Rotate the SG circular buffer, moving
+ * the "base" to a requested offset. We do so by :
+ *
+ * 1) Reset the current LAST buffer.
+ * 2) Mark the "previous" buffer in the table to the "base" as LAST.
+ * 3) Update the hwaddr to point to the table pointer for the buffer
+ *    which starts at "base".
+ */
+static int __maybe_unused
+tmc_etr_sg_table_rotate(struct etr_sg_table *etr_table, u64 base_offset)
+{
+	u32 last_entry, first_entry;
+	u64 last_offset;
+	struct tmc_sg_table *sg_table = etr_table->sg_table;
+	sgte_t *table_ptr = sg_table->table_vaddr;
+	ssize_t buf_size = tmc_sg_table_buf_size(sg_table);
+
+	/* Offset should always be SG PAGE_SIZE aligned */
+	if (base_offset & (ETR_SG_PAGE_SIZE - 1)) {
+		pr_debug("unaligned base offset %llx\n", base_offset);
+		return -EINVAL;
+	}
+	/* Make sure the offset is within the range */
+	if (base_offset < 0 || base_offset > buf_size) {
+		base_offset = (base_offset + buf_size) % buf_size;
+		pr_debug("Resetting offset to %llx\n", base_offset);
+	}
+	first_entry = tmc_etr_sg_offset_to_table_index(base_offset);
+	if (first_entry == etr_table->first_entry) {
+		pr_debug("Head is already at %llx, skipping\n", base_offset);
+		return 0;
+	}
+
+	/* Last entry should be the previous one to the new "base" */
+	last_offset = ((base_offset - ETR_SG_PAGE_SIZE) + buf_size) % buf_size;
+	last_entry = tmc_etr_sg_offset_to_table_index(last_offset);
+
+	/* Reset the current Last page to Normal and new Last page to NORMAL */
+	tmc_etr_sg_update_type(&table_ptr[etr_table->last_entry],
+			       ETR_SG_ET_NORMAL);
+	tmc_etr_sg_update_type(&table_ptr[last_entry], ETR_SG_ET_LAST);
+	etr_table->hwaddr = tmc_etr_sg_table_index_to_daddr(sg_table,
+							    first_entry);
+	etr_table->first_entry = first_entry;
+	etr_table->last_entry = last_entry;
+	pr_debug("table rotated to offset %llx-%llx, entries (%d - %d), dba: %llx\n",
+		 base_offset, last_offset, first_entry, last_entry,
+		 etr_table->hwaddr);
+	/* Sync the table for device */
+	tmc_sg_table_sync_table(sg_table);
+#ifdef ETR_SG_DEBUG
+	tmc_etr_sg_table_dump(etr_table);
+#endif
+	return 0;
 }

 /*
@@ -552,6 +668,9 @@ tmc_init_etr_sg_table(struct device *dev, int node,
 	}

 	etr_table->sg_table = sg_table;
+	etr_table->nr_entries = nr_entries;
+	etr_table->first_entry = 0;
+	etr_table->last_entry = nr_entries - 2;
 	/* TMC should use table base address for DBA */
 	etr_table->hwaddr = sg_table->table_daddr;
 	tmc_etr_sg_table_populate(etr_table);
Hi Suzuki,
On 19/10/17 18:15, Suzuki K Poulose wrote:
Make the ETR SG table Circular buffer so that we could start at any of the SG pages and use the entire buffer for tracing. This can be achieved by :
1) Keeping an additional LINK pointer at the very end of the SG table, i.e, after the LAST buffer entry, to point back to the beginning of the first table. This will allow us to use the buffer normally when we start the trace at offset 0 of the buffer, as the LAST buffer entry hints the TMC-ETR and it automatically wraps to the offset 0.

2) If we want to start at any other ETR SG page aligned offset, we could :
   a) Make the preceding page entry as LAST entry.
   b) Make the original LAST entry a normal entry.
   c) Use the table pointer to the "new" start offset as the base of the table address. This works as the TMC doesn't mandate that the page table base address should be 4K page aligned.
Cc: Mathieu Poirier mathieu.poirier@linaro.org
Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com

 drivers/hwtracing/coresight/coresight-tmc-etr.c | 159 +++++++++++++++++++++---
 1 file changed, 139 insertions(+), 20 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 4424eb67a54c..c171b244e12a 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
[...]
+static int __maybe_unused
+tmc_etr_sg_table_rotate(struct etr_sg_table *etr_table, u64 base_offset)
+{
+	u32 last_entry, first_entry;
+	u64 last_offset;
+	struct tmc_sg_table *sg_table = etr_table->sg_table;
+	sgte_t *table_ptr = sg_table->table_vaddr;
+	ssize_t buf_size = tmc_sg_table_buf_size(sg_table);
+
+	/* Offset should always be SG PAGE_SIZE aligned */
+	if (base_offset & (ETR_SG_PAGE_SIZE - 1)) {
+		pr_debug("unaligned base offset %llx\n", base_offset);
+		return -EINVAL;
+	}
+	/* Make sure the offset is within the range */
+	if (base_offset < 0 || base_offset > buf_size) {
base_offset is unsigned, so the left operand of the '||' is useless (would've expected the compiler to emit a warning for this).
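A sketch of the same check without the dead comparison (and additionally treating base_offset == buf_size as a wrap, which is an assumption, not in the posted patch):

+	/* Make sure the offset is within the range */
+	if (base_offset >= buf_size) {
+		base_offset %= buf_size;
+		pr_debug("Resetting offset to %llx\n", base_offset);
+	}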
+		base_offset = (base_offset + buf_size) % buf_size;
+		pr_debug("Resetting offset to %llx\n", base_offset);
+	}
+	first_entry = tmc_etr_sg_offset_to_table_index(base_offset);
+	if (first_entry == etr_table->first_entry) {
+		pr_debug("Head is already at %llx, skipping\n", base_offset);
+		return 0;
+	}
+
+	/* Last entry should be the previous one to the new "base" */
+	last_offset = ((base_offset - ETR_SG_PAGE_SIZE) + buf_size) % buf_size;
+	last_entry = tmc_etr_sg_offset_to_table_index(last_offset);
+	/* Reset the current Last page to Normal and new Last page to NORMAL */
Current Last page to NORMAL and new Last page to LAST?
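i.e., presumably the comment should read (sketch):

+	/* Reset the current LAST page to NORMAL and the new last page to LAST */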
Cheers,
On 20/10/17 18:11, Julien Thierry wrote:
[...]
+	/* Reset the current Last page to Normal and new Last page to NORMAL */
Current Last page to NORMAL and new Last page to LAST?
Thanks again, will fix them
Cheers Suzuki
On Thu, Oct 19, 2017 at 06:15:42PM +0100, Suzuki K Poulose wrote:
[...]
- pr_debug("table rotated to offset %llx-%llx, entries (%d - %d), dba: %llx\n",
base_offset, last_offset, first_entry, last_entry,
etr_table->hwaddr);
The above line generates a warning when compiling for ARMv7.
- /* Sync the table for device */
- tmc_sg_table_sync_table(sg_table);
+#ifdef ETR_SG_DEBUG
- tmc_etr_sg_table_dump(etr_table);
+#endif
- return 0;
} /* @@ -552,6 +668,9 @@ tmc_init_etr_sg_table(struct device *dev, int node, } etr_table->sg_table = sg_table;
- etr_table->nr_entries = nr_entries;
- etr_table->first_entry = 0;
- etr_table->last_entry = nr_entries - 2; /* TMC should use table base address for DBA */ etr_table->hwaddr = sg_table->table_daddr; tmc_etr_sg_table_populate(etr_table);
-- 2.13.6
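[Editor's note] To make the index arithmetic in tmc_etr_sg_offset_to_table_index() concrete, here is a small stand-alone sketch (not driver code; it assumes 4K ETR SG pages, so ETR_SG_PTRS_PER_PAGE = 4096/4 = 1024, i.e. each table page carries 1023 data pointers plus one link entry):

	#include <stdio.h>

	#define ETR_SG_PAGE_SHIFT	12
	#define ETR_SG_PTRS_PER_PAGE	1024

	/* Mirror of the patch's arithmetic: inflate the data page index
	 * by one link entry per 1023 data pointers. */
	static unsigned int offset_to_table_index(unsigned long long offset)
	{
		unsigned long sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;

		return sgpage_idx + sgpage_idx / (ETR_SG_PTRS_PER_PAGE - 1);
	}

	int main(void)
	{
		/* offset 0 maps to entry 0; offset 4M (data page 1024) maps
		 * to entry 1025, because one link entry is skipped over. */
		printf("%u %u\n", offset_to_table_index(0),
		       offset_to_table_index(1024ULL << ETR_SG_PAGE_SHIFT));
		return 0;
	}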
On 01/11/17 23:47, Mathieu Poirier wrote:
> On Thu, Oct 19, 2017 at 06:15:42PM +0100, Suzuki K Poulose wrote:
> [..]
>> +	pr_debug("table rotated to offset %llx-%llx, entries (%d - %d), dba: %llx\n",
>> +		 base_offset, last_offset, first_entry, last_entry,
>> +		 etr_table->hwaddr);
>
> The above line generates a warning when compiling for ARMv7.

Were you running with LPAE off? That is probably the cause: hwaddr can be 32-bit or 64-bit depending on whether LPAE is enabled. I will fix it.

I have also fixed some other ARMv7 warnings, with LPAE enabled.

Suzuki
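[Editor's note] For context on why the width of hwaddr varies: dma_addr_t is defined in include/linux/types.h as a u32 or u64 depending on CONFIG_ARCH_DMA_ADDR_T_64BIT, which on 32-bit ARM is typically selected by LPAE:

	#ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
	typedef u64 dma_addr_t;
	#else
	typedef u32 dma_addr_t;
	#endif

So the %llx in the pr_debug() above only matches its dma_addr_t argument when LPAE is enabled.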
On 2 November 2017 at 06:00, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote:
> On 01/11/17 23:47, Mathieu Poirier wrote:
>> The above line generates a warning when compiling for ARMv7.
>
> Were you running with LPAE off? That is probably the cause: hwaddr can be
> 32-bit or 64-bit depending on whether LPAE is enabled. I will fix it.

My original setup did not have LPAE configured, but even when I do configure it I can still generate the warnings.

Compiler:
arm-linux-gnueabi-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609

Let me know if you want my .config file.
On Thu, Nov 02, 2017 at 08:40:16AM -0600, Mathieu Poirier wrote:
> On 2 November 2017 at 06:00, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote:
>> Were you running with LPAE off? That is probably the cause: hwaddr can be
>> 32-bit or 64-bit depending on whether LPAE is enabled. I will fix it.
>
> My original setup did not have LPAE configured, but even when I do
> configure it I can still generate the warnings.

For those who don't know... (it seems it's an all too common mistake).

In the printf format definition, the length modifier determines the data type printf expects for the conversion. Of those modifiers, the two which are relevant here are:

	l	printf expects a "long" or "unsigned long" type.
	ll	printf expects a "long long" or "unsigned long long" type.

The size of "long" and "long long" is ABI dependent. Typically, on 32-bit platforms, "long" is 32-bit and "long long" is 64-bit. On 64-bit platforms, "long" is 64-bit and "long long" is also 64-bit (the C standard only guarantees that it is at least 64-bit).

Moreover, ABIs can mandate alignment requirements for these data types. This can cause problems if you do something like:

	u64 var = 1;

	printk("foo %lx\n", var);

If a 32-bit architecture mandates that arguments are passed in ascending 32-bit registers from r0, and 64-bit arguments are passed in an even/odd register pair, then the above printk() is a problem.

The format specifies a "long" type, which is a 32-bit type, so printk() will expect the value in r1, but the compiler has arranged to pass "var" in r2,r3. The end result is that the above prints an undefined value - whatever happens to be in r1 at the time.

Don't laugh, exactly this problem is in kexec-tools right now!

The kernel data types (and C99 data types) that are named by their bit width map onto the standard C types. What this means is that a "u64" might be "long" when building on a 64-bit platform, or "long long" when building on a 32-bit platform:

	include/uapi/asm-generic/int-ll64.h:typedef unsigned long long __u64;
	include/uapi/asm-generic/int-l64.h:typedef unsigned long __u64;

This means that if you pass a u64 variable to printk, you really can't use either the "l" or "ll" modifier, because you don't know which one applies.

The guidance at the top of Documentation/printk-formats.txt concerning s64/u64 is therefore basically incorrect:

	Integer types
	=============

	::

		If variable is of Type,		use printk format specifier:
		------------------------------------------------------------
			s64			%lld or %llx
			u64			%llu or %llx

The only ways around this are:

1) to cast to the appropriate non-bitwidth-defined type and use its length modifier (e.g., cast to unsigned long long if you need at least 64-bit precision, and use "ll" in the format; since "long long" is at least 64-bit everywhere, this works correctly on both 32-bit and 64-bit builds).

2) to do as is done in userspace, and define a PRI64 macro which can be inserted into the format - but this is messy (iow, e.g. "%"PRI64"x"). I personally find this rather horrid, and I suspect it'll trip other kernel developers' sanity filters too.
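[Editor's note] A minimal illustration of option 1 (this snippet is the editor's, not from the thread):

	u64 var = 1;

	printk("foo %llx\n", (unsigned long long)var);	/* OK on 32- and 64-bit */
	printk("foo %lx\n", var);			/* broken: width is ABI dependent */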
On Thu, Oct 19, 2017 at 06:15:42PM +0100, Suzuki K Poulose wrote:

Make the ETR SG table a circular buffer so that we can start at any of the SG pages and use the entire buffer for tracing. This is achieved by:

 - Keeping an additional LINK pointer at the very end of the SG table,
   i.e, after the LAST buffer entry, to point back to the beginning of
   the first table. This allows us to use the buffer normally when we
   start the trace at offset 0 of the buffer, as the LAST buffer entry
   hints the TMC-ETR and it automatically wraps to offset 0.

 - If we want to start at any other ETR SG page aligned offset, we can:
   a) Make the preceding page entry the LAST entry.
   b) Make the original LAST entry a normal entry.
   c) Use the table pointer to the "new" start offset as the base of the
      table address. This works as the TMC doesn't mandate that the page
      table base address should be 4K page aligned.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 159 +++++++++++++++++++++---
 1 file changed, 139 insertions(+), 20 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 4424eb67a54c..c171b244e12a 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -71,36 +71,41 @@ typedef u32 sgte_t;
  * @sg_table:	Generic SG Table holding the data/table pages.
  * @hwaddr:	hwaddress used by the TMC, which is the base
  *		address of the table.
+ * @nr_entries:	Total number of pointers in the table.
+ * @first_entry: Index to the current "start" of the buffer.
+ * @last_entry:	Index to the last entry of the buffer.
  */
 struct etr_sg_table {
 	struct tmc_sg_table	*sg_table;
 	dma_addr_t		hwaddr;
+	u32			nr_entries;
+	u32			first_entry;
+	u32			last_entry;
 };

 /*
  * tmc_etr_sg_table_entries: Total number of table entries required to map
  * @nr_pages system pages.
  *
- * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages.
+ * We need to map @nr_pages * ETR_SG_PAGES_PER_SYSPAGE data pages and
+ * an additional Link pointer for making it a Circular buffer.
  * Each TMC page can map (ETR_SG_PTRS_PER_PAGE - 1) buffer pointers,
  * with the last entry pointing to the page containing the table
- * entries. If we spill over to a new page for mapping 1 entry,
- * we could as well replace the link entry of the previous page
- * with the last entry.
+ * entries. If we fill the last table in full with the pointers (i.e,
+ * nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1) == 0), we don't have to allocate
+ * another table and hence skip the Link pointer. Also we could use the
+ * link entry of the last page to make it circular.
  */
 static inline unsigned long __attribute_const__
 tmc_etr_sg_table_entries(int nr_pages)
 {
 	unsigned long nr_sgpages = nr_pages * ETR_SG_PAGES_PER_SYSPAGE;
 	unsigned long nr_sglinks = nr_sgpages / (ETR_SG_PTRS_PER_PAGE - 1);
 	/*
 	 * If we spill over to a new page for 1 entry, we could as well
 	 * make it the LAST entry in the previous page, skipping the Link
 	 * address.
 	 */
-	if (nr_sglinks && (nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1) < 2))
+	if (nr_sglinks && !(nr_sgpages % (ETR_SG_PTRS_PER_PAGE - 1)))
 		nr_sglinks--;
-	return nr_sgpages + nr_sglinks;
+	/* Add an entry for the circular link */
+	return nr_sgpages + nr_sglinks + 1;
 }

@@ -417,14 +422,21 @@ tmc_sg_daddr_to_vaddr(struct tmc_sg_table *sg_table,
 /* Dump the given sg_table */
 static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
 {
-	sgte_t *ptr;
+	sgte_t *ptr, *start;
 	int i = 0;
 	dma_addr_t addr;
 	struct tmc_sg_table *sg_table = etr_table->sg_table;

-	ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
+	start = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
 					      etr_table->hwaddr, true);
-	while (ptr) {
+	if (!start) {
+		pr_debug("ERROR: Failed to translate table base: 0x%llx\n",
+			 etr_table->hwaddr);
+		return;
+	}
+
+	ptr = start;
+	do {
 		addr = ETR_SG_ADDR(*ptr);
 		switch (ETR_SG_ET(*ptr)) {
 		case ETR_SG_ET_NORMAL:
@@ -436,14 +448,17 @@ static void tmc_etr_sg_table_dump(struct etr_sg_table *etr_table)
 				 i, ptr, addr);
 			ptr = (sgte_t *)tmc_sg_daddr_to_vaddr(sg_table,
 							      addr, true);
-			if (!ptr)
+			if (!ptr) {
 				pr_debug("ERROR: Bad Link 0x%llx\n", addr);
+				return;
+			}
 			break;
+		case ETR_SG_ET_LAST:
+			pr_debug("%05d: ### %p\t:[L] 0x%llx ###\n",
+				 i, ptr, addr);
+			ptr++;
+			break;
 		}
 		i++;
-	}
+	} while (ptr && ptr != start);
 	pr_debug("******* End of Table *****\n");
 }
 #endif

@@ -458,7 +473,7 @@
 static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
 {
 	dma_addr_t paddr;
-	int i, type, nr_entries;
+	int i, type;
 	int tpidx = 0; /* index to the current system table_page */
 	int sgtidx = 0;	/* index to the sg_table within the current syspage */
 	int sgtoffset = 0; /* offset to the next entry within the sg_table */
@@ -469,16 +484,16 @@
 	dma_addr_t *table_daddrs = sg_table->table_pages.daddrs;
 	dma_addr_t *data_daddrs = sg_table->data_pages.daddrs;

-	nr_entries = tmc_etr_sg_table_entries(sg_table->data_pages.nr_pages);
 	/*
 	 * Use the contiguous virtual address of the table to update entries.
 	 */
 	ptr = sg_table->table_vaddr;
 	/*
-	 * Fill all the entries, except the last entry to avoid special
-	 * checks within the loop.
+	 * Fill all the entries, except the last two entries (i.e, the last
+	 * buffer and the circular link back to the base) to avoid special
+	 * checks within the loop.
 	 */
-	for (i = 0; i < nr_entries - 1; i++) {
+	for (i = 0; i < etr_table->nr_entries - 2; i++) {
 		if (sgtoffset == ETR_SG_PTRS_PER_PAGE - 1) {
 			/*
 			 * Last entry in a sg_table page is a link address to
@@ -519,6 +534,107 @@
 	/* Set up the last entry, which is always a data pointer */
 	paddr = data_daddrs[dpidx] + spidx * ETR_SG_PAGE_SIZE;
 	*ptr++ = ETR_SG_ENTRY(paddr, ETR_SG_ET_LAST);
+	/* followed by a circular link, back to the start of the table */
+	*ptr++ = ETR_SG_ENTRY(sg_table->table_daddr, ETR_SG_ET_LINK);
+}
+
+/*
+ * tmc_etr_sg_offset_to_table_index : Translate a given data @offset
+ * to the index of the page table "entry". Data pointers always have
+ * a fixed location, with ETR_SG_PTRS_PER_PAGE - 1 entries in an
+ * ETR_SG_PAGE and 1 link entry per (ETR_SG_PTRS_PER_PAGE - 1) entries.
+ */
+static inline u32
+tmc_etr_sg_offset_to_table_index(u64 offset)
+{
+	u64 sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;
+
+	return sgpage_idx + sgpage_idx / (ETR_SG_PTRS_PER_PAGE - 1);
+}

This function is the source of a bizarre linking error when compiling [14/17] on armv7, as pasted here:

  UPD     include/generated/compile.h
  CC      init/version.o
  AR      init/built-in.o
  AR      built-in.o
  LD      vmlinux.o
  MODPOST vmlinux.o
drivers/hwtracing/coresight/coresight-tmc-etr.o: In function `tmc_etr_sg_offset_to_table_index':
/home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:553: undefined reference to `__aeabi_uldivmod'
/home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:551: undefined reference to `__aeabi_uldivmod'
/home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:553: undefined reference to `__aeabi_uldivmod'
drivers/hwtracing/coresight/coresight-tmc-etr.o: In function `tmc_etr_sg_table_rotate':
/home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:609: undefined reference to `__aeabi_uldivmod'

Please see if you can reproduce this on your side.

Thanks,
Mathieu
[..]

-- 
2.13.6
On 06/11/17 19:07, Mathieu Poirier wrote:
> On Thu, Oct 19, 2017 at 06:15:42PM +0100, Suzuki K Poulose wrote:
> ...
>
> This function is the source of a bizarre linking error when compiling
> [14/17] on armv7:
>
> drivers/hwtracing/coresight/coresight-tmc-etr.o: In function `tmc_etr_sg_offset_to_table_index':
> /home/mpoirier/work/linaro/coresight/kernel-maint/drivers/hwtracing/coresight/coresight-tmc-etr.c:553: undefined reference to `__aeabi_uldivmod'
> [..]
>
> Please see if you can reproduce this on your side.

Uh! I had gcc-7, which didn't complain about it, but if I switch to 4.9, it does. It looks like the division of a 64-bit quantity on arm32 is what triggers it. We don't need this to be a u64 above: it is only a page index, so we can simply switch to "unsigned long" rather than explicitly using the div64 helpers.

The following change fixes the issue for me. Please could you check whether it solves the problem for you?

@@ -551,7 +553,7 @@ static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
 static inline u32
 tmc_etr_sg_offset_to_table_index(u64 offset)
 {
-	u64 sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;
+	unsigned long sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;

 	return sgpage_idx + sgpage_idx / (ETR_SG_PTRS_PER_PAGE - 1);
 }

Thanks for the testing!

Suzuki
On 7 November 2017 at 03:36, Suzuki K Poulose <Suzuki.Poulose@arm.com> wrote:
> On 06/11/17 19:07, Mathieu Poirier wrote:
> [..]
>
> The following change fixes the issue for me. Please could you check
> whether it solves the problem for you?
>
> @@ -551,7 +553,7 @@ static void tmc_etr_sg_table_populate(struct etr_sg_table *etr_table)
>  static inline u32
>  tmc_etr_sg_offset_to_table_index(u64 offset)
>  {
> -	u64 sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;
> +	unsigned long sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;
>
>  	return sgpage_idx + sgpage_idx / (ETR_SG_PTRS_PER_PAGE - 1);
>  }

Unfortunately it doesn't.

Mathieu
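[Editor's note] For reference, the usual way to keep a genuinely 64-bit division building on 32-bit ARM is the kernel's div_u64() helper from linux/math64.h, which avoids the libgcc __aeabi_uldivmod reference that vmlinux cannot resolve. A hedged sketch (not necessarily the fix that was eventually applied):

	#include <linux/math64.h>

	static inline u32 offset_to_index_div64(u64 offset)
	{
		/* div_u64() divides a u64 by a 32-bit divisor without
		 * emitting a call to __aeabi_uldivmod on arm32. */
		u64 sgpage_idx = offset >> ETR_SG_PAGE_SHIFT;

		return sgpage_idx + div_u64(sgpage_idx, ETR_SG_PTRS_PER_PAGE - 1);
	}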
At the moment we always use contiguous memory for TMC ETR tracing when used from sysfs. The size of the buffer is fixed at boot time and can only be changed by modifying the DT. With the introduction of SG support we could support really large buffers in that mode. This patch abstracts the buffer used for ETR so that it can switch between a contiguous buffer and an SG table, depending on the availability of memory.

This also enables the sysfs mode to use the ETR in SG mode, depending on the configured trace buffer size. Also, since the ETR will use the new infrastructure to manage the buffer, we can get rid of some of the members in tmc_drvdata and clean up the fields a bit.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-tmc-etr.c | 433 +++++++++++++++++++-----
 drivers/hwtracing/coresight/coresight-tmc.h     |  60 +++-
 2 files changed, 403 insertions(+), 90 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index c171b244e12a..9e41eeaa5284 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -16,6 +16,7 @@
 */

 #include <linux/coresight.h>
+#include <linux/iommu.h>
 #include <linux/slab.h>
 #include "coresight-priv.h"
 #include "coresight-tmc.h"
@@ -646,7 +647,7 @@ tmc_etr_sg_table_rotate(struct etr_sg_table *etr_table, u64 base_offset)
  * @size	- Total size of the data buffer
  * @pages	- Optional list of page virtual address
  */
-static struct etr_sg_table __maybe_unused *
+static struct etr_sg_table *
 tmc_init_etr_sg_table(struct device *dev, int node,
 		      unsigned long size, void **pages)
 {
@@ -682,19 +683,298 @@ tmc_init_etr_sg_table(struct device *dev, int node,
 	return etr_table;
 }
+/*
+ * tmc_etr_alloc_flat_buf: Allocate a contiguous DMA buffer.
+ * We keep the tmc_drvdata in the @private field to retrieve the
+ * device information, while the DMA address and virtual address are
+ * stored already in @hwaddr and @vaddr respectively, which never changes.
+ */
+static int tmc_etr_alloc_flat_buf(struct tmc_drvdata *drvdata,
+				  struct etr_buf *etr_buf, int node,
+				  void **pages)
+{
+	dma_addr_t paddr;
+	void *vaddr = dma_alloc_coherent(drvdata->dev, etr_buf->size,
+					 &paddr, GFP_KERNEL);
+
+	if (!vaddr)
+		return -ENOMEM;
+	etr_buf->vaddr = vaddr;
+	etr_buf->hwaddr = paddr;
+	etr_buf->mode = ETR_MODE_FLAT;
+	etr_buf->private = drvdata;
+	return 0;
+}
+
+static void tmc_etr_free_flat_buf(struct etr_buf *etr_buf)
+{
+	struct tmc_drvdata *drvdata = etr_buf->private;
+
+	if (etr_buf->hwaddr)
+		dma_free_coherent(drvdata->dev, etr_buf->size,
+				  etr_buf->vaddr, etr_buf->hwaddr);
+}
+
+static void tmc_etr_sync_flat_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp)
+{
+	/*
+	 * Adjust the buffer to point to the beginning of the trace data
+	 * and update the available trace data.
+	 */
+	etr_buf->offset = rrp - etr_buf->hwaddr;
+	if (etr_buf->full)
+		etr_buf->len = etr_buf->size;
+	else
+		etr_buf->len = rwp - rrp;
+}
+
+static ssize_t tmc_etr_get_data_flat_buf(struct etr_buf *etr_buf,
+					 u64 offset, size_t len, char **bufpp)
+{
+	/*
+	 * tmc_etr_buf_get_data already adjusts the length to handle
+	 * buffer wrapping around.
+	 */
+	*bufpp = (char *)((unsigned long)etr_buf->vaddr + offset);
+	return len;
+}
+
+static const struct etr_buf_operations etr_flat_buf_ops = {
+	.alloc = tmc_etr_alloc_flat_buf,
+	.free = tmc_etr_free_flat_buf,
+	.sync = tmc_etr_sync_flat_buf,
+	.get_data = tmc_etr_get_data_flat_buf,
+};
+
+/*
+ * tmc_etr_alloc_sg_buf: Allocate an SG buf @etr_buf. Setup the parameters
+ * appropriately.
+ */
+static int tmc_etr_alloc_sg_buf(struct tmc_drvdata *drvdata,
+				struct etr_buf *etr_buf, int node,
+				void **pages)
+{
+	struct etr_sg_table *etr_table;
+
+	etr_table = tmc_init_etr_sg_table(drvdata->dev, node,
+					  etr_buf->size, pages);
+	if (IS_ERR(etr_table))
+		return -ENOMEM;
+	etr_buf->vaddr = tmc_sg_table_data_vaddr(etr_table->sg_table);
+	etr_buf->hwaddr = etr_table->hwaddr;
+	etr_buf->mode = ETR_MODE_ETR_SG;
+	etr_buf->private = etr_table;
+	return 0;
+}
+
+static void tmc_etr_free_sg_buf(struct etr_buf *etr_buf)
+{
+	struct etr_sg_table *etr_table = etr_buf->private;
+
+	if (etr_table) {
+		tmc_free_sg_table(etr_table->sg_table);
+		kfree(etr_table);
+	}
+}
+
+static ssize_t tmc_etr_get_data_sg_buf(struct etr_buf *etr_buf, u64 offset,
+				       size_t len, char **bufpp)
+{
+	struct etr_sg_table *etr_table = etr_buf->private;
+
+	return tmc_sg_table_get_data(etr_table->sg_table, offset, len, bufpp);
+}
+
+static void tmc_etr_sync_sg_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp)
+{
+	long r_offset, w_offset;
+	struct etr_sg_table *etr_table = etr_buf->private;
+	struct tmc_sg_table *table = etr_table->sg_table;
+
+	r_offset = tmc_sg_get_data_page_offset(table, rrp);
+	if (r_offset < 0) {
+		dev_warn(table->dev, "Unable to map RRP %llx to offset\n",
+			 rrp);
+		etr_buf->len = 0;
+		return;
+	}
+
+	w_offset = tmc_sg_get_data_page_offset(table, rwp);
+	if (w_offset < 0) {
+		dev_warn(table->dev, "Unable to map RWP %llx to offset\n",
+			 rwp);
+		etr_buf->len = 0;
+		return;
+	}
+
+	etr_buf->offset = r_offset;
+	if (etr_buf->full)
+		etr_buf->len = etr_buf->size;
+	else
+		etr_buf->len = (w_offset < r_offset) ?
+				etr_buf->size + w_offset - r_offset :
+				w_offset - r_offset;
+	tmc_sg_table_sync_data_range(table, r_offset, etr_buf->len);
+}
+
+static const struct etr_buf_operations etr_sg_buf_ops = {
+	.alloc = tmc_etr_alloc_sg_buf,
+	.free = tmc_etr_free_sg_buf,
+	.sync = tmc_etr_sync_sg_buf,
+	.get_data = tmc_etr_get_data_sg_buf,
+};
+
+static const struct etr_buf_operations *etr_buf_ops[] = {
+	[ETR_MODE_FLAT] = &etr_flat_buf_ops,
+	[ETR_MODE_ETR_SG] = &etr_sg_buf_ops,
+};
+
+static inline int tmc_etr_mode_alloc_buf(int mode,
+					 struct tmc_drvdata *drvdata,
+					 struct etr_buf *etr_buf, int node,
+					 void **pages)
+{
+	int rc;
+
+	switch (mode) {
+	case ETR_MODE_FLAT:
+	case ETR_MODE_ETR_SG:
+		rc = etr_buf_ops[mode]->alloc(drvdata, etr_buf, node, pages);
+		if (!rc)
+			etr_buf->ops = etr_buf_ops[mode];
+		return rc;
+	default:
+		return -EINVAL;
+	}
+}
+
+/*
+ * tmc_alloc_etr_buf: Allocate a buffer use by ETR.
+ * @drvdata	: ETR device details.
+ * @size	: size of the requested buffer.
+ * @flags	: Required properties of the type of buffer.
+ * @node	: Node for memory allocations.
+ * @pages	: An optional list of pages.
+ */
+static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata,
+					 ssize_t size, int flags,
+					 int node, void **pages)
+{
+	int rc = -ENOMEM;
+	bool has_etr_sg, has_iommu;
+	struct etr_buf *etr_buf;
+
+	has_etr_sg = tmc_etr_has_cap(drvdata, TMC_ETR_SG);
+	has_iommu = iommu_get_domain_for_dev(drvdata->dev);
+
+	etr_buf = kzalloc(sizeof(*etr_buf), GFP_KERNEL);
+	if (!etr_buf)
+		return ERR_PTR(-ENOMEM);
+
+	etr_buf->size = size;
+
+	/*
+	 * If we have to use an existing list of pages, we cannot reliably
+	 * use a contiguous DMA memory (even if we have an IOMMU). Otherwise,
+	 * we use the contiguous DMA memory if :
+	 * a) The ETR cannot use Scatter-Gather.
+	 * b) if not a, we have an IOMMU backup
+	 * c) if none of the above holds, use it for smaller memory (< 1M).
+	 *
+	 * Fallback to available mechanisms.
+	 *
+	 */
+	if (!pages &&
+	    (!has_etr_sg || has_iommu || size < SZ_1M))
+		rc = tmc_etr_mode_alloc_buf(ETR_MODE_FLAT, drvdata,
+					    etr_buf, node, pages);
+	if (rc && has_etr_sg)
+		rc = tmc_etr_mode_alloc_buf(ETR_MODE_ETR_SG, drvdata,
+					    etr_buf, node, pages);
+	if (rc) {
+		kfree(etr_buf);
+		return ERR_PTR(rc);
+	}
+
+	return etr_buf;
+}
+
+static void tmc_free_etr_buf(struct etr_buf *etr_buf)
+{
+	WARN_ON(!etr_buf->ops || !etr_buf->ops->free);
+	etr_buf->ops->free(etr_buf);
+	kfree(etr_buf);
+}
+
+/*
+ * tmc_etr_buf_get_data: Get the pointer the trace data at @offset
+ * with a maximum of @len bytes.
+ * Returns: The size of the linear data available @pos, with *bufpp
+ * updated to point to the buffer.
+ */
+static ssize_t tmc_etr_buf_get_data(struct etr_buf *etr_buf,
+				    u64 offset, size_t len, char **bufpp)
+{
+	/* Adjust the length to limit this transaction to end of buffer */
+	len = (len < (etr_buf->size - offset)) ? len : etr_buf->size - offset;
+
+	return etr_buf->ops->get_data(etr_buf, (u64)offset, len, bufpp);
+}
+
+static inline s64
+tmc_etr_buf_insert_barrier_packet(struct etr_buf *etr_buf, u64 offset)
+{
+	ssize_t len;
+	char *bufp;
+
+	len = tmc_etr_buf_get_data(etr_buf, offset,
+				   CORESIGHT_BARRIER_PKT_SIZE, &bufp);
+	if (WARN_ON(len <= CORESIGHT_BARRIER_PKT_SIZE))
+		return -EINVAL;
+	coresight_insert_barrier_packet(bufp);
+	return offset + CORESIGHT_BARRIER_PKT_SIZE;
+}
+
+/*
+ * tmc_sync_etr_buf: Sync the trace buffer availability with drvdata.
+ * Makes sure the trace data is synced to the memory for consumption.
+ * @etr_buf->offset will hold the offset to the beginning of the trace data
+ * within the buffer, with @etr_buf->len bytes to consume. @etr_buf->vaddr
+ * will always point to the beginning of the "trace buffer".
+ */
+static void tmc_sync_etr_buf(struct tmc_drvdata *drvdata)
+{
+	struct etr_buf *etr_buf = drvdata->etr_buf;
+	u64 rrp, rwp;
+	u32 status;
+
+	rrp = tmc_read_rrp(drvdata);
+	rwp = tmc_read_rwp(drvdata);
+	status = readl_relaxed(drvdata->base + TMC_STS);
+	etr_buf->full = status & TMC_STS_FULL;
+
+	WARN_ON(!etr_buf->ops || !etr_buf->ops->sync);
+
+	etr_buf->ops->sync(etr_buf, rrp, rwp);
+
+	/* Insert barrier packets at the beginning, if there was an overflow */
+	if (etr_buf->full)
+		tmc_etr_buf_insert_barrier_packet(etr_buf, etr_buf->offset);
+}
+
 static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 {
 	u32 axictl, sts;
+	struct etr_buf *etr_buf = drvdata->etr_buf;

 	/* Zero out the memory to help with debug */
-	memset(drvdata->vaddr, 0, drvdata->size);
+	memset(etr_buf->vaddr, 0, etr_buf->size);

 	CS_UNLOCK(drvdata->base);

 	/* Wait for TMCSReady bit to be set */
 	tmc_wait_for_tmcready(drvdata);

-	writel_relaxed(drvdata->size / 4, drvdata->base + TMC_RSZ);
+	writel_relaxed(etr_buf->size / 4, drvdata->base + TMC_RSZ);
 	writel_relaxed(TMC_MODE_CIRCULAR_BUFFER, drvdata->base + TMC_MODE);

 	axictl = readl_relaxed(drvdata->base + TMC_AXICTL);
@@ -707,16 +987,22 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 		axictl |= TMC_AXICTL_ARCACHE_OS;
 	}

+	if (etr_buf->mode == ETR_MODE_ETR_SG) {
+		if (WARN_ON(!tmc_etr_has_cap(drvdata, TMC_ETR_SG)))
+			return;
+		axictl |= TMC_AXICTL_SCT_GAT_MODE;
+	}
+
 	writel_relaxed(axictl, drvdata->base + TMC_AXICTL);
-	tmc_write_dba(drvdata, drvdata->paddr);
+	tmc_write_dba(drvdata, etr_buf->hwaddr);
 	/*
 	 * If the TMC pointers must be programmed before the session,
 	 * we have to set it properly (i.e, RRP/RWP to base address and
 	 * STS to "not full").
 	 */
 	if (tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE)) {
-		tmc_write_rrp(drvdata, drvdata->paddr);
-		tmc_write_rwp(drvdata, drvdata->paddr);
+		tmc_write_rrp(drvdata, etr_buf->hwaddr);
+		tmc_write_rwp(drvdata, etr_buf->hwaddr);
 		sts = readl_relaxed(drvdata->base + TMC_STS) & ~TMC_STS_FULL;
 		writel_relaxed(sts, drvdata->base + TMC_STS);
 	}
@@ -732,62 +1018,52 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 }

 /*
- * Return the available trace data in the buffer @pos, with a maximum
- * limit of @len, also updating the @bufpp on where to find it.
+ * Return the available trace data in the buffer (starts at etr_buf->offset,
+ * limited by etr_buf->len) from @pos, with a maximum limit of @len,
+ * also updating the @bufpp on where to find it. Since the trace data
+ * starts at anywhere in the buffer, depending on the RRP, we adjust the
+ * @len returned to handle buffer wrapping around.
 */
 ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
				loff_t pos, size_t len, char **bufpp)
 {
-	char *bufp = drvdata->buf + pos;
-	char *bufend = (char *)(drvdata->vaddr + drvdata->size);
-
-	/* Adjust the len to available size @pos */
-	if (pos + len > drvdata->len)
-		len = drvdata->len - pos;
+	s64 offset;
+	struct etr_buf *etr_buf = drvdata->etr_buf;

+	if (pos + len > etr_buf->len)
+		len = etr_buf->len - pos;
 	if (len <= 0)
 		return len;

-	/*
-	 * Since we use a circular buffer, with trace data starting
-	 * @drvdata->buf, possibly anywhere in the buffer @drvdata->vaddr,
-	 * wrap the current @pos to within the buffer.
-	 */
-	if (bufp >= bufend)
-		bufp -= drvdata->size;
-	/*
-	 * For simplicity, avoid copying over a wrapped around buffer.
-	 */
-	if ((bufp + len) > bufend)
-		len = bufend - bufp;
-	*bufpp = bufp;
-	return len;
+	/* Compute the offset from which we read the data */
+	offset = etr_buf->offset + pos;
+	if (offset >= etr_buf->size)
+		offset -= etr_buf->size;
+	return tmc_etr_buf_get_data(etr_buf, offset, len, bufpp);
 }

-static void tmc_etr_dump_hw(struct tmc_drvdata *drvdata)
+static struct etr_buf *
+tmc_etr_setup_sysfs_buf(struct tmc_drvdata *drvdata)
 {
-	u32 val;
-	u64 rwp;
-
-	rwp = tmc_read_rwp(drvdata);
-	val = readl_relaxed(drvdata->base + TMC_STS);
-
-	/*
-	 * Adjust the buffer to point to the beginning of the trace data
-	 * and update the available trace data.
-	 */
-	if (val & TMC_STS_FULL) {
-		drvdata->buf = drvdata->vaddr + rwp - drvdata->paddr;
-		drvdata->len = drvdata->size;
-		coresight_insert_barrier_packet(drvdata->buf);
-	} else {
-		drvdata->buf = drvdata->vaddr;
-		drvdata->len = rwp - drvdata->paddr;
-	}
+	return tmc_alloc_etr_buf(drvdata, drvdata->size, 0,
+				 cpu_to_node(0), NULL);
+}
+
+static void
+tmc_etr_free_sysfs_buf(struct etr_buf *buf)
+{
+	if (buf)
+		tmc_free_etr_buf(buf);
+}
+
+static void tmc_etr_sync_sysfs_buf(struct tmc_drvdata *drvdata)
+{
+	tmc_sync_etr_buf(drvdata);
 }

 static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
 {
+
 	CS_UNLOCK(drvdata->base);

 	tmc_flush_and_stop(drvdata);
@@ -796,7 +1072,8 @@ static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
 	 * read before the TMC is disabled.
 	 */
 	if (drvdata->mode == CS_MODE_SYSFS)
-		tmc_etr_dump_hw(drvdata);
+		tmc_etr_sync_sysfs_buf(drvdata);
+
 	tmc_disable_hw(drvdata);

 	CS_LOCK(drvdata->base);
@@ -807,34 +1084,31 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	int ret = 0;
 	bool used = false;
 	unsigned long flags;
-	void __iomem *vaddr = NULL;
-	dma_addr_t paddr;
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+	struct etr_buf *new_buf = NULL, *free_buf = NULL;

 	/*
-	 * If we don't have a buffer release the lock and allocate memory.
-	 * Otherwise keep the lock and move along.
+	 * If we are enabling the ETR from disabled state, we need to make
+	 * sure we have a buffer with the right size. The etr_buf is not reset
+	 * immediately after we stop the tracing in SYSFS mode as we wait for
+	 * the user to collect the data. We may be able to reuse the existing
+	 * buffer, provided the size matches. Any allocation has to be done
+	 * with the lock released.
 	 */
 	spin_lock_irqsave(&drvdata->spinlock, flags);
-	if (!drvdata->vaddr) {
+	if (!drvdata->etr_buf || (drvdata->etr_buf->size != drvdata->size)) {
 		spin_unlock_irqrestore(&drvdata->spinlock, flags);
-
-		/*
-		 * Contiguous memory can't be allocated while a spinlock is
-		 * held. As such allocate memory here and free it if a buffer
-		 * has already been allocated (from a previous session).
-		 */
-		vaddr = dma_alloc_coherent(drvdata->dev, drvdata->size,
-					   &paddr, GFP_KERNEL);
-		if (!vaddr)
-			return -ENOMEM;
+		/* Allocate memory with the spinlock released */
+		free_buf = new_buf = tmc_etr_setup_sysfs_buf(drvdata);
+		if (IS_ERR(new_buf))
+			return PTR_ERR(new_buf);

 		/* Let's try again */
 		spin_lock_irqsave(&drvdata->spinlock, flags);
 	}

-	if (drvdata->reading) {
+	if (drvdata->reading || drvdata->mode == CS_MODE_PERF) {
 		ret = -EBUSY;
 		goto out;
 	}
@@ -842,21 +1116,20 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	/*
 	 * In sysFS mode we can have multiple writers per sink. Since this
 	 * sink is already enabled no memory is needed and the HW need not be
-	 * touched.
+	 * touched, even if the buffer size has changed.
 	 */
 	if (drvdata->mode == CS_MODE_SYSFS)
 		goto out;

 	/*
-	 * If drvdata::buf == NULL, use the memory allocated above.
-	 * Otherwise a buffer still exists from a previous session, so
-	 * simply use that.
+	 * If we don't have a buffer or it doesn't match the requested size,
+	 * use the memory allocated above. Otherwise reuse it.
 	 */
-	if (drvdata->buf == NULL) {
+	if (!drvdata->etr_buf ||
+	    (new_buf && drvdata->etr_buf->size != new_buf->size)) {
 		used = true;
-		drvdata->vaddr = vaddr;
-		drvdata->paddr = paddr;
-		drvdata->buf = drvdata->vaddr;
+		free_buf = drvdata->etr_buf;
+		drvdata->etr_buf = new_buf;
 	}

 	drvdata->mode = CS_MODE_SYSFS;
@@ -865,8 +1138,8 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);

 	/* Free memory outside the spinlock if need be */
-	if (!used && vaddr)
-		dma_free_coherent(drvdata->dev, drvdata->size, vaddr, paddr);
+	if (free_buf)
+		tmc_etr_free_sysfs_buf(free_buf);

 	if (!ret)
 		dev_info(drvdata->dev, "TMC-ETR enabled\n");
@@ -945,8 +1218,8 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata)
 		goto out;
 	}

-	/* If drvdata::buf is NULL the trace data has been read already */
-	if (drvdata->buf == NULL) {
+	/* If drvdata::etr_buf is NULL the trace data has been read already */
+	if (drvdata->etr_buf == NULL) {
 		ret = -EINVAL;
 		goto out;
 	}
@@ -965,8 +1238,7 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata)
 int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
 {
 	unsigned long flags;
-	dma_addr_t paddr;
-	void __iomem *vaddr = NULL;
+	struct etr_buf *etr_buf = NULL;

 	/* config types are set a boot time and never change */
 	if (WARN_ON_ONCE(drvdata->config_type != TMC_CONFIG_TYPE_ETR))
@@ -988,17 +1260,16 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
 		 * The ETR is not tracing and the buffer was just read.
 		 * As such prepare to free the trace buffer.
 		 */
-		vaddr = drvdata->vaddr;
-		paddr = drvdata->paddr;
-		drvdata->buf = drvdata->vaddr = NULL;
+		etr_buf = drvdata->etr_buf;
+		drvdata->etr_buf = NULL;
 	}

 	drvdata->reading = false;
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);

 	/* Free allocated memory out side of the spinlock */
-	if (vaddr)
-		dma_free_coherent(drvdata->dev, drvdata->size, vaddr, paddr);
+	if (etr_buf)
+		tmc_free_etr_buf(etr_buf);

 	return 0;
 }
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 5e49c035a1ac..50ebc17c4645 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -55,6 +55,8 @@
 #define TMC_STS_TMCREADY_BIT	2
 #define TMC_STS_FULL		BIT(0)
 #define TMC_STS_TRIGGERED	BIT(1)
+#define TMC_STS_MEMERR		BIT(5)
+
 /*
  * TMC_AXICTL - 0x110
  *
@@ -134,6 +136,37 @@ enum tmc_mem_intf_width {
 #define CORESIGHT_SOC_600_ETR_CAPS	\
 	(TMC_ETR_SAVE_RESTORE | TMC_ETR_AXI_ARCACHE)

+enum etr_mode {
+	ETR_MODE_FLAT,		/* Uses contiguous flat buffer */
+	ETR_MODE_ETR_SG,	/* Uses in-built TMC ETR SG mechanism */
+};
+
+struct etr_buf_operations;
+
+/**
+ * struct etr_buf - Details of the buffer used by ETR
+ * @mode	: Mode of the ETR buffer, contiguous, Scatter Gather etc.
+ * @full	: Trace data overflow
+ * @size	: Size of the buffer.
+ * @hwaddr	: Address to be programmed in the TMC:DBA{LO,HI}
+ * @vaddr	: Virtual address of the buffer used for trace.
+ * @offset	: Offset of the trace data in the buffer for consumption.
+ * @len		: Available trace data @buf (may round up to the beginning).
+ * @ops		: ETR buffer operations for the mode.
+ * @private	: Backend specific information for the buf
+ */
+struct etr_buf {
+	enum etr_mode			mode;
+	bool				full;
+	ssize_t				size;
+	dma_addr_t			hwaddr;
+	void				*vaddr;
+	unsigned long			offset;
+	u64				len;
+	const struct etr_buf_operations	*ops;
+	void				*private;
+};
+
 /**
  * struct tmc_drvdata - specifics associated to an TMC component
  * @base:	memory mapped base address for this component.
@@ -141,11 +174,10 @@ enum tmc_mem_intf_width {
  * @csdev:	component vitals needed by the framework.
  * @miscdev:	specifics to handle "/dev/xyz.tmc" entry.
  * @spinlock:	only one at a time pls.
- * @buf:	area of memory where trace data get sent.
- * @paddr:	DMA start location in RAM.
- * @vaddr:	virtual representation of @paddr.
- * @size:	trace buffer size.
- * @len:	size of the available trace.
+ * @buf:	Snapshot of the trace data for ETF/ETB.
+ * @etr_buf:	details of buffer used in TMC-ETR
+ * @len:	size of the available trace for ETF/ETB.
+ * @size:	trace buffer size for this TMC (common for all modes).
  * @mode:	how this TMC is being used.
  * @config_type: TMC variant, must be of type @tmc_config_type.
  * @memwidth:	width of the memory interface databus, in bytes.
@@ -160,11 +192,12 @@ struct tmc_drvdata {
 	struct miscdevice	miscdev;
 	spinlock_t		spinlock;
 	bool			reading;
-	char			*buf;
-	dma_addr_t		paddr;
-	void __iomem		*vaddr;
-	u32			size;
+	union {
+		char		*buf;		/* TMC ETB */
+		struct etr_buf	*etr_buf;	/* TMC ETR */
+	};
 	u32			len;
+	u32			size;
 	u32			mode;
 	enum tmc_config_type	config_type;
 	enum tmc_mem_intf_width	memwidth;
@@ -172,6 +205,15 @@ struct tmc_drvdata {
 	u32			etr_caps;
 };

+struct etr_buf_operations {
+	int (*alloc)(struct tmc_drvdata *drvdata, struct etr_buf *etr_buf,
+		     int node, void **pages);
+	void (*sync)(struct etr_buf *etr_buf, u64 rrp, u64 rwp);
+	ssize_t (*get_data)(struct etr_buf *etr_buf, u64 offset, size_t len,
+			    char **bufpp);
+	void (*free)(struct etr_buf *etr_buf);
+};
+
 /**
  * struct tmc_pages - Collection of pages used for SG.
  * @nr_pages:	Number of pages in the list.
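[Editor's note] To show the contract that the etr_buf_operations table quoted above implies, here is a hedged sketch of how a hypothetical extra backend could plug in. ETR_MODE_DEMO and all tmc_etr_*_demo_buf names are invented for illustration, and the kzalloc'd buffer is a placeholder only (it is not DMA-safe):

	static int tmc_etr_alloc_demo_buf(struct tmc_drvdata *drvdata,
					  struct etr_buf *etr_buf, int node,
					  void **pages)
	{
		/* Placeholder allocation; a real backend must provide
		 * memory the ETR can DMA to and set etr_buf->hwaddr. */
		void *vaddr = kzalloc(etr_buf->size, GFP_KERNEL);

		if (!vaddr)
			return -ENOMEM;
		etr_buf->vaddr = vaddr;
		etr_buf->mode = ETR_MODE_DEMO;	/* hypothetical mode */
		etr_buf->private = drvdata;
		return 0;
	}

	static void tmc_etr_free_demo_buf(struct etr_buf *etr_buf)
	{
		kfree(etr_buf->vaddr);
	}

	static const struct etr_buf_operations etr_demo_buf_ops = {
		.alloc	= tmc_etr_alloc_demo_buf,
		.free	= tmc_etr_free_demo_buf,
		/* .sync and .get_data are omitted here; a real backend
		 * needs both, since tmc_sync_etr_buf() WARNs when .sync is
		 * missing and tmc_etr_buf_get_data() calls .get_data
		 * unconditionally. */
	};

The new mode would then be added to enum etr_mode and to the etr_buf_ops[] array so that tmc_etr_mode_alloc_buf() can dispatch to it.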
On Thu, Oct 19, 2017 at 06:15:43PM +0100, Suzuki K Poulose wrote:
At the moment we always use contiguous memory for TMC ETR tracing when used from sysfs. The size of the buffer is fixed at boot time and can only be changed by modifiying the DT. With the introduction of SG support we could support really large buffers in that mode. This patch abstracts the buffer used for ETR to switch between a contiguous buffer or a SG table depending on the availability of the memory.
This also enables the sysfs mode to use the ETR in SG mode depending on configured the trace buffer size. Also, since ETR will use the new infrastructure to manage the buffer, we can get rid of some of the members in the tmc_drvdata and clean up the fields a bit.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
drivers/hwtracing/coresight/coresight-tmc-etr.c | 433 +++++++++++++++++++----- drivers/hwtracing/coresight/coresight-tmc.h | 60 +++- 2 files changed, 403 insertions(+), 90 deletions(-)
[..]
+static void tmc_etr_sync_sg_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp) +{
- long r_offset, w_offset;
- struct etr_sg_table *etr_table = etr_buf->private;
- struct tmc_sg_table *table = etr_table->sg_table;
- r_offset = tmc_sg_get_data_page_offset(table, rrp);
- if (r_offset < 0) {
dev_warn(table->dev, "Unable to map RRP %llx to offset\n",
rrp);
etr_buf->len = 0;
return;
- }
- w_offset = tmc_sg_get_data_page_offset(table, rwp);
- if (w_offset < 0) {
dev_warn(table->dev, "Unable to map RWP %llx to offset\n",
rwp);
dev_warn(table->dev, "Unable to map RWP %llx to offset\n", rwq);
It looks a little better and we respect indentation rules. Same for r_offset.
etr_buf->len = 0;
return;
- }
- etr_buf->offset = r_offset;
- if (etr_buf->full)
etr_buf->len = etr_buf->size;
- else
etr_buf->len = (w_offset < r_offset) ?
etr_buf->size + w_offset - r_offset :
w_offset - r_offset;
- tmc_sg_table_sync_data_range(table, r_offset, etr_buf->len);
+}
+static const struct etr_buf_operations etr_sg_buf_ops = {
- .alloc = tmc_etr_alloc_sg_buf,
- .free = tmc_etr_free_sg_buf,
- .sync = tmc_etr_sync_sg_buf,
- .get_data = tmc_etr_get_data_sg_buf,
+};
+static const struct etr_buf_operations *etr_buf_ops[] = {
- [ETR_MODE_FLAT] = &etr_flat_buf_ops,
- [ETR_MODE_ETR_SG] = &etr_sg_buf_ops,
+};
+static inline int tmc_etr_mode_alloc_buf(int mode,
struct tmc_drvdata *drvdata,
struct etr_buf *etr_buf, int node,
void **pages)
static inline int tmc_etr_mode_alloc_buf(int mode, struct tmc_drvdata *drvdata, struct etr_buf *etr_buf, int node, void **pages)
+{
- int rc;
- switch (mode) {
- case ETR_MODE_FLAT:
- case ETR_MODE_ETR_SG:
rc = etr_buf_ops[mode]->alloc(drvdata, etr_buf, node, pages);
if (!rc)
etr_buf->ops = etr_buf_ops[mode];
return rc;
- default:
return -EINVAL;
- }
+}
+/*
- tmc_alloc_etr_buf: Allocate a buffer use by ETR.
- @drvdata : ETR device details.
- @size : size of the requested buffer.
- @flags : Required properties of the type of buffer.
- @node : Node for memory allocations.
- @pages : An optional list of pages.
- */
+static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata,
ssize_t size, int flags,
int node, void **pages)
Please fix indentation. Also @flags isn't used.
+{
- int rc = -ENOMEM;
- bool has_etr_sg, has_iommu;
- struct etr_buf *etr_buf;
- has_etr_sg = tmc_etr_has_cap(drvdata, TMC_ETR_SG);
- has_iommu = iommu_get_domain_for_dev(drvdata->dev);
- etr_buf = kzalloc(sizeof(*etr_buf), GFP_KERNEL);
- if (!etr_buf)
return ERR_PTR(-ENOMEM);
- etr_buf->size = size;
- /*
* If we have to use an existing list of pages, we cannot reliably
* use a contiguous DMA memory (even if we have an IOMMU). Otherwise,
* we use the contiguous DMA memory if :
* a) The ETR cannot use Scatter-Gather.
* b) if not a, we have an IOMMU backup
Please rework the above sentence.
* c) if none of the above holds, use it for smaller memory (< 1M).
*
* Fallback to available mechanisms.
*
*/
- if (!pages &&
(!has_etr_sg || has_iommu || size < SZ_1M))
rc = tmc_etr_mode_alloc_buf(ETR_MODE_FLAT, drvdata,
etr_buf, node, pages);
- if (rc && has_etr_sg)
rc = tmc_etr_mode_alloc_buf(ETR_MODE_ETR_SG, drvdata,
etr_buf, node, pages);
- if (rc) {
kfree(etr_buf);
return ERR_PTR(rc);
- }
- return etr_buf;
+}
+static void tmc_free_etr_buf(struct etr_buf *etr_buf) +{
- WARN_ON(!etr_buf->ops || !etr_buf->ops->free);
- etr_buf->ops->free(etr_buf);
- kfree(etr_buf);
+}
+/*
- tmc_etr_buf_get_data: Get the pointer the trace data at @offset
- with a maximum of @len bytes.
- Returns: The size of the linear data available @pos, with *bufpp
- updated to point to the buffer.
- */
+static ssize_t tmc_etr_buf_get_data(struct etr_buf *etr_buf,
u64 offset, size_t len, char **bufpp)
+{
- /* Adjust the length to limit this transaction to end of buffer */
- len = (len < (etr_buf->size - offset)) ? len : etr_buf->size - offset;
- return etr_buf->ops->get_data(etr_buf, (u64)offset, len, bufpp);
+}
+static inline s64 +tmc_etr_buf_insert_barrier_packet(struct etr_buf *etr_buf, u64 offset) +{
- ssize_t len;
- char *bufp;
- len = tmc_etr_buf_get_data(etr_buf, offset,
CORESIGHT_BARRIER_PKT_SIZE, &bufp);
- if (WARN_ON(len <= CORESIGHT_BARRIER_PKT_SIZE))
return -EINVAL;
- coresight_insert_barrier_packet(bufp);
- return offset + CORESIGHT_BARRIER_PKT_SIZE;
+}
+/*
- tmc_sync_etr_buf: Sync the trace buffer availability with drvdata.
- Makes sure the trace data is synced to the memory for consumption.
- @etr_buf->offset will hold the offset to the beginning of the trace data
- within the buffer, with @etr_buf->len bytes to consume. @etr_buf->vaddr
- will always point to the beginning of the "trace buffer".
- */
+static void tmc_sync_etr_buf(struct tmc_drvdata *drvdata) +{
- struct etr_buf *etr_buf = drvdata->etr_buf;
- u64 rrp, rwp;
- u32 status;
- rrp = tmc_read_rrp(drvdata);
- rwp = tmc_read_rwp(drvdata);
- status = readl_relaxed(drvdata->base + TMC_STS);
- etr_buf->full = status & TMC_STS_FULL;
- WARN_ON(!etr_buf->ops || !etr_buf->ops->sync);
- etr_buf->ops->sync(etr_buf, rrp, rwp);
- /* Insert barrier packets at the beginning, if there was an overflow */
- if (etr_buf->full)
tmc_etr_buf_insert_barrier_packet(etr_buf, etr_buf->offset);
+}
 static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 {
 	u32 axictl, sts;
+	struct etr_buf *etr_buf = drvdata->etr_buf;
 
 	/* Zero out the memory to help with debug */
-	memset(drvdata->vaddr, 0, drvdata->size);
+	memset(etr_buf->vaddr, 0, etr_buf->size);
 
 	CS_UNLOCK(drvdata->base);
 
 	/* Wait for TMCSReady bit to be set */
 	tmc_wait_for_tmcready(drvdata);
 
-	writel_relaxed(drvdata->size / 4, drvdata->base + TMC_RSZ);
+	writel_relaxed(etr_buf->size / 4, drvdata->base + TMC_RSZ);
 	writel_relaxed(TMC_MODE_CIRCULAR_BUFFER, drvdata->base + TMC_MODE);
 
 	axictl = readl_relaxed(drvdata->base + TMC_AXICTL);
@@ -707,16 +987,22 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 		axictl |= TMC_AXICTL_ARCACHE_OS;
 	}
 
+	if (etr_buf->mode == ETR_MODE_ETR_SG) {
+		if (WARN_ON(!tmc_etr_has_cap(drvdata, TMC_ETR_SG)))
+			return;
+		axictl |= TMC_AXICTL_SCT_GAT_MODE;
+	}
+
 	writel_relaxed(axictl, drvdata->base + TMC_AXICTL);
-	tmc_write_dba(drvdata, drvdata->paddr);
+	tmc_write_dba(drvdata, etr_buf->hwaddr);
 	/*
 	 * If the TMC pointers must be programmed before the session,
 	 * we have to set it properly (i.e, RRP/RWP to base address and
 	 * STS to "not full").
 	 */
 	if (tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE)) {
-		tmc_write_rrp(drvdata, drvdata->paddr);
-		tmc_write_rwp(drvdata, drvdata->paddr);
+		tmc_write_rrp(drvdata, etr_buf->hwaddr);
+		tmc_write_rwp(drvdata, etr_buf->hwaddr);
 		sts = readl_relaxed(drvdata->base + TMC_STS) & ~TMC_STS_FULL;
 		writel_relaxed(sts, drvdata->base + TMC_STS);
 	}
@@ -732,62 +1018,52 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata)
 }
 
 /*
- * Return the available trace data in the buffer @pos, with a maximum
- * limit of @len, also updating the @bufpp on where to find it.
+ * Return the available trace data in the buffer (starts at etr_buf->offset,
+ * limited by etr_buf->len) from @pos, with a maximum limit of @len,
+ * also updating the @bufpp on where to find it. Since the trace data
+ * starts at anywhere in the buffer, depending on the RRP, we adjust the
+ * @len returned to handle buffer wrapping around.
  */
 ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata,
 		loff_t pos, size_t len, char **bufpp)
Please fix indentation
 {
-	char *bufp = drvdata->buf + pos;
-	char *bufend = (char *)(drvdata->vaddr + drvdata->size);
-
-	/* Adjust the len to available size @pos */
-	if (pos + len > drvdata->len)
-		len = drvdata->len - pos;
+	s64 offset;
+	struct etr_buf *etr_buf = drvdata->etr_buf;
+
+	if (pos + len > etr_buf->len)
+		len = etr_buf->len - pos;
 
 	if (len <= 0)
 		return len;
 
-	/*
-	 * Since we use a circular buffer, with trace data starting
-	 * @drvdata->buf, possibly anywhere in the buffer @drvdata->vaddr,
-	 * wrap the current @pos to within the buffer.
-	 */
-	if (bufp >= bufend)
-		bufp -= drvdata->size;
-	/*
-	 * For simplicity, avoid copying over a wrapped around buffer.
-	 */
-	if ((bufp + len) > bufend)
-		len = bufend - bufp;
-	*bufpp = bufp;
-	return len;
+	/* Compute the offset from which we read the data */
+	offset = etr_buf->offset + pos;
+	if (offset >= etr_buf->size)
+		offset -= etr_buf->size;
+	return tmc_etr_buf_get_data(etr_buf, offset, len, bufpp);
 }
 
-static void tmc_etr_dump_hw(struct tmc_drvdata *drvdata)
+static struct etr_buf *
+tmc_etr_setup_sysfs_buf(struct tmc_drvdata *drvdata)
 {
-	u32 val;
-	u64 rwp;
+	return tmc_alloc_etr_buf(drvdata, drvdata->size, 0,
+			cpu_to_node(0), NULL);
Indentation
+}
 
-	rwp = tmc_read_rwp(drvdata);
-	val = readl_relaxed(drvdata->base + TMC_STS);
+static void
+tmc_etr_free_sysfs_buf(struct etr_buf *buf)
+{
+	if (buf)
+		tmc_free_etr_buf(buf);
+}
 
-	/*
-	 * Adjust the buffer to point to the beginning of the trace data
-	 * and update the available trace data.
-	 */
-	if (val & TMC_STS_FULL) {
-		drvdata->buf = drvdata->vaddr + rwp - drvdata->paddr;
-		drvdata->len = drvdata->size;
-		coresight_insert_barrier_packet(drvdata->buf);
-	} else {
-		drvdata->buf = drvdata->vaddr;
-		drvdata->len = rwp - drvdata->paddr;
-	}
+static void tmc_etr_sync_sysfs_buf(struct tmc_drvdata *drvdata)
+{
+	tmc_sync_etr_buf(drvdata);
 }
 
 static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
 {
 	CS_UNLOCK(drvdata->base);
 
 	tmc_flush_and_stop(drvdata);
@@ -796,7 +1072,8 @@ static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
 	 * read before the TMC is disabled.
 	 */
 	if (drvdata->mode == CS_MODE_SYSFS)
-		tmc_etr_dump_hw(drvdata);
+		tmc_etr_sync_sysfs_buf(drvdata);
+
 	tmc_disable_hw(drvdata);
 
 	CS_LOCK(drvdata->base);
@@ -807,34 +1084,31 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	int ret = 0;
 	bool used = false;
 	unsigned long flags;
-	void __iomem *vaddr = NULL;
-	dma_addr_t paddr;
 	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+	struct etr_buf *new_buf = NULL, *free_buf = NULL;
 
 	/*
-	 * If we don't have a buffer release the lock and allocate memory.
-	 * Otherwise keep the lock and move along.
+	 * If we are enabling the ETR from disabled state, we need to make
+	 * sure we have a buffer with the right size. The etr_buf is not reset
+	 * immediately after we stop the tracing in SYSFS mode as we wait for
+	 * the user to collect the data. We may be able to reuse the existing
+	 * buffer, provided the size matches. Any allocation has to be done
+	 * with the lock released.
 	 */
 	spin_lock_irqsave(&drvdata->spinlock, flags);
-	if (!drvdata->vaddr) {
+	if (!drvdata->etr_buf || (drvdata->etr_buf->size != drvdata->size)) {
 		spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
-		/*
-		 * Contiguous memory can't be allocated while a spinlock is
-		 * held. As such allocate memory here and free it if a buffer
-		 * has already been allocated (from a previous session).
-		 */
-		vaddr = dma_alloc_coherent(drvdata->dev, drvdata->size,
-					   &paddr, GFP_KERNEL);
-		if (!vaddr)
-			return -ENOMEM;
+		/* Allocate memory with the spinlock released */
+		free_buf = new_buf = tmc_etr_setup_sysfs_buf(drvdata);
+		if (IS_ERR(new_buf))
+			return PTR_ERR(new_buf);
 
 		/* Let's try again */
 		spin_lock_irqsave(&drvdata->spinlock, flags);
 	}
 
-	if (drvdata->reading) {
+	if (drvdata->reading || drvdata->mode == CS_MODE_PERF) {
 		ret = -EBUSY;
 		goto out;
 	}
@@ -842,21 +1116,20 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	/*
 	 * In sysFS mode we can have multiple writers per sink. Since this
 	 * sink is already enabled no memory is needed and the HW need not be
-	 * touched.
+	 * touched, even if the buffer size has changed.
 	 */
 	if (drvdata->mode == CS_MODE_SYSFS)
 		goto out;
 
 	/*
-	 * If drvdata::buf == NULL, use the memory allocated above.
-	 * Otherwise a buffer still exists from a previous session, so
-	 * simply use that.
+	 * If we don't have a buffer or it doesn't match the requested size,
+	 * use the memory allocated above. Otherwise reuse it.
 	 */
-	if (drvdata->buf == NULL) {
+	if (!drvdata->etr_buf ||
+	    (new_buf && drvdata->etr_buf->size != new_buf->size)) {
 		used = true;
-		drvdata->vaddr = vaddr;
-		drvdata->paddr = paddr;
-		drvdata->buf = drvdata->vaddr;
+		free_buf = drvdata->etr_buf;
+		drvdata->etr_buf = new_buf;
 	}
 
 	drvdata->mode = CS_MODE_SYSFS;
@@ -865,8 +1138,8 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 	/* Free memory outside the spinlock if need be */
-	if (!used && vaddr)
-		dma_free_coherent(drvdata->dev, drvdata->size, vaddr, paddr);
+	if (free_buf)
+		tmc_etr_free_sysfs_buf(free_buf);
 
 	if (!ret)
 		dev_info(drvdata->dev, "TMC-ETR enabled\n");
@@ -945,8 +1218,8 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata)
 		goto out;
 	}
-	/* If drvdata::buf is NULL the trace data has been read already */
-	if (drvdata->buf == NULL) {
+	/* If drvdata::etr_buf is NULL the trace data has been read already */
+	if (drvdata->etr_buf == NULL) {
 		ret = -EINVAL;
 		goto out;
 	}
@@ -965,8 +1238,7 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata)
 int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
 {
 	unsigned long flags;
-	dma_addr_t paddr;
-	void __iomem *vaddr = NULL;
+	struct etr_buf *etr_buf = NULL;
 
 	/* config types are set a boot time and never change */
 	if (WARN_ON_ONCE(drvdata->config_type != TMC_CONFIG_TYPE_ETR))
@@ -988,17 +1260,16 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata)
 		 * The ETR is not tracing and the buffer was just read.
 		 * As such prepare to free the trace buffer.
 		 */
-		vaddr = drvdata->vaddr;
-		paddr = drvdata->paddr;
-		drvdata->buf = drvdata->vaddr = NULL;
+		etr_buf = drvdata->etr_buf;
+		drvdata->etr_buf = NULL;
 	}
 
 	drvdata->reading = false;
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 	/* Free allocated memory out side of the spinlock */
-	if (vaddr)
-		dma_free_coherent(drvdata->dev, drvdata->size, vaddr, paddr);
+	if (etr_buf)
+		tmc_free_etr_buf(etr_buf);
 
 	return 0;
 }
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 5e49c035a1ac..50ebc17c4645 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -55,6 +55,8 @@
 #define TMC_STS_TMCREADY_BIT	2
 #define TMC_STS_FULL		BIT(0)
 #define TMC_STS_TRIGGERED	BIT(1)
+#define TMC_STS_MEMERR		BIT(5)
 
 /*
  * TMC_AXICTL - 0x110
@@ -134,6 +136,37 @@ enum tmc_mem_intf_width {
 #define CORESIGHT_SOC_600_ETR_CAPS	\
 	(TMC_ETR_SAVE_RESTORE | TMC_ETR_AXI_ARCACHE)
 
+enum etr_mode {
+	ETR_MODE_FLAT,		/* Uses contiguous flat buffer */
+	ETR_MODE_ETR_SG,	/* Uses in-built TMC ETR SG mechanism */
+};
+
+struct etr_buf_operations;
+
+/**
+ * struct etr_buf - Details of the buffer used by ETR
+ * @mode	: Mode of the ETR buffer, contiguous, Scatter Gather etc.
+ * @full	: Trace data overflow
+ * @size	: Size of the buffer.
+ * @hwaddr	: Address to be programmed in the TMC:DBA{LO,HI}
+ * @vaddr	: Virtual address of the buffer used for trace.
+ * @offset	: Offset of the trace data in the buffer for consumption.
+ * @len		: Available trace data @buf (may round up to the beginning).
+ * @ops		: ETR buffer operations for the mode.
+ * @private	: Backend specific information for the buf
+ */
+struct etr_buf {
+	enum etr_mode		mode;
+	bool			full;
+	ssize_t			size;
+	dma_addr_t		hwaddr;
+	void			*vaddr;
+	unsigned long		offset;
+	u64			len;
+	const struct etr_buf_operations	*ops;
+	void			*private;
+};
 /**
  * struct tmc_drvdata - specifics associated to an TMC component
  * @base:	memory mapped base address for this component.
@@ -141,11 +174,10 @@ enum tmc_mem_intf_width {
  * @csdev:	component vitals needed by the framework.
  * @miscdev:	specifics to handle "/dev/xyz.tmc" entry.
  * @spinlock:	only one at a time pls.
- * @buf:	area of memory where trace data get sent.
- * @paddr:	DMA start location in RAM.
- * @vaddr:	virtual representation of @paddr.
- * @size:	trace buffer size.
- * @len:	size of the available trace.
+ * @buf:	Snapshot of the trace data for ETF/ETB.
+ * @etr_buf:	details of buffer used in TMC-ETR
+ * @len:	size of the available trace for ETF/ETB.
+ * @size:	trace buffer size for this TMC (common for all modes).
  * @mode:	how this TMC is being used.
  * @config_type: TMC variant, must be of type @tmc_config_type.
  * @memwidth:	width of the memory interface databus, in bytes.
@@ -160,11 +192,12 @@ struct tmc_drvdata {
 	struct miscdevice	miscdev;
 	spinlock_t		spinlock;
 	bool			reading;
-	char			*buf;
-	dma_addr_t		paddr;
-	void __iomem		*vaddr;
-	u32			size;
+	union {
+		char		*buf;		/* TMC ETB */
+		struct etr_buf	*etr_buf;	/* TMC ETR */
+	};
 	u32			len;
+	u32			size;
 	u32			mode;
 	enum tmc_config_type	config_type;
 	enum tmc_mem_intf_width	memwidth;
@@ -172,6 +205,15 @@ struct tmc_drvdata {
 	u32			etr_caps;
 };
 
+struct etr_buf_operations {
+	int (*alloc)(struct tmc_drvdata *drvdata, struct etr_buf *etr_buf,
+		     int node, void **pages);
+	void (*sync)(struct etr_buf *etr_buf, u64 rrp, u64 rwp);
+	ssize_t (*get_data)(struct etr_buf *etr_buf, u64 offset, size_t len,
+			    char **bufpp);
+	void (*free)(struct etr_buf *etr_buf);
+};
+
 /**
  * struct tmc_pages - Collection of pages used for SG.
  * @nr_pages:	Number of pages in the list.
-- 
2.13.6
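To make the new hooks concrete: a backend advertises itself by filling an etr_buf_operations table, which tmc_alloc_etr_buf()/tmc_etr_mode_alloc_buf() then dispatch on. A sketch of the shape of such a registration for a flat DMA backend (the example_ helpers are illustrative and hypothetical; the real implementations live in coresight-tmc-etr.c):

	static int example_alloc_flat_buf(struct tmc_drvdata *drvdata,
					  struct etr_buf *etr_buf, int node,
					  void **pages)
	{
		/* Reserve contiguous DMA memory and record it in the etr_buf */
		etr_buf->vaddr = dma_alloc_coherent(drvdata->dev, etr_buf->size,
						    &etr_buf->hwaddr, GFP_KERNEL);
		return etr_buf->vaddr ? 0 : -ENOMEM;
	}

	static const struct etr_buf_operations example_flat_ops = {
		.alloc		= example_alloc_flat_buf,
		.sync		= example_sync_flat_buf,	/* as sketched in the earlier note */
		.get_data	= example_flat_get_data,	/* hypothetical */
		.free		= example_free_flat_buf,	/* hypothetical */
	};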
On 02/11/17 17:48, Mathieu Poirier wrote:
On Thu, Oct 19, 2017 at 06:15:43PM +0100, Suzuki K Poulose wrote:
At the moment we always use contiguous memory for TMC ETR tracing when used from sysfs. The size of the buffer is fixed at boot time and can only be changed by modifying the DT. With the introduction of SG support we could support really large buffers in that mode. This patch abstracts the buffer used for ETR to switch between a contiguous buffer or a SG table depending on the availability of the memory.
This also enables the sysfs mode to use the ETR in SG mode depending on the configured trace buffer size. Also, since ETR will use the new infrastructure to manage the buffer, we can get rid of some of the members in tmc_drvdata and clean up the fields a bit.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
drivers/hwtracing/coresight/coresight-tmc-etr.c | 433 +++++++++++++++++++----- drivers/hwtracing/coresight/coresight-tmc.h | 60 +++- 2 files changed, 403 insertions(+), 90 deletions(-)
[..]
+static void tmc_etr_sync_sg_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp)

[..]

+	w_offset = tmc_sg_get_data_page_offset(table, rwp);
+	if (w_offset < 0) {
+		dev_warn(table->dev, "Unable to map RWP %llx to offset\n",
+			rwp);

	dev_warn(table->dev, "Unable to map RWP %llx to offset\n", rwp);

It looks a little better and we respect indentation rules. Same for r_offset.
+static inline int tmc_etr_mode_alloc_buf(int mode,
+				struct tmc_drvdata *drvdata,
+				struct etr_buf *etr_buf, int node,
+				void **pages)

static inline int tmc_etr_mode_alloc_buf(int mode, struct tmc_drvdata *drvdata,
					 struct etr_buf *etr_buf, int node,
					 void **pages)
+/*
+ * tmc_alloc_etr_buf: Allocate a buffer used by ETR.
+ * @drvdata	: ETR device details.
+ * @size	: size of the requested buffer.
+ * @flags	: Required properties of the type of buffer.
+ * @node	: Node for memory allocations.
+ * @pages	: An optional list of pages.
+ */
+static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata,
+		ssize_t size, int flags,
+		int node, void **pages)
Please fix indentation. Also @flags isn't used.
Yep, flags is only used later and I can move it to the patch where we use it.
+{
+	int rc = -ENOMEM;
+	bool has_etr_sg, has_iommu;
+	struct etr_buf *etr_buf;
+
+	has_etr_sg = tmc_etr_has_cap(drvdata, TMC_ETR_SG);
+	has_iommu = iommu_get_domain_for_dev(drvdata->dev);
+
+	etr_buf = kzalloc(sizeof(*etr_buf), GFP_KERNEL);
+	if (!etr_buf)
+		return ERR_PTR(-ENOMEM);
+
+	etr_buf->size = size;
+
+	/*
+	 * If we have to use an existing list of pages, we cannot reliably
+	 * use a contiguous DMA memory (even if we have an IOMMU). Otherwise,
+	 * we use the contiguous DMA memory if :
+	 * a) The ETR cannot use Scatter-Gather.
+	 * b) if not a, we have an IOMMU backup
+	 */
Please rework the above sentence.
How about : b) if (a) is not true and we have an IOMMU connected to the ETR.
I will address the other comments on indentation.
Thanks for the detailed look
Cheers Suzuki
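Pulling the discussion together, the selection policy under review amounts to roughly the following (an editorial paraphrase of the logic in the patch, not the patch verbatim; the SG fallback matches the tail of tmc_alloc_etr_buf() quoted earlier in the thread):

	/* Prefer contiguous ("flat") DMA memory unless pages were supplied */
	if (!pages && (!has_etr_sg || has_iommu))
		rc = tmc_etr_mode_alloc_buf(ETR_MODE_FLAT, drvdata,
					    etr_buf, node, pages);
	/* Otherwise, or if the flat allocation failed, fall back to ETR SG */
	if (rc && has_etr_sg)
		rc = tmc_etr_mode_alloc_buf(ETR_MODE_ETR_SG, drvdata,
					    etr_buf, node, pages);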
On 3 November 2017 at 04:02, Suzuki K Poulose Suzuki.Poulose@arm.com wrote:
On 02/11/17 17:48, Mathieu Poirier wrote:
On Thu, Oct 19, 2017 at 06:15:43PM +0100, Suzuki K Poulose wrote:
[..]

+static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata,
+		ssize_t size, int flags,
+		int node, void **pages)
Please fix indentation. Also @flags isn't used.
Yep, flags is only used later and I can move it to the patch where we use it.

Ok, I haven't made it that far yet. If it's used later on just leave it as it is.
[..]

+	 * we use the contiguous DMA memory if :
+	 * a) The ETR cannot use Scatter-Gather.
+	 * b) if not a, we have an IOMMU backup
+	 */
Please rework the above sentence.
How about : b) if (a) is not true and we have an IOMMU connected to the ETR.
I'm good with that.
I will address the other comments on indentation.
Thanks for the detailed look
Cheers Suzuki
Now that we can dynamically switch between contiguous memory and SG table depending on the trace buffer size, provide the support for selecting an appropriate buffer size.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- .../ABI/testing/sysfs-bus-coresight-devices-tmc | 8 ++++++ drivers/hwtracing/coresight/coresight-tmc.c | 32 ++++++++++++++++++++++ 2 files changed, 40 insertions(+)
diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
index 4fe677ed1305..3675c380caf8 100644
--- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
+++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
@@ -83,3 +83,11 @@ KernelVersion:	4.7
 Contact:	Mathieu Poirier mathieu.poirier@linaro.org
 Description:	(R) Indicates the capabilities of the Coresight TMC.
 		The value is read directly from the DEVID register, 0xFC8,
+
+What:		/sys/bus/coresight/devices/<memory_map>.tmc/buffer-size
+Date:		September 2017
+KernelVersion:	4.15
+Contact:	Mathieu Poirier mathieu.poirier@linaro.org
+Description:	(RW) Size of the trace buffer for TMC-ETR when used in SYSFS
+		mode. Writable only for TMC-ETR configurations. The value
+		should be aligned to the kernel pagesize.
diff --git a/drivers/hwtracing/coresight/coresight-tmc.c b/drivers/hwtracing/coresight/coresight-tmc.c
index c7201e40d737..2349b1805694 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.c
+++ b/drivers/hwtracing/coresight/coresight-tmc.c
@@ -283,8 +283,40 @@ static ssize_t trigger_cntr_store(struct device *dev,
 }
 static DEVICE_ATTR_RW(trigger_cntr);
 
+static ssize_t buffer_size_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct tmc_drvdata *drvdata = dev_get_drvdata(dev->parent);
+
+	return sprintf(buf, "%#x\n", drvdata->size);
+}
+
+static ssize_t buffer_size_store(struct device *dev,
+		struct device_attribute *attr,
+		const char *buf, size_t size)
+{
+	int ret;
+	unsigned long val;
+	struct tmc_drvdata *drvdata = dev_get_drvdata(dev->parent);
+
+	if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
+		return -EPERM;
+
+	ret = kstrtoul(buf, 0, &val);
+	if (ret)
+		return ret;
+	/* The buffer size should be page aligned */
+	if (val & (PAGE_SIZE - 1))
+		return -EINVAL;
+	drvdata->size = val;
+	return size;
+}
+
+static DEVICE_ATTR_RW(buffer_size);
+
 static struct attribute *coresight_tmc_attrs[] = {
 	&dev_attr_trigger_cntr.attr,
+	&dev_attr_buffer_size.attr,
 	NULL,
 };
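As a usage note (illustrative; the device name is platform specific, and note the attribute file created by DEVICE_ATTR_RW() above is spelled "buffer_size", while the ABI entry above says "buffer-size"):

	# request a 16MB sysfs-mode trace buffer; the value must be page aligned
	echo 0x1000000 > /sys/bus/coresight/devices/<memory_map>.tmc/buffer_size
	cat /sys/bus/coresight/devices/<memory_map>.tmc/buffer_size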
On Thu, Oct 19, 2017 at 06:15:44PM +0100, Suzuki K Poulose wrote:
[..]
diff --git a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
index 4fe677ed1305..3675c380caf8 100644
--- a/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
+++ b/Documentation/ABI/testing/sysfs-bus-coresight-devices-tmc
@@ -83,3 +83,11 @@ KernelVersion:	4.7
 Contact:	Mathieu Poirier mathieu.poirier@linaro.org
 Description:	(R) Indicates the capabilities of the Coresight TMC.
 		The value is read directly from the DEVID register, 0xFC8,
+
+What:		/sys/bus/coresight/devices/<memory_map>.tmc/buffer-size
+Date:		September 2017
+KernelVersion:	4.15
More like 4.16 now.
+Contact:	Mathieu Poirier mathieu.poirier@linaro.org
+Description:	(RW) Size of the trace buffer for TMC-ETR when used in SYSFS
+		mode. Writable only for TMC-ETR configurations. The value
+		should be aligned to the kernel pagesize.

diff --git a/drivers/hwtracing/coresight/coresight-tmc.c b/drivers/hwtracing/coresight/coresight-tmc.c
index c7201e40d737..2349b1805694 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.c
+++ b/drivers/hwtracing/coresight/coresight-tmc.c
@@ -283,8 +283,40 @@ static ssize_t trigger_cntr_store(struct device *dev,
 }
 static DEVICE_ATTR_RW(trigger_cntr);

+static ssize_t buffer_size_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct tmc_drvdata *drvdata = dev_get_drvdata(dev->parent);
+
+	return sprintf(buf, "%#x\n", drvdata->size);
+}
+
+static ssize_t buffer_size_store(struct device *dev,
+		struct device_attribute *attr,
+		const char *buf, size_t size)
Indentation (I know trigger_cntr_store() is wrong).
+{
+	int ret;
+	unsigned long val;
+	struct tmc_drvdata *drvdata = dev_get_drvdata(dev->parent);
+
+	if (drvdata->config_type != TMC_CONFIG_TYPE_ETR)
+		return -EPERM;
I think -EINVAL would be more appropriate but definitely not a big deal.
+	ret = kstrtoul(buf, 0, &val);
+	if (ret)
+		return ret;
+	/* The buffer size should be page aligned */
+	if (val & (PAGE_SIZE - 1))
+		return -EINVAL;
+	drvdata->size = val;
+	return size;
+}
+
+static DEVICE_ATTR_RW(buffer_size);
+
 static struct attribute *coresight_tmc_attrs[] = {
 	&dev_attr_trigger_cntr.attr,
+	&dev_attr_buffer_size.attr,
 	NULL,
 };
-- 
2.13.6
Convert component enable/disable messages from dev_info to dev_dbg. This is required to prevent LOCKDEP splats when operating in perf mode, where we could be called with locks held to enable a coresight path. If someone really wants to see the messages, they can always enable them at runtime via dynamic_debug.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-dynamic-replicator.c | 4 ++-- drivers/hwtracing/coresight/coresight-etb10.c | 6 +++--- drivers/hwtracing/coresight/coresight-etm3x.c | 4 ++-- drivers/hwtracing/coresight/coresight-etm4x.c | 4 ++-- drivers/hwtracing/coresight/coresight-funnel.c | 4 ++-- drivers/hwtracing/coresight/coresight-replicator.c | 4 ++-- drivers/hwtracing/coresight/coresight-stm.c | 4 ++-- drivers/hwtracing/coresight/coresight-tmc-etf.c | 8 ++++---- drivers/hwtracing/coresight/coresight-tmc-etr.c | 4 ++-- drivers/hwtracing/coresight/coresight-tmc.c | 4 ++-- drivers/hwtracing/coresight/coresight-tpiu.c | 4 ++-- 11 files changed, 25 insertions(+), 25 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-dynamic-replicator.c b/drivers/hwtracing/coresight/coresight-dynamic-replicator.c
index accc2056f7c6..49efa9d90367 100644
--- a/drivers/hwtracing/coresight/coresight-dynamic-replicator.c
+++ b/drivers/hwtracing/coresight/coresight-dynamic-replicator.c
@@ -64,7 +64,7 @@ static int replicator_enable(struct coresight_device *csdev, int inport,
 
 	CS_LOCK(drvdata->base);
 
-	dev_info(drvdata->dev, "REPLICATOR enabled\n");
+	dev_dbg(drvdata->dev, "REPLICATOR enabled\n");
 	return 0;
 }
 
@@ -83,7 +83,7 @@ static void replicator_disable(struct coresight_device *csdev, int inport,
 
 	CS_LOCK(drvdata->base);
 
-	dev_info(drvdata->dev, "REPLICATOR disabled\n");
+	dev_dbg(drvdata->dev, "REPLICATOR disabled\n");
 }
 
 static const struct coresight_ops_link replicator_link_ops = {
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c
index d7164ab8e229..757f556975f7 100644
--- a/drivers/hwtracing/coresight/coresight-etb10.c
+++ b/drivers/hwtracing/coresight/coresight-etb10.c
@@ -164,7 +164,7 @@ static int etb_enable(struct coresight_device *csdev, u32 mode)
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
 out:
-	dev_info(drvdata->dev, "ETB enabled\n");
+	dev_dbg(drvdata->dev, "ETB enabled\n");
 	return 0;
 }
 
@@ -270,7 +270,7 @@ static void etb_disable(struct coresight_device *csdev)
 
 	local_set(&drvdata->mode, CS_MODE_DISABLED);
 
-	dev_info(drvdata->dev, "ETB disabled\n");
+	dev_dbg(drvdata->dev, "ETB disabled\n");
 }
 
 static void *etb_alloc_buffer(struct coresight_device *csdev, int cpu,
@@ -513,7 +513,7 @@ static void etb_dump(struct etb_drvdata *drvdata)
 	}
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
-	dev_info(drvdata->dev, "ETB dumped\n");
+	dev_dbg(drvdata->dev, "ETB dumped\n");
 }
 
 static int etb_open(struct inode *inode, struct file *file)
diff --git a/drivers/hwtracing/coresight/coresight-etm3x.c b/drivers/hwtracing/coresight/coresight-etm3x.c
index e5b1ec57dbde..aa8a2b076ad4 100644
--- a/drivers/hwtracing/coresight/coresight-etm3x.c
+++ b/drivers/hwtracing/coresight/coresight-etm3x.c
@@ -510,7 +510,7 @@ static int etm_enable_sysfs(struct coresight_device *csdev)
 	drvdata->sticky_enable = true;
 	spin_unlock(&drvdata->spinlock);
 
-	dev_info(drvdata->dev, "ETM tracing enabled\n");
+	dev_dbg(drvdata->dev, "ETM tracing enabled\n");
 	return 0;
 
 err:
@@ -613,7 +613,7 @@ static void etm_disable_sysfs(struct coresight_device *csdev)
 	spin_unlock(&drvdata->spinlock);
 	cpus_read_unlock();
 
-	dev_info(drvdata->dev, "ETM tracing disabled\n");
+	dev_dbg(drvdata->dev, "ETM tracing disabled\n");
 }
 
 static void etm_disable(struct coresight_device *csdev,
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c
index e84d80b008fc..c9c73c2f7fd8 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x.c
@@ -274,7 +274,7 @@ static int etm4_enable_sysfs(struct coresight_device *csdev)
 	drvdata->sticky_enable = true;
 	spin_unlock(&drvdata->spinlock);
 
-	dev_info(drvdata->dev, "ETM tracing enabled\n");
+	dev_dbg(drvdata->dev, "ETM tracing enabled\n");
 	return 0;
 
 err:
@@ -387,7 +387,7 @@ static void etm4_disable_sysfs(struct coresight_device *csdev)
 	spin_unlock(&drvdata->spinlock);
 	cpus_read_unlock();
 
-	dev_info(drvdata->dev, "ETM tracing disabled\n");
+	dev_dbg(drvdata->dev, "ETM tracing disabled\n");
 }
 
 static void etm4_disable(struct coresight_device *csdev,
diff --git a/drivers/hwtracing/coresight/coresight-funnel.c b/drivers/hwtracing/coresight/coresight-funnel.c
index 77642e0e955b..afdf4807c2dc 100644
--- a/drivers/hwtracing/coresight/coresight-funnel.c
+++ b/drivers/hwtracing/coresight/coresight-funnel.c
@@ -72,7 +72,7 @@ static int funnel_enable(struct coresight_device *csdev, int inport,
 
 	funnel_enable_hw(drvdata, inport);
 
-	dev_info(drvdata->dev, "FUNNEL inport %d enabled\n", inport);
+	dev_dbg(drvdata->dev, "FUNNEL inport %d enabled\n", inport);
 	return 0;
 }
 
@@ -96,7 +96,7 @@ static void funnel_disable(struct coresight_device *csdev, int inport,
 
 	funnel_disable_hw(drvdata, inport);
 
-	dev_info(drvdata->dev, "FUNNEL inport %d disabled\n", inport);
+	dev_dbg(drvdata->dev, "FUNNEL inport %d disabled\n", inport);
 }
 
 static const struct coresight_ops_link funnel_link_ops = {
diff --git a/drivers/hwtracing/coresight/coresight-replicator.c b/drivers/hwtracing/coresight/coresight-replicator.c
index 3756e71cb8f5..4f7781203fd4 100644
--- a/drivers/hwtracing/coresight/coresight-replicator.c
+++ b/drivers/hwtracing/coresight/coresight-replicator.c
@@ -42,7 +42,7 @@ static int replicator_enable(struct coresight_device *csdev, int inport,
 {
 	struct replicator_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
-	dev_info(drvdata->dev, "REPLICATOR enabled\n");
+	dev_dbg(drvdata->dev, "REPLICATOR enabled\n");
 	return 0;
 }
 
@@ -51,7 +51,7 @@ static void replicator_disable(struct coresight_device *csdev, int inport,
 {
 	struct replicator_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
 
-	dev_info(drvdata->dev, "REPLICATOR disabled\n");
+	dev_dbg(drvdata->dev, "REPLICATOR disabled\n");
 }
 
 static const struct coresight_ops_link replicator_link_ops = {
diff --git a/drivers/hwtracing/coresight/coresight-stm.c b/drivers/hwtracing/coresight/coresight-stm.c
index 92a780a6df1d..696455891ec4 100644
--- a/drivers/hwtracing/coresight/coresight-stm.c
+++ b/drivers/hwtracing/coresight/coresight-stm.c
@@ -218,7 +218,7 @@ static int stm_enable(struct coresight_device *csdev,
 	stm_enable_hw(drvdata);
 	spin_unlock(&drvdata->spinlock);
 
-	dev_info(drvdata->dev, "STM tracing enabled\n");
+	dev_dbg(drvdata->dev, "STM tracing enabled\n");
 	return 0;
 }
 
@@ -281,7 +281,7 @@ static void stm_disable(struct coresight_device *csdev,
 		pm_runtime_put(drvdata->dev);
 
 		local_set(&drvdata->mode, CS_MODE_DISABLED);
-		dev_info(drvdata->dev, "STM tracing disabled\n");
+		dev_dbg(drvdata->dev, "STM tracing disabled\n");
 	}
 }
 
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index d89bfb3042a2..aa4e8f03ef49 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -242,7 +242,7 @@ static int tmc_enable_etf_sink(struct coresight_device *csdev, u32 mode)
 	if (ret)
 		return ret;
 
-	dev_info(drvdata->dev, "TMC-ETB/ETF enabled\n");
+	dev_dbg(drvdata->dev, "TMC-ETB/ETF enabled\n");
 	return 0;
 }
 
@@ -265,7 +265,7 @@ static void tmc_disable_etf_sink(struct coresight_device *csdev)
 
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
-	dev_info(drvdata->dev, "TMC-ETB/ETF disabled\n");
+	dev_dbg(drvdata->dev, "TMC-ETB/ETF disabled\n");
 }
 
 static int tmc_enable_etf_link(struct coresight_device *csdev,
@@ -284,7 +284,7 @@ static int tmc_enable_etf_link(struct coresight_device *csdev,
 	drvdata->mode = CS_MODE_SYSFS;
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
-	dev_info(drvdata->dev, "TMC-ETF enabled\n");
+	dev_dbg(drvdata->dev, "TMC-ETF enabled\n");
 	return 0;
 }
 
@@ -304,7 +304,7 @@ static void tmc_disable_etf_link(struct coresight_device *csdev,
 	drvdata->mode = CS_MODE_DISABLED;
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
-	dev_info(drvdata->dev, "TMC-ETF disabled\n");
+	dev_dbg(drvdata->dev, "TMC-ETF disabled\n");
 }
 
 static void *tmc_alloc_etf_buffer(struct coresight_device *csdev, int cpu,
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 9e41eeaa5284..f12b7c5f68b2 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -1142,7 +1142,7 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
 		tmc_etr_free_sysfs_buf(free_buf);
 
 	if (!ret)
-		dev_info(drvdata->dev, "TMC-ETR enabled\n");
+		dev_dbg(drvdata->dev, "TMC-ETR enabled\n");
 
 	return ret;
 }
@@ -1185,7 +1185,7 @@ static void tmc_disable_etr_sink(struct coresight_device *csdev)
 
 	spin_unlock_irqrestore(&drvdata->spinlock, flags);
 
-	dev_info(drvdata->dev, "TMC-ETR disabled\n");
+	dev_dbg(drvdata->dev, "TMC-ETR disabled\n");
 }
 
 static const struct coresight_ops_sink tmc_etr_sink_ops = {
diff --git a/drivers/hwtracing/coresight/coresight-tmc.c b/drivers/hwtracing/coresight/coresight-tmc.c
index 2349b1805694..4939333cc6c7 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.c
+++ b/drivers/hwtracing/coresight/coresight-tmc.c
@@ -88,7 +88,7 @@ static int tmc_read_prepare(struct tmc_drvdata *drvdata)
 	}
 
 	if (!ret)
-		dev_info(drvdata->dev, "TMC read start\n");
+		dev_dbg(drvdata->dev, "TMC read start\n");
 
 	return ret;
 }
@@ -110,7 +110,7 @@ static int tmc_read_unprepare(struct tmc_drvdata *drvdata)
 	}
 
 	if (!ret)
-		dev_info(drvdata->dev, "TMC read end\n");
+		dev_dbg(drvdata->dev, "TMC read end\n");
 
 	return ret;
 }
diff --git a/drivers/hwtracing/coresight/coresight-tpiu.c b/drivers/hwtracing/coresight/coresight-tpiu.c
index d7a3e453016d..7b105001dc32 100644
--- a/drivers/hwtracing/coresight/coresight-tpiu.c
+++ b/drivers/hwtracing/coresight/coresight-tpiu.c
@@ -77,7 +77,7 @@ static int tpiu_enable(struct coresight_device *csdev, u32 mode)
 
 	tpiu_enable_hw(drvdata);
 
-	dev_info(drvdata->dev, "TPIU enabled\n");
+	dev_dbg(drvdata->dev, "TPIU enabled\n");
 	return 0;
 }
 
@@ -99,7 +99,7 @@ static void tpiu_disable(struct coresight_device *csdev)
 
 	tpiu_disable_hw(drvdata);
 
-	dev_info(drvdata->dev, "TPIU disabled\n");
+	dev_dbg(drvdata->dev, "TPIU disabled\n");
 }
 
 static const struct coresight_ops_sink tpiu_sink_ops = {
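As a usage note on the dynamic_debug mention in the changelog above (assuming a kernel built with CONFIG_DYNAMIC_DEBUG and a mounted debugfs), the demoted messages can be turned back on at runtime with something like:

	echo 'file drivers/hwtracing/coresight/* +p' > /sys/kernel/debug/dynamic_debug/control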
Track if the ETR is dma-coherent or not. This will be useful in deciding if we should use software buffering for perf.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-tmc.c | 5 ++++- drivers/hwtracing/coresight/coresight-tmc.h | 1 + 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc.c b/drivers/hwtracing/coresight/coresight-tmc.c
index 4939333cc6c7..5a8c41130f96 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.c
+++ b/drivers/hwtracing/coresight/coresight-tmc.c
@@ -347,6 +347,9 @@ static int tmc_etr_setup_caps(struct tmc_drvdata *drvdata,
 	if (!(devid & TMC_DEVID_NOSCAT))
 		tmc_etr_set_cap(drvdata, TMC_ETR_SG);
 
+	if (device_get_dma_attr(drvdata->dev) == DEV_DMA_COHERENT)
+		tmc_etr_set_cap(drvdata, TMC_ETR_COHERENT);
+
 	/* Check if the AXI address width is available */
 	if (devid & TMC_DEVID_AXIAW_VALID)
 		dma_mask = ((devid >> TMC_DEVID_AXIAW_SHIFT) &
@@ -397,7 +400,7 @@ static int tmc_probe(struct amba_device *adev, const struct amba_id *id)
 	if (!drvdata)
 		goto out;
 
-	drvdata->dev = &adev->dev;
+	drvdata->dev = dev;
 	dev_set_drvdata(dev, drvdata);
 
 	/* Validity for the resource is already checked by the AMBA core */
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 50ebc17c4645..69da0b584a6b 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -131,6 +131,7 @@ enum tmc_mem_intf_width {
  * so we have to rely on PID of the IP to detect the functionality.
  */
 #define TMC_ETR_SAVE_RESTORE	(0x1U << 2)
+#define TMC_ETR_COHERENT	(0x1U << 3)
 
 /* Coresight SoC-600 TMC-ETR unadvertised capabilities */
 #define CORESIGHT_SOC_600_ETR_CAPS	\
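Looking ahead to how the flag might be consumed (a sketch based on the cover letter's conditions for handing the perf ring buffer to the ETR directly; both helpers named below are hypothetical, and the snapshot-mode condition is elided):

	/* Only let the ETR write straight into the perf ring buffer when
	 * the device is dma-coherent and can scatter-gather. */
	if (tmc_etr_has_cap(drvdata, TMC_ETR_COHERENT) &&
	    tmc_etr_has_cap(drvdata, TMC_ETR_SG))
		use_perf_ring_buffer_directly();	/* hypothetical */
	else
		use_software_buffering();		/* hypothetical */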
On Thu, Oct 19, 2017 at 06:15:46PM +0100, Suzuki K Poulose wrote:
[..]
@@ -397,7 +400,7 @@ static int tmc_probe(struct amba_device *adev, const struct amba_id *id)
 	if (!drvdata)
 		goto out;
 
-	drvdata->dev = &adev->dev;
+	drvdata->dev = dev;
 	dev_set_drvdata(dev, drvdata);
What is that one for?
[..]
On 02/11/17 19:40, Mathieu Poirier wrote:
On Thu, Oct 19, 2017 at 06:15:46PM +0100, Suzuki K Poulose wrote:
[..]
-	drvdata->dev = &adev->dev;
+	drvdata->dev = dev;
What is that one for?
Oops, that was a minor cleanup and need not be part of this patch. I will leave things as they are. It is not worth a separate patch.
Cheers Suzuki
Since the ETR can be driven either from sysfs or from perf, it becomes complicated to manage the buffers used by each of these modes. The ETR driver cannot simply free the currently attached buffer without knowing its provider (i.e., sysfs vs perf).
To solve this issue, we provide:
1) the driver-mode specific etr buffer to be retained in the drvdata
2) the etr_buf for a session to be passed in when enabling the hardware, where it is stored in drvdata->etr_buf. It is replaced (not free'd) as soon as the hardware is disabled, after the necessary sync operation.
The advantages of this are:
1) The common code path doesn't need to worry about how to dispose of an existing buffer if it is about to start a new session with a different buffer, possibly in a different mode.
2) The driver mode can control its buffers and can get access to the saved session even when the hardware is operating in a different mode. (e.g., we can still access a trace buffer from sysfs mode even if the ETR is now used in perf mode, without disrupting the current session.)
Towards this, we introduce sysfs-specific data which will hold the etr_buf used for the sysfs mode of operation, controlled solely by the sysfs mode handling code.
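In code terms, the ownership contract being proposed reads roughly as below (an illustrative sketch, not the patch itself; the actual tmc_etr_enable_hw()/tmc_etr_disable_hw() changes follow in the diff):

	/*
	 * drvdata->etr_buf only tracks what the hardware is using right now;
	 * each mode (sysfs, perf) owns and disposes of its own etr_buf.
	 */
	static void sketch_etr_enable(struct tmc_drvdata *drvdata,
				      struct etr_buf *mode_buf)
	{
		WARN_ON(drvdata->etr_buf);	/* hardware must be idle */
		drvdata->etr_buf = mode_buf;	/* borrowed, not owned */
		/* ... program DBA/RSZ from mode_buf and enable the TMC ... */
	}

	static void sketch_etr_disable(struct tmc_drvdata *drvdata)
	{
		/* ... flush and stop, then sync the mode buffer ... */
		drvdata->etr_buf = NULL;	/* hand it back to the mode */
	}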
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-tmc-etr.c | 59 ++++++++++++++++--------- drivers/hwtracing/coresight/coresight-tmc.h | 2 + 2 files changed, 41 insertions(+), 20 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index f12b7c5f68b2..ef7498f05b34 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -961,11 +961,16 @@ static void tmc_sync_etr_buf(struct tmc_drvdata *drvdata) tmc_etr_buf_insert_barrier_packet(etr_buf, etr_buf->offset); }
-static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata) +static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata, + struct etr_buf *etr_buf) { u32 axictl, sts; - struct etr_buf *etr_buf = drvdata->etr_buf;
+ /* Callers should provide an appropriate buffer for use */ + if (WARN_ON(!etr_buf || drvdata->etr_buf)) + return; + + drvdata->etr_buf = etr_buf; /* Zero out the memory to help with debug */ memset(etr_buf->vaddr, 0, etr_buf->size);
@@ -1023,12 +1028,15 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata) * also updating the @bufpp on where to find it. Since the trace data * starts at anywhere in the buffer, depending on the RRP, we adjust the * @len returned to handle buffer wrapping around. + * + * We are protected here by drvdata->reading != 0, which ensures the + * sysfs_buf stays alive. */ ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata, loff_t pos, size_t len, char **bufpp) { s64 offset; - struct etr_buf *etr_buf = drvdata->etr_buf; + struct etr_buf *etr_buf = drvdata->sysfs_buf;
if (pos + len > etr_buf->len) len = etr_buf->len - pos; @@ -1058,7 +1066,14 @@ tmc_etr_free_sysfs_buf(struct etr_buf *buf)
static void tmc_etr_sync_sysfs_buf(struct tmc_drvdata *drvdata) { - tmc_sync_etr_buf(drvdata); + struct etr_buf *etr_buf = drvdata->etr_buf; + + if (WARN_ON(drvdata->sysfs_buf != etr_buf)) { + tmc_etr_free_sysfs_buf(drvdata->sysfs_buf); + drvdata->sysfs_buf = NULL; + } else { + tmc_sync_etr_buf(drvdata); + } }
static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata) @@ -1077,6 +1092,8 @@ static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata) tmc_disable_hw(drvdata);
CS_LOCK(drvdata->base); + /* Reset the ETR buf used by hardware */ + drvdata->etr_buf = NULL; }
static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) @@ -1085,7 +1102,7 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) bool used = false; unsigned long flags; struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); - struct etr_buf *new_buf = NULL, *free_buf = NULL; + struct etr_buf *sysfs_buf = NULL, *new_buf = NULL, *free_buf = NULL;
/* @@ -1097,7 +1114,8 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) * with the lock released. */ spin_lock_irqsave(&drvdata->spinlock, flags); - if (!drvdata->etr_buf || (drvdata->etr_buf->size != drvdata->size)) { + sysfs_buf = READ_ONCE(drvdata->sysfs_buf); + if (!sysfs_buf || (sysfs_buf->size != drvdata->size)) { spin_unlock_irqrestore(&drvdata->spinlock, flags); /* Allocate memory with the spinlock released */ free_buf = new_buf = tmc_etr_setup_sysfs_buf(drvdata); @@ -1125,15 +1143,16 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) * If we don't have a buffer or it doesn't match the requested size, * use the memory allocated above. Otherwise reuse it. */ - if (!drvdata->etr_buf || - (new_buf && drvdata->etr_buf->size != new_buf->size)) { + sysfs_buf = READ_ONCE(drvdata->sysfs_buf); + if (!sysfs_buf || + (new_buf && sysfs_buf->size != new_buf->size)) { used = true; - free_buf = drvdata->etr_buf; - drvdata->etr_buf = new_buf; + free_buf = sysfs_buf; + drvdata->sysfs_buf = new_buf; }
drvdata->mode = CS_MODE_SYSFS; - tmc_etr_enable_hw(drvdata); + tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf); out: spin_unlock_irqrestore(&drvdata->spinlock, flags);
@@ -1218,13 +1237,13 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata) goto out; }
- /* If drvdata::etr_buf is NULL the trace data has been read already */ - if (drvdata->etr_buf == NULL) { + /* If sysfs_buf is NULL the trace data has been read already */ + if (!drvdata->sysfs_buf) { ret = -EINVAL; goto out; }
- /* Disable the TMC if need be */ + /* Disable the TMC if we are trying to read from a running session */ if (drvdata->mode == CS_MODE_SYSFS) tmc_etr_disable_hw(drvdata);
@@ -1238,7 +1257,7 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata) int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata) { unsigned long flags; - struct etr_buf *etr_buf = NULL; + struct etr_buf *sysfs_buf = NULL;
/* config types are set a boot time and never change */ if (WARN_ON_ONCE(drvdata->config_type != TMC_CONFIG_TYPE_ETR)) @@ -1254,22 +1273,22 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata) * so we don't have to explicitly clear it. Also, since the * tracer is still enabled drvdata::buf can't be NULL. */ - tmc_etr_enable_hw(drvdata); + tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf); } else { /* * The ETR is not tracing and the buffer was just read. * As such prepare to free the trace buffer. */ - etr_buf = drvdata->etr_buf; - drvdata->etr_buf = NULL; + sysfs_buf = drvdata->sysfs_buf; + drvdata->sysfs_buf = NULL; }
drvdata->reading = false; spin_unlock_irqrestore(&drvdata->spinlock, flags);
/* Free allocated memory out side of the spinlock */ - if (etr_buf) - tmc_free_etr_buf(etr_buf); + if (sysfs_buf) + tmc_etr_free_sysfs_buf(sysfs_buf);
return 0; } diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h index 69da0b584a6b..14a3dec50b0f 100644 --- a/drivers/hwtracing/coresight/coresight-tmc.h +++ b/drivers/hwtracing/coresight/coresight-tmc.h @@ -185,6 +185,7 @@ struct etr_buf { * @trigger_cntr: amount of words to store after a trigger. * @etr_caps: Bitmask of capabilities of the TMC ETR, inferred from the * device configuration register (DEVID) + * @sysfs_data: SYSFS buffer for ETR. */ struct tmc_drvdata { void __iomem *base; @@ -204,6 +205,7 @@ struct tmc_drvdata { enum tmc_mem_intf_width memwidth; u32 trigger_cntr; u32 etr_caps; + struct etr_buf *sysfs_buf; };
struct etr_buf_operations {
On Thu, Oct 19, 2017 at 06:15:47PM +0100, Suzuki K Poulose wrote:
Since the ETR can be driven either from sysfs or from perf, it becomes complicated to manage the buffers used by each of these modes. The ETR driver cannot simply free the currently attached buffer without knowing its provider (i.e., sysfs vs perf).
To solve this issue, we provide:
- the driver-mode specific etr buffer to be retained in the drvdata
- the etr_buf for a session should be passed on when enabling the hardware, which will be stored in drvdata->etr_buf. This will be replaced (not free'd) as soon as the hardware is disabled, after necessary sync operation.
If I get you right the problem you're trying to solve is what to do with a sysFS buffer that hasn't been read (and freed) when a perf session is requested. In my opinion it should simply be freed. Indeed the user probably doesn't care much about that sysFS buffer, if it did the data would have been harvested.
[..]
On 02/11/17 20:26, Mathieu Poirier wrote:
On Thu, Oct 19, 2017 at 06:15:47PM +0100, Suzuki K Poulose wrote:
[..]
If I get you right the problem you're trying to solve is what to do with a sysFS buffer that hasn't been read (and freed) when a perf session is requested. In my opinion it should simply be freed. Indeed the user probably doesn't care much about that sysFS buffer, if it did the data would have been harvested.
Not only that. If we simply use drvdata->etr_buf, we cannot track the mode which uses it. If we keep the etr_buf around, how does the new mode user decide how to free the existing one? (e.g., the perf etr_buf could be associated with other perf data structures.) This change would allow us to leave the handling of the etr_buf to its respective mode.
And whether to keep the sysfs etr_buf around is a separate decision from the above.
Cheers Suzuki
On 3 November 2017 at 04:08, Suzuki K Poulose Suzuki.Poulose@arm.com wrote:
On 02/11/17 20:26, Mathieu Poirier wrote:
On Thu, Oct 19, 2017 at 06:15:47PM +0100, Suzuki K Poulose wrote:
Since the ETR could be driven either by SYSFS or by perf, it becomes complicated how we deal with the buffers used for each of these modes. The ETR driver cannot simply free the current attached buffer without knowing the provider (i.e, sysfs vs perf).
To solve this issue, we provide:
- the driver-mode specific etr buffer to be retained in the drvdata
- the etr_buf for a session should be passed on when enabling the hardware, which will be stored in drvdata->etr_buf. This will be replaced (not free'd) as soon as the hardware is disabled, after necessary sync operation.
If I get you right the problem you're trying to solve is what to do with a sysFS buffer that hasn't been read (and freed) when a perf session is requested. In my opinion it should simply be freed. Indeed the user probably doesn't care much about that sysFS buffer, if it did the data would have been harvested.
Not only that. If we simply use the drvdata->etr_buf, we cannot track the mode which uses it. If we keep the etr_buf around, how do the new mode user decide how to free the existing one ? (e.g, the perf etr_buf could be associated with other perf data structures). This change would allow us to leave the handling of the etr_buf to its respective modes.
struct etr_buf has a 'mode' and an '*ops', how is that not sufficient? I'll try to finish reviewing your patches today, maybe I'll find the answer later on...
And whether to keep the sysfs etr_buf around is a separate decision from the above.
Cheers Suzuki
Since the ETR now uses mode specific buffers, we can reliably provide the trace data captured in sysfs mode, even when the ETR is operating in PERF mode.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-tmc-etr.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index ef7498f05b34..31353fc34b53 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -1231,19 +1231,17 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata)
 		goto out;
 	}
 
-	/* Don't interfere if operated from Perf */
-	if (drvdata->mode == CS_MODE_PERF) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	/* If sysfs_buf is NULL the trace data has been read already */
+	/*
+	 * We can safely allow reads even if the ETR is operating in PERF mode,
+	 * since the sysfs session is captured in mode specific data.
+	 * If drvdata::sysfs_data is NULL the trace data has been read already.
+	 */
 	if (!drvdata->sysfs_buf) {
 		ret = -EINVAL;
 		goto out;
 	}
 
-	/* Disable the TMC if we are trying to read from a running session */
+	/* Disable the TMC if we are trying to read from a running session. */
 	if (drvdata->mode == CS_MODE_SYSFS)
 		tmc_etr_disable_hw(drvdata);
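As a usage illustration of what this enables (the device node name is platform specific; the "/dev/xyz.tmc" entry is documented in the tmc_drvdata header above), the sysfs-captured trace can now be pulled even while perf owns the ETR:

	dd if=/dev/<memory_map>.tmc of=trace.bin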
We zero out the entire trace buffer used for ETR before it is enabled, to help with debugging. Since we could be restoring a session in perf mode, this could destroy the data. Get rid of this step; if someone wants to debug, they can always add it back as and when needed.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-tmc-etr.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 31353fc34b53..849684f85443 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -971,8 +971,6 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata, return;
drvdata->etr_buf = etr_buf; - /* Zero out the memory to help with debug */ - memset(etr_buf->vaddr, 0, etr_buf->size);
CS_UNLOCK(drvdata->base);
@@ -1267,9 +1265,8 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata) if (drvdata->mode == CS_MODE_SYSFS) { /* * The trace run will continue with the same allocated trace - * buffer. The trace buffer is cleared in tmc_etr_enable_hw(), - * so we don't have to explicitly clear it. Also, since the - * tracer is still enabled drvdata::buf can't be NULL. + * buffer. Since the tracer is still enabled drvdata::buf can't + * be NULL. */ tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf); } else {
On Thu, Oct 19, 2017 at 06:15:49PM +0100, Suzuki K Poulose wrote:
We zero out the entire trace buffer used for ETR before it is enabled, to help with debugging. Since we could be restoring a session in perf mode, this could destroy the data.
I'm not sure I follow you with "... restoring a session in perf mode ...". When operating from the perf interface all the memory allocated for a session is cleaned up afterwards; there is no re-use of memory as in sysFS.
Get rid of this step; if someone wants to debug, they can always add it back as and when needed.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
drivers/hwtracing/coresight/coresight-tmc-etr.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 31353fc34b53..849684f85443 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -971,8 +971,6 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata, return; drvdata->etr_buf = etr_buf;
- /* Zero out the memory to help with debug */
- memset(etr_buf->vaddr, 0, etr_buf->size);
I agree, this can be costly when dealing with large areas of memory.
CS_UNLOCK(drvdata->base); @@ -1267,9 +1265,8 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata) if (drvdata->mode == CS_MODE_SYSFS) { /* * The trace run will continue with the same allocated trace
* buffer. The trace buffer is cleared in tmc_etr_enable_hw(),
* so we don't have to explicitly clear it. Also, since the
* tracer is still enabled drvdata::buf can't be NULL.
* buffer. Since the tracer is still enabled drvdata::buf can't
* be NULL. */ tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf); } else {
-- 2.13.6
On 02/11/17 20:36, Mathieu Poirier wrote:
On Thu, Oct 19, 2017 at 06:15:49PM +0100, Suzuki K Poulose wrote:
We zero out the entire trace buffer used for ETR before it is enabled, to help with debugging. Since we could be restoring a session in perf mode, this could destroy the data.
I'm not sure I follow you with "... restoring a session in perf mode ...". When operating from the perf interface all the memory allocated for a session is cleaned up afterwards; there is no re-use of memory as in sysFS.
We could use the perf ring buffer directly for the ETR. In that case, the perf ring buffer could contain trace data collected from the previous "schedule" which userspace hasn't collected yet. So doing a memset here would destroy that data.
Cheers Suzuki
Get rid of this step; if someone wants to debug, they can always add it back as and when needed.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
drivers/hwtracing/coresight/coresight-tmc-etr.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 31353fc34b53..849684f85443 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -971,8 +971,6 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata, return; drvdata->etr_buf = etr_buf;
- /* Zero out the memory to help with debug */
- memset(etr_buf->vaddr, 0, etr_buf->size);
I agree, this can be costly when dealing with large areas of memory.
CS_UNLOCK(drvdata->base); @@ -1267,9 +1265,8 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata) if (drvdata->mode == CS_MODE_SYSFS) { /* * The trace run will continue with the same allocated trace
* buffer. The trace buffer is cleared in tmc_etr_enable_hw(),
* so we don't have to explicitly clear it. Also, since the
* tracer is still enabled drvdata::buf can't be NULL.
* buffer. Since the tracer is still enabled drvdata::buf can't
* be NULL. */ tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf); } else {
-- 2.13.6
On 3 November 2017 at 04:10, Suzuki K Poulose Suzuki.Poulose@arm.com wrote:
On 02/11/17 20:36, Mathieu Poirier wrote:
On Thu, Oct 19, 2017 at 06:15:49PM +0100, Suzuki K Poulose wrote:
We zero out the entire trace buffer used for ETR before it is enabled, to help with debugging. Since we could be restoring a session in perf mode, this could destroy the data.
I'm not sure I follow you with "... restoring a session in perf mode ...". When operating from the perf interface all the memory allocated for a session is cleaned up afterwards; there is no re-use of memory as in sysFS.
We could use the perf ring buffer directly for the ETR. In that case, the perf ring buffer could contain trace data collected from the previous "schedule" which userspace hasn't collected yet. So doing a memset here would destroy that data.
I originally thought your comment was about re-using the memory from a previous trace session, hence the confusion. Please rework your changelog to include this clarification, as I am sure other people could be misled.
Cheers Suzuki
Get rid of this step; if someone wants to debug, they can always add it back as and when needed.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
drivers/hwtracing/coresight/coresight-tmc-etr.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 31353fc34b53..849684f85443 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -971,8 +971,6 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata, return; drvdata->etr_buf = etr_buf;
/* Zero out the memory to help with debug */
memset(etr_buf->vaddr, 0, etr_buf->size);
I agree, this can be costly when dealing with large areas of memory.
CS_UNLOCK(drvdata->base);
@@ -1267,9 +1265,8 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata) if (drvdata->mode == CS_MODE_SYSFS) { /* * The trace run will continue with the same allocated trace
* buffer. The trace buffer is cleared in tmc_etr_enable_hw(),
* so we don't have to explicitly clear it. Also, since the
* tracer is still enabled drvdata::buf can't be NULL.
* buffer. Since the tracer is still enabled drvdata::buf can't
* be NULL. */ tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf); } else {
-- 2.13.6
On 03/11/17 20:17, Mathieu Poirier wrote:
On 3 November 2017 at 04:10, Suzuki K Poulose Suzuki.Poulose@arm.com wrote:
On 02/11/17 20:36, Mathieu Poirier wrote:
On Thu, Oct 19, 2017 at 06:15:49PM +0100, Suzuki K Poulose wrote:
We zero out the entire trace buffer used for ETR before it is enabled, to help with debugging. Since we could be restoring a session in perf mode, this could destroy the data.
I'm not sure I follow you with "... restoring a session in perf mode ...". When operating from the perf interface all the memory allocated for a session is cleaned up afterwards; there is no re-use of memory as in sysFS.
We could use the perf ring buffer directly for the ETR. In that case, the perf ring buffer could contain trace data collected from the previous "schedule" which userspace hasn't collected yet. So doing a memset here would destroy that data.
I originally thought your comment was about re-using the memory from a previous trace session, hence the confusion. Please rework your changelog to include this clarification, as I am sure other people could be misled.
Sure, will do.
Thanks Suzuki
Add support for creating buffers which can be used in save-restore mode (e.g., for use by perf). If the TMC-ETR supports the save-restore feature, we can support the mode with all buffer backends. However, if it doesn't, we fall back to the in-built SG mechanism, where we can rotate the SG table by making some adjustments to the page table (the rotation trick is sketched after the diff below).
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-tmc-etr.c | 132 +++++++++++++++++++++++- drivers/hwtracing/coresight/coresight-tmc.h | 15 +++ 2 files changed, 143 insertions(+), 4 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 849684f85443..f8e654e1f5b2 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -590,7 +590,7 @@ tmc_etr_sg_table_index_to_daddr(struct tmc_sg_table *sg_table, u32 index) * 3) Update the hwaddr to point to the table pointer for the buffer * which starts at "base". */ -static int __maybe_unused +static int tmc_etr_sg_table_rotate(struct etr_sg_table *etr_table, u64 base_offset) { u32 last_entry, first_entry; @@ -700,6 +700,9 @@ static int tmc_etr_alloc_flat_buf(struct tmc_drvdata *drvdata, return -ENOMEM; etr_buf->vaddr = vaddr; etr_buf->hwaddr = paddr; + etr_buf->rrp = paddr; + etr_buf->rwp = paddr; + etr_buf->status = 0; etr_buf->mode = ETR_MODE_FLAT; etr_buf->private = drvdata; return 0; @@ -754,13 +757,19 @@ static int tmc_etr_alloc_sg_buf(struct tmc_drvdata *drvdata, void **pages) { struct etr_sg_table *etr_table; + struct tmc_sg_table *sg_table;
etr_table = tmc_init_etr_sg_table(drvdata->dev, node, etr_buf->size, pages); if (IS_ERR(etr_table)) return -ENOMEM; + sg_table = etr_table->sg_table; etr_buf->vaddr = tmc_sg_table_data_vaddr(etr_table->sg_table); etr_buf->hwaddr = etr_table->hwaddr; + /* TMC ETR SG automatically sets the RRP/RWP when enabled */ + etr_buf->rrp = etr_table->hwaddr; + etr_buf->rwp = etr_table->hwaddr; + etr_buf->status = 0; etr_buf->mode = ETR_MODE_ETR_SG; etr_buf->private = etr_table; return 0; @@ -816,11 +825,49 @@ static void tmc_etr_sync_sg_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp) tmc_sg_table_sync_data_range(table, r_offset, etr_buf->len); }
+static int tmc_etr_restore_sg_buf(struct etr_buf *etr_buf, + u64 r_offset, u64 w_offset, + u32 status, bool has_save_restore) +{ + int rc; + struct etr_sg_table *etr_table = etr_buf->private; + struct device *dev = etr_table->sg_table->dev; + + /* + * It is highly unlikely that we have an ETR with in-built SG and + * Save-Restore capability and we are not sure if the PTRs will + * be updated. + */ + if (has_save_restore) { + dev_warn_once(dev, + "Unexpected feature combination of SG and save-restore\n"); + return -EINVAL; + } + + /* + * Since we cannot program RRP/RWP different from DBAL, the offsets + * should match. + */ + if (r_offset != w_offset) { + dev_dbg(dev, "Mismatched RRP/RWP offsets\n"); + return -EINVAL; + } + + rc = tmc_etr_sg_table_rotate(etr_table, w_offset); + if (!rc) { + etr_buf->hwaddr = etr_table->hwaddr; + etr_buf->rrp = etr_table->hwaddr; + etr_buf->rwp = etr_table->hwaddr; + } + return rc; +} + static const struct etr_buf_operations etr_sg_buf_ops = { .alloc = tmc_etr_alloc_sg_buf, .free = tmc_etr_free_sg_buf, .sync = tmc_etr_sync_sg_buf, .get_data = tmc_etr_get_data_sg_buf, + .restore = tmc_etr_restore_sg_buf, };
static const struct etr_buf_operations *etr_buf_ops[] = { @@ -861,10 +908,42 @@ static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata, { int rc = -ENOMEM; bool has_etr_sg, has_iommu; + bool has_flat, has_save_restore; struct etr_buf *etr_buf;
has_etr_sg = tmc_etr_has_cap(drvdata, TMC_ETR_SG); has_iommu = iommu_get_domain_for_dev(drvdata->dev); + has_save_restore = tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE); + + /* + * We can normally use flat DMA buffer provided that the buffer + * is not used in save restore fashion without hardware support. + */ + has_flat = !(flags & ETR_BUF_F_RESTORE_PTRS) || has_save_restore; + + /* + * To support save-restore on a given ETR we have the following + * conditions: + * 1) If the buffer requires save-restore of the pointers as well + * as the Status bit, we require ETR support for it and we could + * support all the backends. + * 2) If the buffer requires only save-restore of pointers, then + * we could exploit a circular ETR SG list. None of the other + * backends can support it without the ETR feature. + * + * If the buffer will be used in a save-restore mode without + * the ETR support for SAVE_RESTORE, we can only support TMC + * ETR in-built SG tables which can be rotated to make it work. + */ + if ((flags & ETR_BUF_F_RESTORE_STATUS) && !has_save_restore) + return ERR_PTR(-EINVAL); + + if (!has_flat && !has_etr_sg) { + dev_dbg(drvdata->dev, + "No available backends for ETR buffer with flags %x\n", + flags); + return ERR_PTR(-EINVAL); + }
etr_buf = kzalloc(sizeof(*etr_buf), GFP_KERNEL); if (!etr_buf) @@ -883,7 +962,7 @@ static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata, * Fallback to available mechanisms. * */ - if (!pages && + if (!pages && has_flat && (!has_etr_sg || has_iommu || size < SZ_1M)) rc = tmc_etr_mode_alloc_buf(ETR_MODE_FLAT, drvdata, etr_buf, node, pages); @@ -961,6 +1040,51 @@ static void tmc_sync_etr_buf(struct tmc_drvdata *drvdata) tmc_etr_buf_insert_barrier_packet(etr_buf, etr_buf->offset); }
+/* + * tmc_etr_restore_generic: Common helper to restore the buffer + * status for FLAT buffers, which use linear TMC ETR address range. + * This is only possible with in-built ETR capability to save-restore + * the pointers. The DBA will still point to the original start of the + * buffer. + */ +static int tmc_etr_buf_generic_restore(struct etr_buf *etr_buf, + u64 r_offset, u64 w_offset, + u32 status, bool has_save_restore) +{ + u64 size = etr_buf->size; + + if (!has_save_restore) + return -EINVAL; + etr_buf->rrp = etr_buf->hwaddr + (r_offset % size); + etr_buf->rwp = etr_buf->hwaddr + (w_offset % size); + etr_buf->status = status; + return 0; +} + +static int __maybe_unused +tmc_restore_etr_buf(struct tmc_drvdata *drvdata, struct etr_buf *etr_buf, + u64 r_offset, u64 w_offset, u32 status) +{ + bool has_save_restore = tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE); + + if (WARN_ON_ONCE(!has_save_restore && etr_buf->mode != ETR_MODE_ETR_SG)) + return -EINVAL; + /* + * If we use a circular SG list without ETR support, we can't + * support restoring "Full" bit. + */ + if (WARN_ON_ONCE(!has_save_restore && status)) + return -EINVAL; + if (status & ~TMC_STS_FULL) + return -EINVAL; + if (etr_buf->ops->restore) + return etr_buf->ops->restore(etr_buf, r_offset, w_offset, + status, has_save_restore); + else + return tmc_etr_buf_generic_restore(etr_buf, r_offset, w_offset, + status, has_save_restore); +} + static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata, struct etr_buf *etr_buf) { @@ -1004,8 +1128,8 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata, * STS to "not full"). */ if (tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE)) { - tmc_write_rrp(drvdata, etr_buf->hwaddr); - tmc_write_rwp(drvdata, etr_buf->hwaddr); + tmc_write_rrp(drvdata, etr_buf->rrp); + tmc_write_rwp(drvdata, etr_buf->rwp); sts = readl_relaxed(drvdata->base + TMC_STS) & ~TMC_STS_FULL; writel_relaxed(sts, drvdata->base + TMC_STS); } diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h index 14a3dec50b0f..2c5b905b6494 100644 --- a/drivers/hwtracing/coresight/coresight-tmc.h +++ b/drivers/hwtracing/coresight/coresight-tmc.h @@ -142,12 +142,22 @@ enum etr_mode { ETR_MODE_ETR_SG, /* Uses in-built TMC ETR SG mechanism */ };
+/* ETR buffer should support save-restore */ +#define ETR_BUF_F_RESTORE_PTRS 0x1 +#define ETR_BUF_F_RESTORE_STATUS 0x2 + +#define ETR_BUF_F_RESTORE_MINIMAL ETR_BUF_F_RESTORE_PTRS +#define ETR_BUF_F_RESTORE_FULL (ETR_BUF_F_RESTORE_PTRS |\ + ETR_BUF_F_RESTORE_STATUS) struct etr_buf_operations;
/** * struct etr_buf - Details of the buffer used by ETR * @mode : Mode of the ETR buffer, contiguous, Scatter Gather etc. * @full : Trace data overflow + * @status : Value for STATUS if the ETR supports save-restore. + * @rrp : Value for RRP{LO:HI} if the ETR supports save-restore + * @rwp : Value for RWP{LO:HI} if the ETR supports save-restore * @size : Size of the buffer. * @hwaddr : Address to be programmed in the TMC:DBA{LO,HI} * @vaddr : Virtual address of the buffer used for trace. @@ -159,6 +169,9 @@ struct etr_buf_operations; struct etr_buf { enum etr_mode mode; bool full; + u32 status; + dma_addr_t rrp; + dma_addr_t rwp; ssize_t size; dma_addr_t hwaddr; void *vaddr; @@ -212,6 +225,8 @@ struct etr_buf_operations { int (*alloc)(struct tmc_drvdata *drvdata, struct etr_buf *etr_buf, int node, void **pages); void (*sync)(struct etr_buf *etr_buf, u64 rrp, u64 rwp); + int (*restore)(struct etr_buf *etr_buf, u64 r_offset, + u64 w_offset, u32 status, bool has_save_restore); ssize_t (*get_data)(struct etr_buf *etr_buf, u64 offset, size_t len, char **bufpp); void (*free)(struct etr_buf *etr_buf);
On Thu, Oct 19, 2017 at 06:15:50PM +0100, Suzuki K Poulose wrote:
Add support for creating buffers which can be used in save-restore mode (e.g., for use by perf). If the TMC-ETR supports the save-restore feature, we can support the mode with all buffer backends. However,
Instead of using the term backend simply write contiguous buffer or SG mode. It is a lot less cryptic that way.
if it doesn't, we fall back to the in-built SG mechanism, where we can rotate the SG table by making some adjustments to the page table.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
drivers/hwtracing/coresight/coresight-tmc-etr.c | 132 +++++++++++++++++++++++- drivers/hwtracing/coresight/coresight-tmc.h | 15 +++ 2 files changed, 143 insertions(+), 4 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 849684f85443..f8e654e1f5b2 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -590,7 +590,7 @@ tmc_etr_sg_table_index_to_daddr(struct tmc_sg_table *sg_table, u32 index)
- Update the hwaddr to point to the table pointer for the buffer
- which starts at "base".
*/ -static int __maybe_unused +static int tmc_etr_sg_table_rotate(struct etr_sg_table *etr_table, u64 base_offset) { u32 last_entry, first_entry; @@ -700,6 +700,9 @@ static int tmc_etr_alloc_flat_buf(struct tmc_drvdata *drvdata, return -ENOMEM; etr_buf->vaddr = vaddr; etr_buf->hwaddr = paddr;
- etr_buf->rrp = paddr;
- etr_buf->rwp = paddr;
- etr_buf->status = 0; etr_buf->mode = ETR_MODE_FLAT; etr_buf->private = drvdata; return 0;
@@ -754,13 +757,19 @@ static int tmc_etr_alloc_sg_buf(struct tmc_drvdata *drvdata, void **pages) { struct etr_sg_table *etr_table;
- struct tmc_sg_table *sg_table;
etr_table = tmc_init_etr_sg_table(drvdata->dev, node, etr_buf->size, pages); if (IS_ERR(etr_table)) return -ENOMEM;
- sg_table = etr_table->sg_table;
As far as I can tell this doesn't do anything.
etr_buf->vaddr = tmc_sg_table_data_vaddr(etr_table->sg_table); etr_buf->hwaddr = etr_table->hwaddr;
- /* TMC ETR SG automatically sets the RRP/RWP when enabled */
If the TMC ETR SG automatically sets the RRP/RWP, why explicitly set it?
- etr_buf->rrp = etr_table->hwaddr;
- etr_buf->rwp = etr_table->hwaddr;
- etr_buf->status = 0; etr_buf->mode = ETR_MODE_ETR_SG; etr_buf->private = etr_table; return 0;
@@ -816,11 +825,49 @@ static void tmc_etr_sync_sg_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp) tmc_sg_table_sync_data_range(table, r_offset, etr_buf->len); } +static int tmc_etr_restore_sg_buf(struct etr_buf *etr_buf,
u64 r_offset, u64 w_offset,
u32 status, bool has_save_restore)
Indentation
+{
- int rc;
- struct etr_sg_table *etr_table = etr_buf->private;
- struct device *dev = etr_table->sg_table->dev;
- /*
* It is highly unlikely that we have an ETR with in-built SG and
* Save-Restore capability and we are not sure if the PTRs will
* be updated.
*/
- if (has_save_restore) {
dev_warn_once(dev,
"Unexpected feature combination of SG and save-restore\n");
return -EINVAL;
- }
- /*
* Since we cannot program RRP/RWP different from DBAL, the offsets
* should match.
*/
- if (r_offset != w_offset) {
dev_dbg(dev, "Mismatched RRP/RWP offsets\n");
return -EINVAL;
- }
- rc = tmc_etr_sg_table_rotate(etr_table, w_offset);
- if (!rc) {
etr_buf->hwaddr = etr_table->hwaddr;
etr_buf->rrp = etr_table->hwaddr;
etr_buf->rwp = etr_table->hwaddr;
- }
- return rc;
+}
static const struct etr_buf_operations etr_sg_buf_ops = { .alloc = tmc_etr_alloc_sg_buf, .free = tmc_etr_free_sg_buf, .sync = tmc_etr_sync_sg_buf, .get_data = tmc_etr_get_data_sg_buf,
- .restore = tmc_etr_restore_sg_buf,
}; static const struct etr_buf_operations *etr_buf_ops[] = { @@ -861,10 +908,42 @@ static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata, { int rc = -ENOMEM; bool has_etr_sg, has_iommu;
- bool has_flat, has_save_restore; struct etr_buf *etr_buf;
has_etr_sg = tmc_etr_has_cap(drvdata, TMC_ETR_SG); has_iommu = iommu_get_domain_for_dev(drvdata->dev);
- has_save_restore = tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE);
- /*
* We can normally use flat DMA buffer provided that the buffer
* is not used in save restore fashion without hardware support.
*/
- has_flat = !(flags & ETR_BUF_F_RESTORE_PTRS) || has_save_restore;
- /*
* To support save-restore on a given ETR we have the following
* conditions:
* 1) If the buffer requires save-restore of the pointers as well
* as the Status bit, we require ETR support for it and we could
* support all the backends.
* 2) If the buffer requires only save-restore of pointers, then
* we could exploit a circular ETR SG list. None of the other
* backends can support it without the ETR feature.
*
* If the buffer will be used in a save-restore mode without
* the ETR support for SAVE_RESTORE, we can only support TMC
* ETR in-built SG tables which can be rotated to make it work.
*/
- if ((flags & ETR_BUF_F_RESTORE_STATUS) && !has_save_restore)
return ERR_PTR(-EINVAL);
- if (!has_flat && !has_etr_sg) {
dev_dbg(drvdata->dev,
"No available backends for ETR buffer with flags %x\n",
flags);
return ERR_PTR(-EINVAL);
- }
etr_buf = kzalloc(sizeof(*etr_buf), GFP_KERNEL); if (!etr_buf) @@ -883,7 +962,7 @@ static struct etr_buf *tmc_alloc_etr_buf(struct tmc_drvdata *drvdata, * Fallback to available mechanisms. * */
- if (!pages &&
- if (!pages && has_flat && (!has_etr_sg || has_iommu || size < SZ_1M)) rc = tmc_etr_mode_alloc_buf(ETR_MODE_FLAT, drvdata, etr_buf, node, pages);
@@ -961,6 +1040,51 @@ static void tmc_sync_etr_buf(struct tmc_drvdata *drvdata) tmc_etr_buf_insert_barrier_packet(etr_buf, etr_buf->offset); } +/*
- tmc_etr_restore_generic: Common helper to restore the buffer
- status for FLAT buffers, which use linear TMC ETR address range.
- This is only possible with in-built ETR capability to save-restore
- the pointers. The DBA will still point to the original start of the
- buffer.
- */
+static int tmc_etr_buf_generic_restore(struct etr_buf *etr_buf,
u64 r_offset, u64 w_offset,
u32 status, bool has_save_restore)
Indentation
+{
- u64 size = etr_buf->size;
- if (!has_save_restore)
return -EINVAL;
- etr_buf->rrp = etr_buf->hwaddr + (r_offset % size);
- etr_buf->rwp = etr_buf->hwaddr + (w_offset % size);
- etr_buf->status = status;
- return 0;
+}
+static int __maybe_unused +tmc_restore_etr_buf(struct tmc_drvdata *drvdata, struct etr_buf *etr_buf,
u64 r_offset, u64 w_offset, u32 status)
+{
- bool has_save_restore = tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE);
- if (WARN_ON_ONCE(!has_save_restore && etr_buf->mode != ETR_MODE_ETR_SG))
return -EINVAL;
- /*
* If we use a circular SG list without ETR support, we can't
* support restoring "Full" bit.
*/
- if (WARN_ON_ONCE(!has_save_restore && status))
return -EINVAL;
- if (status & ~TMC_STS_FULL)
return -EINVAL;
- if (etr_buf->ops->restore)
return etr_buf->ops->restore(etr_buf, r_offset, w_offset,
status, has_save_restore);
- else
return tmc_etr_buf_generic_restore(etr_buf, r_offset, w_offset,
status, has_save_restore);
+}
static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata, struct etr_buf *etr_buf) { @@ -1004,8 +1128,8 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata, * STS to "not full"). */ if (tmc_etr_has_cap(drvdata, TMC_ETR_SAVE_RESTORE)) {
tmc_write_rrp(drvdata, etr_buf->hwaddr);
tmc_write_rwp(drvdata, etr_buf->hwaddr);
tmc_write_rrp(drvdata, etr_buf->rrp);
tmc_write_rwp(drvdata, etr_buf->rwp);
sts = readl_relaxed(drvdata->base + TMC_STS) & ~TMC_STS_FULL; writel_relaxed(sts, drvdata->base + TMC_STS); }
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h index 14a3dec50b0f..2c5b905b6494 100644 --- a/drivers/hwtracing/coresight/coresight-tmc.h +++ b/drivers/hwtracing/coresight/coresight-tmc.h @@ -142,12 +142,22 @@ enum etr_mode { ETR_MODE_ETR_SG, /* Uses in-built TMC ETR SG mechanism */ }; +/* ETR buffer should support save-restore */ +#define ETR_BUF_F_RESTORE_PTRS 0x1 +#define ETR_BUF_F_RESTORE_STATUS 0x2
+#define ETR_BUF_F_RESTORE_MINIMAL ETR_BUF_F_RESTORE_PTRS +#define ETR_BUF_F_RESTORE_FULL (ETR_BUF_F_RESTORE_PTRS |\
ETR_BUF_F_RESTORE_STATUS)
struct etr_buf_operations; /**
- struct etr_buf - Details of the buffer used by ETR
- @mode : Mode of the ETR buffer, contiguous, Scatter Gather etc.
- @full : Trace data overflow
- @status : Value for STATUS if the ETR supports save-restore.
- @rrp : Value for RRP{LO:HI} if the ETR supports save-restore
- @rwp : Value for RWP{LO:HI} if the ETR supports save-restore
- @size : Size of the buffer.
- @hwaddr : Address to be programmed in the TMC:DBA{LO,HI}
- @vaddr : Virtual address of the buffer used for trace.
@@ -159,6 +169,9 @@ struct etr_buf_operations; struct etr_buf { enum etr_mode mode; bool full;
- u32 status;
- dma_addr_t rrp;
- dma_addr_t rwp; ssize_t size; dma_addr_t hwaddr; void *vaddr;
@@ -212,6 +225,8 @@ struct etr_buf_operations { int (*alloc)(struct tmc_drvdata *drvdata, struct etr_buf *etr_buf, int node, void **pages); void (*sync)(struct etr_buf *etr_buf, u64 rrp, u64 rwp);
- int (*restore)(struct etr_buf *etr_buf, u64 r_offset,
u64 w_offset, u32 status, bool has_save_restore); ssize_t (*get_data)(struct etr_buf *etr_buf, u64 offset, size_t len, char **bufpp); void (*free)(struct etr_buf *etr_buf);
-- 2.13.6
This patch adds a helper to insert barrier packets for a given size (aligned to the packet size) at a given offset in an etr_buf. This will be used later for perf mode when we try to start in the middle of an SG buffer (a usage sketch follows the diff).
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-tmc-etr.c | 52 ++++++++++++++++++++++--- 1 file changed, 46 insertions(+), 6 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index f8e654e1f5b2..229c36b7266c 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -999,18 +999,58 @@ static ssize_t tmc_etr_buf_get_data(struct etr_buf *etr_buf, return etr_buf->ops->get_data(etr_buf, (u64)offset, len, bufpp); }
+/* + * tmc_etr_buf_insert_barrier_packets : Insert barrier packets at @offset upto + * @size of bytes in the given buffer. @size should be aligned to the barrier + * packet size. + * + * Returns the new @offset after filling the barriers on success. Otherwise + * returns error. + */ static inline s64 -tmc_etr_buf_insert_barrier_packet(struct etr_buf *etr_buf, u64 offset) +tmc_etr_buf_insert_barrier_packets(struct etr_buf *etr_buf, + u64 offset, u64 size) { ssize_t len; char *bufp;
- len = tmc_etr_buf_get_data(etr_buf, offset, - CORESIGHT_BARRIER_PKT_SIZE, &bufp); - if (WARN_ON(len <= CORESIGHT_BARRIER_PKT_SIZE)) + if ((size % CORESIGHT_BARRIER_PKT_SIZE) || + (offset % CORESIGHT_BARRIER_PKT_SIZE)) return -EINVAL; - coresight_insert_barrier_packet(bufp); - return offset + CORESIGHT_BARRIER_PKT_SIZE; + do { + len = tmc_etr_buf_get_data(etr_buf, offset, size, &bufp); + if (WARN_ON(len <= 0)) + return -EINVAL; + /* + * We are guaranteed that @bufp will point to a linear range + * of @len bytes, where @len <= @size. + */ + size -= len; + offset += len; + while (len >= CORESIGHT_BARRIER_PKT_SIZE) { + coresight_insert_barrier_packet(bufp); + bufp += CORESIGHT_BARRIER_PKT_SIZE; + len -= CORESIGHT_BARRIER_PKT_SIZE; + } + + /* + * Normally we shouldn't have any left over here, as the trace + * should always be aligned to ETR Frame size. + */ + WARN_ON(len); + /* If we reached the end of the buffer, wrap around */ + if (offset == etr_buf->size) + offset -= etr_buf->size; + } while (size); + + return offset; +} + +static inline s64 +tmc_etr_buf_insert_barrier_packet(struct etr_buf *etr_buf, u64 offset) +{ + return tmc_etr_buf_insert_barrier_packets(etr_buf, offset, + CORESIGHT_BARRIER_PKT_SIZE); }
/*
Right now we issue update_buffer() and reset_buffer() callbacks in succession when we stop tracing an event. update_buffer() is supposed to check the status of the buffer and make sure the ring buffer is updated with the trace data, and we store information about the size of the data collected only for it to be consumed by the reset_buffer() callback, which always follows update_buffer(). This patch gets rid of the reset_buffer callback altogether and performs its actions in update_buffer(), making it return the size collected.
This removes a not-so-pretty hack (storing the new head in the size field for snapshot mode) and cleans things up a little bit.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-etb10.c | 56 +++++------------------ drivers/hwtracing/coresight/coresight-etm-perf.c | 9 +--- drivers/hwtracing/coresight/coresight-tmc-etf.c | 58 +++++------------------- include/linux/coresight.h | 5 +- 4 files changed, 26 insertions(+), 102 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 757f556975f7..75c5699000b0 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -323,37 +323,7 @@ static int etb_set_buffer(struct coresight_device *csdev, return ret; }
-static unsigned long etb_reset_buffer(struct coresight_device *csdev, - struct perf_output_handle *handle, - void *sink_config) -{ - unsigned long size = 0; - struct cs_buffers *buf = sink_config; - - if (buf) { - /* - * In snapshot mode ->data_size holds the new address of the - * ring buffer's head. The size itself is the whole address - * range since we want the latest information. - */ - if (buf->snapshot) - handle->head = local_xchg(&buf->data_size, - buf->nr_pages << PAGE_SHIFT); - - /* - * Tell the tracer PMU how much we got in this run and if - * something went wrong along the way. Nobody else can use - * this cs_buffers instance until we are done. As such - * resetting parameters here and squaring off with the ring - * buffer API in the tracer PMU is fine. - */ - size = local_xchg(&buf->data_size, 0); - } - - return size; -} - -static void etb_update_buffer(struct coresight_device *csdev, +static unsigned long etb_update_buffer(struct coresight_device *csdev, struct perf_output_handle *handle, void *sink_config) { @@ -362,13 +332,13 @@ static void etb_update_buffer(struct coresight_device *csdev, u8 *buf_ptr; const u32 *barrier; u32 read_ptr, write_ptr, capacity; - u32 status, read_data, to_read; - unsigned long offset; + u32 status, read_data; + unsigned long offset, to_read; struct cs_buffers *buf = sink_config; struct etb_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
if (!buf) - return; + return 0;
capacity = drvdata->buffer_depth * ETB_FRAME_SIZE_WORDS;
@@ -473,18 +443,17 @@ static void etb_update_buffer(struct coresight_device *csdev, writel_relaxed(0x0, drvdata->base + ETB_RAM_WRITE_POINTER);
/* - * In snapshot mode all we have to do is communicate to - * perf_aux_output_end() the address of the current head. In full - * trace mode the same function expects a size to move rb->aux_head - * forward. + * In snapshot mode we have to update the handle->head to point + * to the new location. */ - if (buf->snapshot) - local_set(&buf->data_size, (cur * PAGE_SIZE) + offset); - else - local_add(to_read, &buf->data_size); - + if (buf->snapshot) { + handle->head = (cur * PAGE_SIZE) + offset; + to_read = buf->nr_pages << PAGE_SHIFT; + } etb_enable_hw(drvdata); CS_LOCK(drvdata->base); + + return to_read; }
static const struct coresight_ops_sink etb_sink_ops = { @@ -493,7 +462,6 @@ static const struct coresight_ops_sink etb_sink_ops = { .alloc_buffer = etb_alloc_buffer, .free_buffer = etb_free_buffer, .set_buffer = etb_set_buffer, - .reset_buffer = etb_reset_buffer, .update_buffer = etb_update_buffer, };
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index 8a0ad77574e7..e5f9567c87c4 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -342,15 +342,8 @@ static void etm_event_stop(struct perf_event *event, int mode) if (!sink_ops(sink)->update_buffer) return;
- sink_ops(sink)->update_buffer(sink, handle, + size = sink_ops(sink)->update_buffer(sink, handle, event_data->snk_config); - - if (!sink_ops(sink)->reset_buffer) - return; - - size = sink_ops(sink)->reset_buffer(sink, handle, - event_data->snk_config); - perf_aux_output_end(handle, size); }
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index aa4e8f03ef49..073198e7b46e 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -358,36 +358,7 @@ static int tmc_set_etf_buffer(struct coresight_device *csdev, return ret; }
-static unsigned long tmc_reset_etf_buffer(struct coresight_device *csdev, - struct perf_output_handle *handle, - void *sink_config) -{ - long size = 0; - struct cs_buffers *buf = sink_config; - - if (buf) { - /* - * In snapshot mode ->data_size holds the new address of the - * ring buffer's head. The size itself is the whole address - * range since we want the latest information. - */ - if (buf->snapshot) - handle->head = local_xchg(&buf->data_size, - buf->nr_pages << PAGE_SHIFT); - /* - * Tell the tracer PMU how much we got in this run and if - * something went wrong along the way. Nobody else can use - * this cs_buffers instance until we are done. As such - * resetting parameters here and squaring off with the ring - * buffer API in the tracer PMU is fine. - */ - size = local_xchg(&buf->data_size, 0); - } - - return size; -} - -static void tmc_update_etf_buffer(struct coresight_device *csdev, +static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, struct perf_output_handle *handle, void *sink_config) { @@ -396,17 +367,17 @@ static void tmc_update_etf_buffer(struct coresight_device *csdev, const u32 *barrier; u32 *buf_ptr; u64 read_ptr, write_ptr; - u32 status, to_read; - unsigned long offset; + u32 status; + unsigned long offset, to_read; struct cs_buffers *buf = sink_config; struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
if (!buf) - return; + return 0;
/* This shouldn't happen */ if (WARN_ON_ONCE(drvdata->mode != CS_MODE_PERF)) - return; + return 0;
CS_UNLOCK(drvdata->base);
@@ -495,18 +466,14 @@ static void tmc_update_etf_buffer(struct coresight_device *csdev, } }
- /* - * In snapshot mode all we have to do is communicate to - * perf_aux_output_end() the address of the current head. In full - * trace mode the same function expects a size to move rb->aux_head - * forward. - */ - if (buf->snapshot) - local_set(&buf->data_size, (cur * PAGE_SIZE) + offset); - else - local_add(to_read, &buf->data_size); - + /* In snapshot mode we have to update the head */ + if (buf->snapshot) { + handle->head = (cur * PAGE_SIZE) + offset; + to_read = buf->nr_pages << PAGE_SHIFT; + } CS_LOCK(drvdata->base); + + return to_read; }
static const struct coresight_ops_sink tmc_etf_sink_ops = { @@ -515,7 +482,6 @@ static const struct coresight_ops_sink tmc_etf_sink_ops = { .alloc_buffer = tmc_alloc_etf_buffer, .free_buffer = tmc_free_etf_buffer, .set_buffer = tmc_set_etf_buffer, - .reset_buffer = tmc_reset_etf_buffer, .update_buffer = tmc_update_etf_buffer, };
diff --git a/include/linux/coresight.h b/include/linux/coresight.h index d950dad5056a..5c9e5fe2bf32 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -199,10 +199,7 @@ struct coresight_ops_sink { int (*set_buffer)(struct coresight_device *csdev, struct perf_output_handle *handle, void *sink_config); - unsigned long (*reset_buffer)(struct coresight_device *csdev, - struct perf_output_handle *handle, - void *sink_config); - void (*update_buffer)(struct coresight_device *csdev, + unsigned long (*update_buffer)(struct coresight_device *csdev, struct perf_output_handle *handle, void *sink_config); };
On Thu, Oct 19, 2017 at 06:15:52PM +0100, Suzuki K Poulose wrote:
Right now we issue update_buffer() and reset_buffer() callbacks in succession when we stop tracing an event. update_buffer() is supposed to check the status of the buffer and make sure the ring buffer is updated with the trace data, and we store information about the size of the data collected only for it to be consumed by the reset_buffer() callback, which always follows update_buffer(). This patch gets rid of the reset_buffer callback altogether and performs its actions in update_buffer(), making it return the size collected.
This removes a not-so-pretty hack (storing the new head in the size field for snapshot mode) and cleans things up a little bit.
The idea behind splitting the update and reset operations was to seamlessly support sinks that generate an interrupt when their buffer gets full. Those are coming, and when we do need to support them we'll find ourselves splitting the update and reset operations again.
See comment below.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
drivers/hwtracing/coresight/coresight-etb10.c | 56 +++++------------------ drivers/hwtracing/coresight/coresight-etm-perf.c | 9 +--- drivers/hwtracing/coresight/coresight-tmc-etf.c | 58 +++++------------------- include/linux/coresight.h | 5 +- 4 files changed, 26 insertions(+), 102 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 757f556975f7..75c5699000b0 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -323,37 +323,7 @@ static int etb_set_buffer(struct coresight_device *csdev, return ret; } -static unsigned long etb_reset_buffer(struct coresight_device *csdev,
struct perf_output_handle *handle,
void *sink_config)
-{
- unsigned long size = 0;
- struct cs_buffers *buf = sink_config;
- if (buf) {
/*
* In snapshot mode ->data_size holds the new address of the
* ring buffer's head. The size itself is the whole address
* range since we want the latest information.
*/
if (buf->snapshot)
handle->head = local_xchg(&buf->data_size,
buf->nr_pages << PAGE_SHIFT);
/*
* Tell the tracer PMU how much we got in this run and if
* something went wrong along the way. Nobody else can use
* this cs_buffers instance until we are done. As such
* resetting parameters here and squaring off with the ring
* buffer API in the tracer PMU is fine.
*/
size = local_xchg(&buf->data_size, 0);
- }
- return size;
-}
-static void etb_update_buffer(struct coresight_device *csdev, +static unsigned long etb_update_buffer(struct coresight_device *csdev, struct perf_output_handle *handle, void *sink_config) { @@ -362,13 +332,13 @@ static void etb_update_buffer(struct coresight_device *csdev, u8 *buf_ptr; const u32 *barrier; u32 read_ptr, write_ptr, capacity;
- u32 status, read_data, to_read;
- unsigned long offset;
- u32 status, read_data;
- unsigned long offset, to_read; struct cs_buffers *buf = sink_config; struct etb_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
if (!buf)
return;
return 0;
capacity = drvdata->buffer_depth * ETB_FRAME_SIZE_WORDS; @@ -473,18 +443,17 @@ static void etb_update_buffer(struct coresight_device *csdev, writel_relaxed(0x0, drvdata->base + ETB_RAM_WRITE_POINTER); /*
* In snapshot mode all we have to do is communicate to
* perf_aux_output_end() the address of the current head. In full
* trace mode the same function expects a size to move rb->aux_head
* forward.
* In snapshot mode we have to update the handle->head to point
* to the new location. */
- if (buf->snapshot)
local_set(&buf->data_size, (cur * PAGE_SIZE) + offset);
- else
local_add(to_read, &buf->data_size);
- if (buf->snapshot) {
handle->head = (cur * PAGE_SIZE) + offset;
to_read = buf->nr_pages << PAGE_SHIFT;
- } etb_enable_hw(drvdata); CS_LOCK(drvdata->base);
- return to_read;
} static const struct coresight_ops_sink etb_sink_ops = { @@ -493,7 +462,6 @@ static const struct coresight_ops_sink etb_sink_ops = { .alloc_buffer = etb_alloc_buffer, .free_buffer = etb_free_buffer, .set_buffer = etb_set_buffer,
- .reset_buffer = etb_reset_buffer, .update_buffer = etb_update_buffer,
}; diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index 8a0ad77574e7..e5f9567c87c4 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -342,15 +342,8 @@ static void etm_event_stop(struct perf_event *event, int mode) if (!sink_ops(sink)->update_buffer) return;
sink_ops(sink)->update_buffer(sink, handle,
size = sink_ops(sink)->update_buffer(sink, handle, event_data->snk_config);
if (!sink_ops(sink)->reset_buffer)
return;
size = sink_ops(sink)->reset_buffer(sink, handle,
event_data->snk_config);
For the current sinks we support, i.e. those that don't generate interrupts when their buffer is full, I'm in agreement with your work. I suggest you don't touch the current implementation of etm_event_stop() and move everything you've done to the reset operation. The end result is the same and we won't have to rework (again) etm_event_stop() when we need to support IPs that do send interrupts.
- perf_aux_output_end(handle, size); }
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index aa4e8f03ef49..073198e7b46e 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -358,36 +358,7 @@ static int tmc_set_etf_buffer(struct coresight_device *csdev, return ret; } -static unsigned long tmc_reset_etf_buffer(struct coresight_device *csdev,
struct perf_output_handle *handle,
void *sink_config)
-{
- long size = 0;
- struct cs_buffers *buf = sink_config;
- if (buf) {
/*
* In snapshot mode ->data_size holds the new address of the
* ring buffer's head. The size itself is the whole address
* range since we want the latest information.
*/
if (buf->snapshot)
handle->head = local_xchg(&buf->data_size,
buf->nr_pages << PAGE_SHIFT);
/*
* Tell the tracer PMU how much we got in this run and if
* something went wrong along the way. Nobody else can use
* this cs_buffers instance until we are done. As such
* resetting parameters here and squaring off with the ring
* buffer API in the tracer PMU is fine.
*/
size = local_xchg(&buf->data_size, 0);
- }
- return size;
-}
-static void tmc_update_etf_buffer(struct coresight_device *csdev, +static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, struct perf_output_handle *handle, void *sink_config) { @@ -396,17 +367,17 @@ static void tmc_update_etf_buffer(struct coresight_device *csdev, const u32 *barrier; u32 *buf_ptr; u64 read_ptr, write_ptr;
- u32 status, to_read;
- unsigned long offset;
- u32 status;
- unsigned long offset, to_read; struct cs_buffers *buf = sink_config; struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
if (!buf)
return;
return 0;
/* This shouldn't happen */ if (WARN_ON_ONCE(drvdata->mode != CS_MODE_PERF))
return;
return 0;
CS_UNLOCK(drvdata->base); @@ -495,18 +466,14 @@ static void tmc_update_etf_buffer(struct coresight_device *csdev, } }
- /*
* In snapshot mode all we have to do is communicate to
* perf_aux_output_end() the address of the current head. In full
* trace mode the same function expects a size to move rb->aux_head
* forward.
*/
- if (buf->snapshot)
local_set(&buf->data_size, (cur * PAGE_SIZE) + offset);
- else
local_add(to_read, &buf->data_size);
- /* In snapshot mode we have to update the head */
- if (buf->snapshot) {
handle->head = (cur * PAGE_SIZE) + offset;
to_read = buf->nr_pages << PAGE_SHIFT;
- } CS_LOCK(drvdata->base);
- return to_read;
} static const struct coresight_ops_sink tmc_etf_sink_ops = { @@ -515,7 +482,6 @@ static const struct coresight_ops_sink tmc_etf_sink_ops = { .alloc_buffer = tmc_alloc_etf_buffer, .free_buffer = tmc_free_etf_buffer, .set_buffer = tmc_set_etf_buffer,
- .reset_buffer = tmc_reset_etf_buffer, .update_buffer = tmc_update_etf_buffer,
}; diff --git a/include/linux/coresight.h b/include/linux/coresight.h index d950dad5056a..5c9e5fe2bf32 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -199,10 +199,7 @@ struct coresight_ops_sink { int (*set_buffer)(struct coresight_device *csdev, struct perf_output_handle *handle, void *sink_config);
- unsigned long (*reset_buffer)(struct coresight_device *csdev,
struct perf_output_handle *handle,
void *sink_config);
- void (*update_buffer)(struct coresight_device *csdev,
- unsigned long (*update_buffer)(struct coresight_device *csdev, struct perf_output_handle *handle, void *sink_config);
};
2.13.6
Add the necessary support for using the ETR as a sink in ETM perf tracing. We try to make the best use of the available buffer modes to avoid software double buffering.
We can use the perf ring buffer for the ETR directly if all of the conditions below are met: 1) ETR is DMA coherent 2) perf is used in snapshot mode. In full tracing mode, we cannot guarantee that the ETR will stop before it overwrites the data which may not have been consumed by the user. 3) ETR supports save-restore with a scatter-gather mechanism which can use a given set of pages. If we have an in-built TMC ETR Scatter Gather unit, we make use of a circular SG list to restart from a given head. However, we need to align the starting offset to 4K in this case.
If the ETR doesn't support either of these, we fall back to software double buffering (the resulting selection order is sketched after the diff below).
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-tmc-etr.c | 372 +++++++++++++++++++++++- drivers/hwtracing/coresight/coresight-tmc.h | 2 + 2 files changed, 372 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 229c36b7266c..1dfe7cf7c721 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -21,6 +21,9 @@ #include "coresight-priv.h" #include "coresight-tmc.h"
+/* Lower limit for ETR hardware buffer in double buffering mode */ +#define TMC_ETR_PERF_MIN_BUF_SIZE SZ_1M + /* * The TMC ETR SG has a page size of 4K. The SG table contains pointers * to 4KB buffers. However, the OS may be use PAGE_SIZE different from @@ -1328,10 +1331,371 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) return ret; }
+/* + * etr_perf_buffer - Perf buffer used for ETR + * @etr_buf - Actual buffer used by the ETR + * @snaphost - Perf session mode + * @head - handle->head at the beginning of the session. + * @nr_pages - Number of pages in the ring buffer. + * @pages - Pages in the ring buffer. + * @flags - Capabilities of the hardware buffer used in the + * session. If flags == 0, we use software double + * buffering. + */ +struct etr_perf_buffer { + struct etr_buf *etr_buf; + bool snapshot; + unsigned long head; + int nr_pages; + void **pages; + u32 flags; +}; + + +/* + * tmc_etr_setup_perf_buf: Allocate ETR buffer for use by perf. We try to + * use perf ring buffer pages for the ETR when we can. In the worst case + * we fallback to software double buffering. The size of the hardware buffer + * in this case is dependent on the size configured via sysfs, if we can't + * match the perf ring buffer size. We scale down the size by half until + * it reaches a limit of 1M, beyond which we give up. + */ +static struct etr_perf_buffer * +tmc_etr_setup_perf_buf(struct tmc_drvdata *drvdata, int node, int nr_pages, + void **pages, bool snapshot) +{ + int i; + struct etr_buf *etr_buf; + struct etr_perf_buffer *etr_perf; + unsigned long size; + unsigned long buf_flags[] = { + ETR_BUF_F_RESTORE_FULL, + ETR_BUF_F_RESTORE_MINIMAL, + 0, + }; + + etr_perf = kzalloc_node(sizeof(*etr_perf), GFP_KERNEL, node); + if (!etr_perf) + return ERR_PTR(-ENOMEM); + + size = nr_pages << PAGE_SHIFT; + /* + * We can use the perf ring buffer for ETR only if it is coherent + * and in snapshot mode as we cannot control how much data will be + * written before we stop. + */ + if (tmc_etr_has_cap(drvdata, TMC_ETR_COHERENT) && snapshot) { + for (i = 0; buf_flags[i]; i++) { + etr_buf = tmc_alloc_etr_buf(drvdata, size, + buf_flags[i], node, pages); + if (!IS_ERR(etr_buf)) { + etr_perf->flags = buf_flags[i]; + goto done; + } + } + } + + /* + * We have to now fallback to software double buffering. + * The tricky decision is choosing a size for the hardware buffer. + * We could start with drvdata->size (configurable via sysfs) and + * scale it down until we can allocate the data. + */ + etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL); + if (!IS_ERR(etr_buf)) + goto done; + size = drvdata->size; + do { + etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL); + if (!IS_ERR(etr_buf)) + goto done; + size /= 2; + } while (size >= TMC_ETR_PERF_MIN_BUF_SIZE); + + kfree(etr_perf); + return ERR_PTR(-ENOMEM); + +done: + etr_perf->etr_buf = etr_buf; + return etr_perf; +} + + +static void *tmc_etr_alloc_perf_buffer(struct coresight_device *csdev, + int cpu, void **pages, int nr_pages, + bool snapshot) +{ + struct etr_perf_buffer *etr_perf; + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + + if (cpu == -1) + cpu = smp_processor_id(); + + etr_perf = tmc_etr_setup_perf_buf(drvdata, cpu_to_node(cpu), + nr_pages, pages, snapshot); + if (IS_ERR(etr_perf)) { + dev_dbg(drvdata->dev, "Unable to allocate ETR buffer\n"); + return NULL; + } + + etr_perf->snapshot = snapshot; + etr_perf->nr_pages = nr_pages; + etr_perf->pages = pages; + + return etr_perf; +} + +static void tmc_etr_free_perf_buffer(void *config) +{ + struct etr_perf_buffer *etr_perf = config; + + if (etr_perf->etr_buf) + tmc_free_etr_buf(etr_perf->etr_buf); + kfree(etr_perf); +} + +/* + * Pad the etr buffer with barrier packets to align the head to 4K aligned + * offset. 
This is required for ETR SG backed buffers, so that we can rotate + * the buffer easily and avoid a software double buffering. + */ +static s64 tmc_etr_pad_perf_buffer(struct etr_perf_buffer *etr_perf, s64 head) +{ + s64 new_head; + struct etr_buf *etr_buf = etr_perf->etr_buf; + + head %= etr_buf->size; + new_head = ALIGN(head, SZ_4K); + if (head == new_head) + return head; + /* + * If the padding is not aligned to barrier packet size + * we can't do much. + */ + if ((new_head - head) % CORESIGHT_BARRIER_PKT_SIZE) + return -EINVAL; + return tmc_etr_buf_insert_barrier_packets(etr_buf, head, + new_head - head); +} + +static int tmc_etr_set_perf_buffer(struct coresight_device *csdev, + struct perf_output_handle *handle, + void *config) +{ + int rc; + unsigned long flags; + s64 head, new_head; + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + struct etr_perf_buffer *etr_perf = config; + struct etr_buf *etr_buf = etr_perf->etr_buf; + + etr_perf->head = handle->head; + head = etr_perf->head % etr_buf->size; + switch (etr_perf->flags) { + case ETR_BUF_F_RESTORE_MINIMAL: + new_head = tmc_etr_pad_perf_buffer(etr_perf, head); + if (new_head < 0) + return new_head; + if (head != new_head) { + rc = perf_aux_output_skip(handle, new_head - head); + if (rc) + return rc; + etr_perf->head = handle->head; + head = new_head; + } + /* Fall through */ + case ETR_BUF_F_RESTORE_FULL: + rc = tmc_restore_etr_buf(drvdata, etr_buf, head, head, 0); + break; + case 0: + /* Nothing to do here. */ + rc = 0; + break; + default: + dev_warn(drvdata->dev, "Unexpected flags in etr_perf buffer\n"); + WARN_ON(1); + rc = -EINVAL; + } + + /* + * This sink is going to be used in perf mode. No other session can + * grab it from us. So set the perf mode specific data here. This will + * be released just before we disable the sink from update_buffer call + * back. + */ + if (!rc) { + spin_lock_irqsave(&drvdata->spinlock, flags); + if (WARN_ON(drvdata->perf_data)) + rc = -EBUSY; + else + drvdata->perf_data = etr_perf; + spin_unlock_irqrestore(&drvdata->spinlock, flags); + } + return rc; +} + +/* + * tmc_etr_sync_perf_buffer: Copy the actual trace data from the hardware + * buffer to the perf ring buffer. + */ +static void tmc_etr_sync_perf_buffer(struct etr_perf_buffer *etr_perf) +{ + struct etr_buf *etr_buf = etr_perf->etr_buf; + unsigned long bytes, to_copy, head = etr_perf->head; + unsigned long pg_idx, pg_offset, src_offset; + char **dst_pages, *src_buf; + + head = etr_perf->head % (etr_perf->nr_pages << PAGE_SHIFT); + pg_idx = head >> PAGE_SHIFT; + pg_offset = head & (PAGE_SIZE - 1); + dst_pages = (char **)etr_perf->pages; + src_offset = etr_buf->offset; + to_copy = etr_buf->len; + + while (to_copy > 0) { + /* + * We can copy minimum of : + * 1) what is available in the source buffer, + * 2) what is available in the source buffer, before it + * wraps around. + * 3) what is available in the destination page. + * in one iteration. 
+ */ + bytes = tmc_etr_buf_get_data(etr_buf, src_offset, to_copy, + &src_buf); + if (WARN_ON_ONCE(bytes <= 0)) + break; + bytes = min(PAGE_SIZE - pg_offset, bytes); + + memcpy(dst_pages[pg_idx] + pg_offset, src_buf, bytes); + to_copy -= bytes; + /* Move destination pointers */ + pg_offset += bytes; + if (pg_offset == PAGE_SIZE) { + pg_offset = 0; + if (++pg_idx == etr_perf->nr_pages) + pg_idx = 0; + } + + /* Move source pointers */ + src_offset += bytes; + if (src_offset >= etr_buf->size) + src_offset -= etr_buf->size; + } +} + +/* + * XXX: What is the expected behavior here in the following cases ? + * 1) Full trace mode, without double buffering : What should be the size + * reported back when the buffer is full and has wrapped around. Ideally, + * we should report for the lost trace to make sure the "head" in the ring + * buffer comes back to the position as in the trace buffer, rather than + * returning "total size" of the buffer. + * 2) In snapshot mode, should we always return "full buffer size" ? + */ +static unsigned long +tmc_etr_update_perf_buffer(struct coresight_device *csdev, + struct perf_output_handle *handle, + void *config) +{ + bool double_buffer, lost = false; + unsigned long flags, offset, size = 0; + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + struct etr_perf_buffer *etr_perf = config; + struct etr_buf *etr_buf = etr_perf->etr_buf; + + double_buffer = (etr_perf->flags == 0); + + spin_lock_irqsave(&drvdata->spinlock, flags); + if (WARN_ON(drvdata->perf_data != etr_perf)) { + lost = true; + spin_unlock_irqrestore(&drvdata->spinlock, flags); + goto out; + } + + CS_UNLOCK(drvdata->base); + + tmc_flush_and_stop(drvdata); + + tmc_sync_etr_buf(drvdata); + CS_UNLOCK(drvdata->base); + /* Reset perf specific data */ + drvdata->perf_data = NULL; + spin_unlock_irqrestore(&drvdata->spinlock, flags); + + offset = etr_buf->offset + etr_buf->len; + if (offset > etr_buf->size) + offset -= etr_buf->size; + + if (double_buffer) { + /* + * If we use software double buffering, update the ring buffer. + * And the size is what we have in the hardware buffer. + */ + size = etr_buf->len; + tmc_etr_sync_perf_buffer(etr_perf); + } else { + /* + * If the hardware uses perf ring buffer the size of the data + * we have is from the old-head to the current head of the + * buffer. This also means in non-snapshot mode, we have lost + * one-full-buffer-size worth data, if the buffer wraps around. + */ + unsigned long old_head; + + old_head = (etr_perf->head % etr_buf->size); + size = (offset - old_head + etr_buf->size) % etr_buf->size; + } + + /* + * Update handle->head in snapshot mode. Also update the size to the + * hardware buffer size if there was an overflow. + */ + if (etr_perf->snapshot) { + if (double_buffer) + handle->head += size; + else + handle->head = offset; + if (etr_buf->full) + size = etr_buf->size; + } + + lost |= etr_buf->full; +out: + if (lost) + perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED); + return size; +} + static int tmc_enable_etr_sink_perf(struct coresight_device *csdev) { - /* We don't support perf mode yet ! */ - return -EINVAL; + int rc = 0; + unsigned long flags; + struct etr_perf_buffer *etr_perf; + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + + spin_lock_irqsave(&drvdata->spinlock, flags); + /* + * There can be only one writer per sink in perf mode. If the sink + * is already open in SYSFS mode, we can't use it. 
+ */ + if (drvdata->mode != CS_MODE_DISABLED) { + rc = -EBUSY; + goto unlock_out; + } + + etr_perf = drvdata->perf_data; + if (!etr_perf || !etr_perf->etr_buf) { + rc = -EINVAL; + goto unlock_out; + } + + drvdata->mode = CS_MODE_PERF; + tmc_etr_enable_hw(drvdata, etr_perf->etr_buf); + +unlock_out: + spin_unlock_irqrestore(&drvdata->spinlock, flags); + return rc; }
 static int tmc_enable_etr_sink(struct coresight_device *csdev, u32 mode)
@@ -1372,6 +1736,10 @@ static void tmc_disable_etr_sink(struct coresight_device *csdev)
 static const struct coresight_ops_sink tmc_etr_sink_ops = {
	.enable		= tmc_enable_etr_sink,
	.disable	= tmc_disable_etr_sink,
+	.alloc_buffer	= tmc_etr_alloc_perf_buffer,
+	.update_buffer	= tmc_etr_update_perf_buffer,
+	.set_buffer	= tmc_etr_set_perf_buffer,
+	.free_buffer	= tmc_etr_free_perf_buffer,
 };

 const struct coresight_ops tmc_etr_cs_ops = {

diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 2c5b905b6494..06386ceb7866 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -198,6 +198,7 @@ struct etr_buf {
  * @trigger_cntr:	amount of words to store after a trigger.
  * @etr_caps:	Bitmask of capabilities of the TMC ETR, inferred from the
  *		device configuration register (DEVID)
+ * @perf_data:	PERF buffer for ETR.
  * @sysfs_data:	SYSFS buffer for ETR.
  */
 struct tmc_drvdata {
@@ -219,6 +220,7 @@ struct tmc_drvdata {
	u32			trigger_cntr;
	u32			etr_caps;
	struct etr_buf		*sysfs_buf;
+	void			*perf_data;
 };

 struct etr_buf_operations {
On Thu, Oct 19, 2017 at 06:15:53PM +0100, Suzuki K Poulose wrote:
Add necessary support for using ETR as a sink in ETM perf tracing. We try to make the best use of the available buffer modes to avoid software double buffering.
We can use the perf ring buffer for ETR directly if all of the conditions below are met :
- ETR is DMA coherent
- perf is used in snapshot mode. In full tracing mode, we cannot guarantee that the ETR will stop before it overwrites the data which may not have been consumed by the user.
- ETR supports save-restore with a scatter-gather mechanism which can use a given set of pages. If we have an in-built TMC ETR Scatter-Gather unit, we make use of a circular SG list to restart from a given head. However, we need to align the starting offset to 4K in this case.
If the ETR doesn't support either of these, we fall back to software double buffering.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
drivers/hwtracing/coresight/coresight-tmc-etr.c | 372 +++++++++++++++++++++++- drivers/hwtracing/coresight/coresight-tmc.h | 2 + 2 files changed, 372 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 229c36b7266c..1dfe7cf7c721 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -21,6 +21,9 @@
 #include "coresight-priv.h"
 #include "coresight-tmc.h"

+/* Lower limit for ETR hardware buffer in double buffering mode */
+#define TMC_ETR_PERF_MIN_BUF_SIZE	SZ_1M

 /*
  * The TMC ETR SG has a page size of 4K. The SG table contains pointers
  * to 4KB buffers. However, the OS may use a PAGE_SIZE different from
@@ -1328,10 +1331,371 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev)
	return ret;
 }

+/*
+ * etr_perf_buffer - Perf buffer used for ETR
+ * @etr_buf	- Actual buffer used by the ETR
+ * @snapshot	- Perf session mode
+ * @head	- handle->head at the beginning of the session.
+ * @nr_pages	- Number of pages in the ring buffer.
+ * @pages	- Pages in the ring buffer.
+ * @flags	- Capabilities of the hardware buffer used in the
+ *		  session. If flags == 0, we use software double
+ *		  buffering.
+ */
+struct etr_perf_buffer {
+	struct etr_buf		*etr_buf;
+	bool			snapshot;
+	unsigned long		head;
+	int			nr_pages;
+	void			**pages;
+	u32			flags;
+};
Please move this to the top, just below the declaration for etr_sg_table.
+/*
+ * tmc_etr_setup_perf_buf: Allocate ETR buffer for use by perf. We try to
+ * use perf ring buffer pages for the ETR when we can. In the worst case
+ * we fallback to software double buffering. The size of the hardware buffer
+ * in this case is dependent on the size configured via sysfs, if we can't
+ * match the perf ring buffer size. We scale down the size by half until
+ * it reaches a limit of 1M, beyond which we give up.
+ */
+static struct etr_perf_buffer *
+tmc_etr_setup_perf_buf(struct tmc_drvdata *drvdata, int node, int nr_pages,
+		       void **pages, bool snapshot)
+{
+	int i;
+	struct etr_buf *etr_buf;
+	struct etr_perf_buffer *etr_perf;
+	unsigned long size;
+	unsigned long buf_flags[] = {
+		ETR_BUF_F_RESTORE_FULL,
+		ETR_BUF_F_RESTORE_MINIMAL,
+		0,
+	};
+
+	etr_perf = kzalloc_node(sizeof(*etr_perf), GFP_KERNEL, node);
+	if (!etr_perf)
+		return ERR_PTR(-ENOMEM);
+
+	size = nr_pages << PAGE_SHIFT;
+	/*
+	 * We can use the perf ring buffer for ETR only if it is coherent
+	 * and in snapshot mode as we cannot control how much data will be
+	 * written before we stop.
+	 */
+	if (tmc_etr_has_cap(drvdata, TMC_ETR_COHERENT) && snapshot) {
+		for (i = 0; buf_flags[i]; i++) {
+			etr_buf = tmc_alloc_etr_buf(drvdata, size,
+						    buf_flags[i], node, pages);
+			if (!IS_ERR(etr_buf)) {
+				etr_perf->flags = buf_flags[i];
+				goto done;
+			}
+		}
+	}
+
+	/*
+	 * We have to now fallback to software double buffering.
+	 * The tricky decision is choosing a size for the hardware buffer.
+	 * We could start with drvdata->size (configurable via sysfs) and
+	 * scale it down until we can allocate the data.
+	 */
+	etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL);
+	if (!IS_ERR(etr_buf))
+		goto done;
+
+	size = drvdata->size;
+	do {
+		etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL);
+		if (!IS_ERR(etr_buf))
+			goto done;
+		size /= 2;
+	} while (size >= TMC_ETR_PERF_MIN_BUF_SIZE);
+
+	kfree(etr_perf);
+	return ERR_PTR(-ENOMEM);
+
+done:
+	etr_perf->etr_buf = etr_buf;
+	return etr_perf;
+}
+static void *tmc_etr_alloc_perf_buffer(struct coresight_device *csdev,
+				       int cpu, void **pages, int nr_pages,
+				       bool snapshot)
+{
+	struct etr_perf_buffer *etr_perf;
+	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+
+	if (cpu == -1)
+		cpu = smp_processor_id();
+
+	etr_perf = tmc_etr_setup_perf_buf(drvdata, cpu_to_node(cpu),
+					  nr_pages, pages, snapshot);
+	if (IS_ERR(etr_perf)) {
+		dev_dbg(drvdata->dev, "Unable to allocate ETR buffer\n");
+		return NULL;
+	}
+
+	etr_perf->snapshot = snapshot;
+	etr_perf->nr_pages = nr_pages;
+	etr_perf->pages = pages;
+
+	return etr_perf;
+}
+
+static void tmc_etr_free_perf_buffer(void *config)
+{
+	struct etr_perf_buffer *etr_perf = config;
+
+	if (etr_perf->etr_buf)
+		tmc_free_etr_buf(etr_perf->etr_buf);
+	kfree(etr_perf);
+}
+/*
+ * Pad the etr buffer with barrier packets to align the head to 4K aligned
+ * offset. This is required for ETR SG backed buffers, so that we can rotate
+ * the buffer easily and avoid a software double buffering.
+ */
+static s64 tmc_etr_pad_perf_buffer(struct etr_perf_buffer *etr_perf, s64 head)
+{
+	s64 new_head;
+	struct etr_buf *etr_buf = etr_perf->etr_buf;
+
+	head %= etr_buf->size;
+	new_head = ALIGN(head, SZ_4K);
+	if (head == new_head)
+		return head;
+	/*
+	 * If the padding is not aligned to barrier packet size
+	 * we can't do much.
+	 */
+	if ((new_head - head) % CORESIGHT_BARRIER_PKT_SIZE)
+		return -EINVAL;
+	return tmc_etr_buf_insert_barrier_packets(etr_buf, head,
+						  new_head - head);
+}
+static int tmc_etr_set_perf_buffer(struct coresight_device *csdev,
+				   struct perf_output_handle *handle,
+				   void *config)
+{
+	int rc;
+	unsigned long flags;
+	s64 head, new_head;
+	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+	struct etr_perf_buffer *etr_perf = config;
+	struct etr_buf *etr_buf = etr_perf->etr_buf;
+
+	etr_perf->head = handle->head;
+	head = etr_perf->head % etr_buf->size;
+	switch (etr_perf->flags) {
+	case ETR_BUF_F_RESTORE_MINIMAL:
+		new_head = tmc_etr_pad_perf_buffer(etr_perf, head);
+		if (new_head < 0)
+			return new_head;
+		if (head != new_head) {
+			rc = perf_aux_output_skip(handle, new_head - head);
+			if (rc)
+				return rc;
+			etr_perf->head = handle->head;
+			head = new_head;
+		}
+		/* Fall through */
+	case ETR_BUF_F_RESTORE_FULL:
+		rc = tmc_restore_etr_buf(drvdata, etr_buf, head, head, 0);
+		break;
+	case 0:
+		/* Nothing to do here. */
+		rc = 0;
+		break;
+	default:
+		dev_warn(drvdata->dev, "Unexpected flags in etr_perf buffer\n");
+		WARN_ON(1);
+		rc = -EINVAL;
+	}
+
+	/*
+	 * This sink is going to be used in perf mode. No other session can
+	 * grab it from us. So set the perf mode specific data here. This will
+	 * be released just before we disable the sink from update_buffer call
+	 * back.
+	 */
+	if (!rc) {
+		spin_lock_irqsave(&drvdata->spinlock, flags);
+		if (WARN_ON(drvdata->perf_data))
+			rc = -EBUSY;
+		else
+			drvdata->perf_data = etr_perf;
+		spin_unlock_irqrestore(&drvdata->spinlock, flags);
+	}
+	return rc;
+}
+/*
+ * tmc_etr_sync_perf_buffer: Copy the actual trace data from the hardware
+ * buffer to the perf ring buffer.
+ */
+static void tmc_etr_sync_perf_buffer(struct etr_perf_buffer *etr_perf)
+{
+	struct etr_buf *etr_buf = etr_perf->etr_buf;
+	unsigned long bytes, to_copy, head = etr_perf->head;
+	unsigned long pg_idx, pg_offset, src_offset;
+	char **dst_pages, *src_buf;
+
+	head = etr_perf->head % (etr_perf->nr_pages << PAGE_SHIFT);
+	pg_idx = head >> PAGE_SHIFT;
+	pg_offset = head & (PAGE_SIZE - 1);
+	dst_pages = (char **)etr_perf->pages;
+	src_offset = etr_buf->offset;
+	to_copy = etr_buf->len;
+
+	while (to_copy > 0) {
+		/*
+		 * We can copy minimum of :
+		 *  1) what is available in the source buffer,
+		 *  2) what is available in the source buffer, before it
+		 *     wraps around.
+		 *  3) what is available in the destination page.
+		 * in one iteration.
+		 */
+		bytes = tmc_etr_buf_get_data(etr_buf, src_offset, to_copy,
+					     &src_buf);
+		if (WARN_ON_ONCE(bytes <= 0))
+			break;
+		bytes = min(PAGE_SIZE - pg_offset, bytes);
+
+		memcpy(dst_pages[pg_idx] + pg_offset, src_buf, bytes);
+		to_copy -= bytes;
+
+		/* Move destination pointers */
+		pg_offset += bytes;
+		if (pg_offset == PAGE_SIZE) {
+			pg_offset = 0;
+			if (++pg_idx == etr_perf->nr_pages)
+				pg_idx = 0;
+		}
+
+		/* Move source pointers */
+		src_offset += bytes;
+		if (src_offset >= etr_buf->size)
+			src_offset -= etr_buf->size;
+	}
+}
+/*
+ * XXX: What is the expected behavior here in the following cases ?
+ * 1) Full trace mode, without double buffering : What should be the size
+ *    reported back when the buffer is full and has wrapped around. Ideally,
+ *    we should report for the lost trace to make sure the "head" in the ring
+ *    buffer comes back to the position as in the trace buffer, rather than
+ *    returning "total size" of the buffer.
+ * 2) In snapshot mode, should we always return "full buffer size" ?
+ */
+static unsigned long
+tmc_etr_update_perf_buffer(struct coresight_device *csdev,
+			   struct perf_output_handle *handle,
+			   void *config)
+{
+	bool double_buffer, lost = false;
+	unsigned long flags, offset, size = 0;
+	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+	struct etr_perf_buffer *etr_perf = config;
+	struct etr_buf *etr_buf = etr_perf->etr_buf;
+
+	double_buffer = (etr_perf->flags == 0);
+
+	spin_lock_irqsave(&drvdata->spinlock, flags);
+	if (WARN_ON(drvdata->perf_data != etr_perf)) {
+		lost = true;
If we are here something went seriously wrong - I don't think much more can be done other than a WARN_ON()...
+		spin_unlock_irqrestore(&drvdata->spinlock, flags);
+		goto out;
+	}
+
+	CS_UNLOCK(drvdata->base);
+
+	tmc_flush_and_stop(drvdata);
+	tmc_sync_etr_buf(drvdata);
+
+	CS_UNLOCK(drvdata->base);
+
+	/* Reset perf specific data */
+	drvdata->perf_data = NULL;
+	spin_unlock_irqrestore(&drvdata->spinlock, flags);
+
+	offset = etr_buf->offset + etr_buf->len;
+	if (offset > etr_buf->size)
+		offset -= etr_buf->size;
+
+	if (double_buffer) {
+		/*
+		 * If we use software double buffering, update the ring buffer.
+		 * And the size is what we have in the hardware buffer.
+		 */
+		size = etr_buf->len;
+		tmc_etr_sync_perf_buffer(etr_perf);
+	} else {
+		/*
+		 * If the hardware uses perf ring buffer the size of the data
+		 * we have is from the old-head to the current head of the
+		 * buffer. This also means in non-snapshot mode, we have lost
+		 * one-full-buffer-size worth data, if the buffer wraps around.
+		 */
+		unsigned long old_head;
+
+		old_head = (etr_perf->head % etr_buf->size);
+		size = (offset - old_head + etr_buf->size) % etr_buf->size;
+	}
+
+	/*
+	 * Update handle->head in snapshot mode. Also update the size to the
+	 * hardware buffer size if there was an overflow.
+	 */
+	if (etr_perf->snapshot) {
+		if (double_buffer)
+			handle->head += size;
+		else
+			handle->head = offset;
+		if (etr_buf->full)
+			size = etr_buf->size;
+	}
+
+	lost |= etr_buf->full;
+out:
+	if (lost)
+		perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
+	return size;
+}
 static int tmc_enable_etr_sink_perf(struct coresight_device *csdev)
 {
-	/* We don't support perf mode yet ! */
-	return -EINVAL;
+	int rc = 0;
+	unsigned long flags;
+	struct etr_perf_buffer *etr_perf;
+	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+
+	spin_lock_irqsave(&drvdata->spinlock, flags);
+	/*
+	 * There can be only one writer per sink in perf mode. If the sink
+	 * is already open in SYSFS mode, we can't use it.
+	 */
+	if (drvdata->mode != CS_MODE_DISABLED) {
+		rc = -EBUSY;
+		goto unlock_out;
+	}
+
+	etr_perf = drvdata->perf_data;
+	if (!etr_perf || !etr_perf->etr_buf) {
+		rc = -EINVAL;
This is a serious malfunction - I would WARN_ON() before unlocking.
+		goto unlock_out;
+	}
+
+	drvdata->mode = CS_MODE_PERF;
+	tmc_etr_enable_hw(drvdata, etr_perf->etr_buf);
+
+unlock_out:
+	spin_unlock_irqrestore(&drvdata->spinlock, flags);
+	return rc;
 }

 static int tmc_enable_etr_sink(struct coresight_device *csdev, u32 mode)
@@ -1372,6 +1736,10 @@ static void tmc_disable_etr_sink(struct coresight_device *csdev)
 static const struct coresight_ops_sink tmc_etr_sink_ops = {
	.enable		= tmc_enable_etr_sink,
	.disable	= tmc_disable_etr_sink,
+	.alloc_buffer	= tmc_etr_alloc_perf_buffer,
+	.update_buffer	= tmc_etr_update_perf_buffer,
+	.set_buffer	= tmc_etr_set_perf_buffer,
+	.free_buffer	= tmc_etr_free_perf_buffer,
 };

 const struct coresight_ops tmc_etr_cs_ops = {

diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 2c5b905b6494..06386ceb7866 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -198,6 +198,7 @@ struct etr_buf {
  * @trigger_cntr:	amount of words to store after a trigger.
  * @etr_caps:	Bitmask of capabilities of the TMC ETR, inferred from the
  *		device configuration register (DEVID)
+ * @perf_data:	PERF buffer for ETR.
  * @sysfs_data:	SYSFS buffer for ETR.
  */
 struct tmc_drvdata {
@@ -219,6 +220,7 @@ struct tmc_drvdata {
	u32			trigger_cntr;
	u32			etr_caps;
	struct etr_buf		*sysfs_buf;
+	void			*perf_data;
This is a temporary place holder while an event is active, i.e theoretically it doesn't stay the same for the entire trace session. In situations where there could be one ETR per CPU, the same ETR could be used to serve more than one trace session (since only one session can be active at a time on a CPU). As such I would call it curr_perf_data or something similar. I'd also make that clear in the above documentation.
Have you tried your implementation on a dragonboard or a Hikey?
Thanks, Mathieu
On 07/11/17 00:24, Mathieu Poirier wrote:
On Thu, Oct 19, 2017 at 06:15:53PM +0100, Suzuki K Poulose wrote:
Add necessary support for using ETR as a sink in ETM perf tracing. We try to make the best use of the available buffer modes to avoid software double buffering.
We can use the perf ring buffer for ETR directly if all of the conditions below are met :
- ETR is DMA coherent
- perf is used in snapshot mode. In full tracing mode, we cannot guarantee that the ETR will stop before it overwrites the data which may not have been consumed by the user.
- ETR supports save-restore with a scatter-gather mechanism which can use a given set of pages. If we have an in-built TMC ETR Scatter-Gather unit, we make use of a circular SG list to restart from a given head. However, we need to align the starting offset to 4K in this case.
If the ETR doesn't support either of these, we fall back to software double buffering.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
+/*
+ * etr_perf_buffer - Perf buffer used for ETR
+ * @etr_buf	- Actual buffer used by the ETR
+ * @snapshot	- Perf session mode
+ * @head	- handle->head at the beginning of the session.
+ * @nr_pages	- Number of pages in the ring buffer.
+ * @pages	- Pages in the ring buffer.
+ * @flags	- Capabilities of the hardware buffer used in the
+ *		  session. If flags == 0, we use software double
+ *		  buffering.
+ */
+struct etr_perf_buffer {
+	struct etr_buf		*etr_buf;
+	bool			snapshot;
+	unsigned long		head;
+	int			nr_pages;
+	void			**pages;
+	u32			flags;
+};
Please move this to the top, just below the declaration for etr_sg_table.
Sure.
+/*
+ * XXX: What is the expected behavior here in the following cases ?
+ * 1) Full trace mode, without double buffering : What should be the size
+ *    reported back when the buffer is full and has wrapped around. Ideally,
+ *    we should report for the lost trace to make sure the "head" in the ring
+ *    buffer comes back to the position as in the trace buffer, rather than
+ *    returning "total size" of the buffer.
+ * 2) In snapshot mode, should we always return "full buffer size" ?
+ */
+static unsigned long
+tmc_etr_update_perf_buffer(struct coresight_device *csdev,
+			   struct perf_output_handle *handle,
+			   void *config)
+{
+	bool double_buffer, lost = false;
+	unsigned long flags, offset, size = 0;
+	struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
+	struct etr_perf_buffer *etr_perf = config;
+	struct etr_buf *etr_buf = etr_perf->etr_buf;
+
+	double_buffer = (etr_perf->flags == 0);
+
+	spin_lock_irqsave(&drvdata->spinlock, flags);
+	if (WARN_ON(drvdata->perf_data != etr_perf)) {
+		lost = true;
If we are here something went seriously wrong - I don't think much more can be done other than a WARN_ON()...
right. I will do it for the case below as well.
static int tmc_enable_etr_sink_perf(struct coresight_device *csdev)
{
	...
	etr_perf = drvdata->perf_data;
	if (!etr_perf || !etr_perf->etr_buf) {
		rc = -EINVAL;
This is a serious malfunction - I would WARN_ON() before unlocking.
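For reference, the agreed change might look like the following minimal sketch (only the WARN_ON() placement changes; everything else is the quoted function as-is):

	etr_perf = drvdata->perf_data;
	/*
	 * Getting here without a valid etr_perf/etr_buf means the
	 * set_buffer()/enable ordering went wrong - a driver bug,
	 * so warn loudly before bailing out with the lock held.
	 */
	if (WARN_ON(!etr_perf || !etr_perf->etr_buf)) {
		rc = -EINVAL;
		goto unlock_out;
	}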
diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h
index 2c5b905b6494..06386ceb7866 100644
--- a/drivers/hwtracing/coresight/coresight-tmc.h
+++ b/drivers/hwtracing/coresight/coresight-tmc.h
@@ -198,6 +198,7 @@ struct etr_buf {
  * @trigger_cntr:	amount of words to store after a trigger.
  * @etr_caps:	Bitmask of capabilities of the TMC ETR, inferred from the
  *		device configuration register (DEVID)
+ * @perf_data:	PERF buffer for ETR.
  * @sysfs_data:	SYSFS buffer for ETR.
  */
 struct tmc_drvdata {
@@ -219,6 +220,7 @@ struct tmc_drvdata {
	u32			trigger_cntr;
	u32			etr_caps;
	struct etr_buf		*sysfs_buf;
+	void			*perf_data;
This is a temporary place holder while an event is active, i.e theoretically it doesn't stay the same for the entire trace session. In situations where there could be one ETR per CPU, the same ETR could be used to serve more than one trace session (since only one session can be active at a time on a CPU). As such I would call it curr_perf_data or something similar. I'd also make that clear in the above documentation.
You're right. However, from the ETR's perspective, it doesn't care how the perf uses it. So from the ETR driver side, it still is something used by the perf mode. All it stands for is the buffer to be used when enabled in perf mode. I could definitely add some comment to describe this. But I am not sure if we have to rename the variable.
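If we keep the name, the transient nature could at least be spelled out in the kerneldoc; something along these lines (a sketch, not final wording):

 * @perf_data:	ETR buffer to be used when the sink is enabled in perf
 *		mode. Only valid for the currently active session: it is
 *		set up by set_buffer() and released from update_buffer(),
 *		so it does not persist across trace sessions.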
Have you tried your implementation on a dragonboard or a Hikey?
No, I haven't. But Mike and Rob are trying it on the Dragonboard and HiKey respectively. We are hitting some issues in the Scatter-Gather mode, which we are still debugging. The SG table looks correct, just that the ETR hangs up. It works fine in the flat memory mode. So it is something to do with the READ (SG table pointers) vs WRITE (trace data) pressure on the ETR.
One change I am working on with the perf buffer is to limit the "size" of the trace buffer used by the ETR (in the case of the perf ring buffer) to handle->size. Otherwise we could be corrupting collected trace that is waiting for consumption by the user. This is easily possible with our SG table. But with the flat buffer, we have to limit the size to the minimum of (handle->size, space-in-circular-buffer-before-wrapping).
In either case, we could lose data if we overflow the buffer, something we can't help at the moment.
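Roughly, that cap would be computed along these lines (a sketch only - tmc_etr_usable_size() is a hypothetical helper; handle->head and handle->size are the usual perf AUX handle fields):

/*
 * Sketch: bound what the ETR may write so it cannot trample trace that
 * perf has not consumed yet. An SG-backed buffer can wrap through the
 * table, so handle->size alone is the limit; a flat buffer must also
 * stop at the end of the contiguous region.
 */
static unsigned long tmc_etr_usable_size(struct etr_buf *etr_buf,
					 struct perf_output_handle *handle,
					 bool flat_buf)
{
	unsigned long head = handle->head % etr_buf->size;
	unsigned long to_wrap = etr_buf->size - head;

	return flat_buf ? min_t(unsigned long, handle->size, to_wrap)
			: handle->size;
}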
Suzuki
Hi Suzuki, Mathieu,
A follow up on Dragonboard issues...
===== Using Suzuki's debug code and some home-spun updates of my own, I've got the following logging out of a typical ETR-SG session from the DB410. The session was initiated using the command line './perf record -e cs_etm/@826000.etr/ --per-thread sort'.
root@linaro-developer:~# [ 122.075896] tmc_etr_sg_table_dump:455: Table base; Vaddr:ffffff800978d000; DAddr:0xb10b1000; Table Pages 1; Table Entries 1024 [ 122.075932] tmc_etr_sg_table_dump:462: 00000: ffffff800978d000:[N] 0xb14b0000 [ 122.086281] tmc_etr_sg_table_dump:462: 00001: ffffff800978d004:[N] 0xb14b1000 [ 122.093410] tmc_etr_sg_table_dump:462: 00002: ffffff800978d008:[N] 0xb14b2000 ----- snip ----- [ 129.438535] tmc_etr_sg_table_dump:462: 01021: ffffff800978dff4:[N] 0xb10ad000 [ 129.445741] tmc_etr_sg_table_dump:475: 01022: ### ffffff800978dff8:[L] 0xb10ae000 ### [ 129.452945] tmc_etr_sg_table_dump:479: 01023: empty line [ 129.460840] tmc_etr_sg_table_dump:485: ******* End of Table ***** [ 129.466333] tmc_etr_alloc_sg_buf:822: coresight-tmc 826000.etr: ETR - alloc SG buffer
== SG table looks fine - I've removed the last circular link used for rotating the table, as that rotation is not happening anyway and I wanted to eliminate it as an issue.
== first pass trace capture - long running program
[ 129.481359] tmc_etr_enable_hw:1239: Set DBA 0xb10b1000; AXICTL 0x000007bd [ 129.484297] tmc_etr_enable_hw:1260: exit() [ 129.491251] tmc_enable_etf_link:306: coresight-tmc 825000.etf: TMC-ETF enabled [ 129.794350] tmc_sync_etr_buf:1124: enter() [ 129.794377] tmc_sync_etr_buf:1131: ETR regs: RRP=0xb14b0000, RWP=0xB14B0000, STS=0x0000003C:full=false
== This shows the data page values for the first SG page from the table have been loaded into the RRP / RWP registers - an indication that the SG table has been read.
== However, the status indicates that the buffer is empty and that the AXI bus has returned an error (bit 5). (Messing with permissions made no difference.)
== The error is ignored by the driver (but I think the system is irretrievably broken now anyway).
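== For reference, these are the STS bits being decoded above (bit positions per the TMC TRM - only some of them have named macros in coresight-tmc.h). A debug-helper sketch, not code from the patchset:

static void tmc_etr_show_sts(struct device *dev, u32 sts)
{
	/* STS=0x3C above => TMCReady|FtEmpty|Empty|MemErr: AXI error, no data */
	dev_dbg(dev, "STS=0x%08x%s%s%s%s%s%s\n", sts,
		sts & BIT(0) ? " Full" : "",
		sts & BIT(1) ? " Triggered" : "",
		sts & BIT(2) ? " TMCReady" : "",
		sts & BIT(3) ? " FtEmpty" : "",
		sts & BIT(4) ? " Empty" : "",
		sts & BIT(5) ? " MemErr" : "");
}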
[ 129.794383] tmc_etr_sync_sg_buf:849: enter() [ 129.806616] tmc_etr_sync_sg_buf:876: WARNING: Buffer Data Len == 0; force sync some pages [ 129.811051] tmc_etr_sync_sg_buf:881: exit() [ 129.819116] tmc_etr_sg_dump_pages:505: PG(0) : 0xcdcdcdcd::0xcdcdcdcd [ 129.823112] tmc_etr_sg_dump_pages:505: PG(1) : 0xcdcdcdcd::0xcdcdcdcd [ 129.829709] tmc_etr_sg_dump_pages:505: PG(2) : 0xcdcdcdcd::0xcdcdcdcd [ 129.836133] tmc_etr_sg_dump_pages:505: PG(3) : 0xcdcdcdcd::0xcdcdcdcd
== 1st 4 pages were pre-filled - seem untouched
[ 129.842556] tmc_sync_etr_buf:1143: exit() [ 129.848977] tmc_etr_sync_perf_buffer:1635: sync_perf 16384 bytes [ 129.853016] tmc_etf_print_regs_debug:37: TMC-ETF regs; RRP:0xF20 RWP:0xF20; Status:0x10
== ETF - operating as a FIFO link - has received data and has been emptied, so the trace system has been running.
[ 129.859058] tmc_disable_etf_link:327: coresight-tmc 825000.etf: TMC-ETF disabled [ 129.866778] tmc_etr_disable_hw:1322: enter() [ 129.874410] tmc_etr_disable_hw:1336: exit() [ 129.878666] tmc_disable_etr_sink:1815: coresight-tmc 826000.etr: TMC-ETR disabled
== At this point we have the AXI bus errored out, and apparently no trace sent to the ETR memory.
== Second pass - perf tries to restart the trace.
[ 129.882636] tmc_etr_enable_hw:1197: enter() [ 129.890230] tmc_etr_enable_hw:1239: Set DBA 0xb10b1000; AXICTL 0x000007bd [ 129.894205] tmc_etr_enable_hw:1260: exit() [ 129.901157] tmc_enable_etf_link:306: coresight-tmc 825000.etf: TMC-ETF enabled [ 129.922498] coresight-tmc 826000.etr: timeout while waiting for completion of Manual Flush [ 129.922672] coresight-tmc 826000.etr: timeout while waiting for TMC to be Ready [ 129.929645] tmc_sync_etr_buf:1124: enter() [ 129.936850] tmc_sync_etr_buf:1131: ETR regs: RRP=0xb10b1000, RWP=0xB10B1000, STS=0x00000010:full=false
== This is bad - somehow the ETR regs have been set to the table base address, not the data page base address. No apparent AXI bus fault at this point,
== but it is likely that the restart cleared the bit and the AXI is no longer responding.
[ 129.936856] tmc_etr_sync_sg_buf:849: enter() [ 129.950311] coresight-tmc 826000.etr: Unable to map RRP b10b1000 to offset
== driver error in response to invalid RRP value
[ 129.954733] tmc_etr_sg_dump_pages:505: PG(0) : 0xcdcdcdcd::0xcdcdcdcd [ 129.961417] tmc_etr_sg_dump_pages:505: PG(1) : 0xcdcdcdcd::0xcdcdcdcd [ 129.967928] tmc_etr_sg_dump_pages:505: PG(2) : 0xcdcdcdcd::0xcdcdcdcd [ 129.974350] tmc_etr_sg_dump_pages:505: PG(3) : 0xcdcdcdcd::0xcdcdcdcd [ 129.980772] tmc_sync_etr_buf:1143: exit() [ 129.987194] tmc_etr_sync_perf_buffer:1635: sync_perf 0 bytes [ 129.991201] tmc_etf_print_regs_debug:37: TMC-ETF regs; RRP:0x1C0 RWP:0x1C0; Status:0x1
== ETF is full - still trying to collect trace data.
[ 129.997066] coresight-tmc 825000.etf: timeout while waiting for completion of Manual Flush [ 130.004789] coresight-tmc 825000.etf: timeout while waiting for TMC to be Ready [ 130.012896] tmc_disable_etf_link:327: coresight-tmc 825000.etf: TMC-ETF disabled [ 130.020099] tmc_etr_disable_hw:1322: enter() [ 130.027879] coresight-tmc 826000.etr: timeout while waiting for completion of Manual Flush [ 130.032135] coresight-tmc 826000.etr: timeout while waiting for TMC to be Ready
== flushing not working at any point in the system here - probably due to incorrect ETR operation - can't flush if downstream not accepting data.
[ 130.040062] tmc_etr_disable_hw:1336: exit() [ 130.047266] tmc_disable_etr_sink:1815: coresight-tmc 826000.etr: TMC-ETR disabled
== Beyond this point, things pretty much repeat, but other systems start failing too - missing interrupts etc.
== Symptoms would seem to indicate a locked out AXI bus - but that is pure speculation.
== Eventually, the system automatically reboots itself - some watchdog element I guess.
=====
Conclusion:-
At this point ETR-SG is non-operational for unknown reasons - most likely a memory system issue; whether this is a software configuration problem or a hardware fault is not yet known. However, this does raise a question about upstreaming this patchset: as it stands it will break existing ETR functionality on the DB410c (and possibly the HiKey 960). Currently the patchset decides between flat mapped and SG based on buffer size. I would like to see a parameter added, something like an SG threshold size, above which the implementation will choose SG and below which it will choose flat mapped. There also needs to be a special value - 0/-1 - for which SG is always disabled for the device. If the parameter is available in device tree and sysfs then it will give the control needed should the ETR-SG issue on the currently non-operational platforms turn out to be insurmountable. At the very least it will allow the current patchset to be merged in a way that preserves what currently works until a solution is found.
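As a sketch of what that control might look like in the driver (sg_threshold is a hypothetical per-device field, fed from DT/sysfs; TMC_ETR_SG is assumed to be the SG capability bit in the tmc_etr_has_cap() scheme this series already uses):

static bool tmc_etr_can_use_sg(struct tmc_drvdata *drvdata,
			       unsigned long size)
{
	if (!tmc_etr_has_cap(drvdata, TMC_ETR_SG))
		return false;
	/* 0 or -1 => SG always disabled for this device */
	if (!drvdata->sg_threshold || drvdata->sg_threshold == ~0UL)
		return false;
	return size >= drvdata->sg_threshold;
}

Flat-mapped allocation would then remain the default below the threshold, which keeps today's working configurations untouched.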
Regards
Mike
On 7 November 2017 at 08:17, Mike Leach mike.leach@linaro.org wrote:
Right, the patchset won't go upstream if it breaks things. Before thinking about mitigation factors I'd like to see what the root cause of the problem is - when we have that we can discuss the best way to go around it.
On 19/10/17 18:15, Suzuki K Poulose wrote:
Just to be clear:
The mainline perf tool doesn't support the perf AUX API for CoreSight yet. I have used the perf tool from the perf-OpenCSD [1] project to control the tracing.
[1] https://git.linaro.org/people/mathieu.poirier/coresight.git perf-opencsd-4.14-rc1
Suzuki