This series adds the support for using the tmc-etr in perf mode for storing the trace to system RAM. The ETR uses a separate buffer (double buffering) for storing the trace. This is copied back to the ring buffer when the event is stopped. We try to match the ETR buffer to the larger of perf ring buffer or the size configured for the ETR via sysfs. This allows tuning the buffer size to prevent overflows and loosing trace data, as we don't have overflow interrupt support (yet).
Applies on coresight/next
Changes since [0] : - Drop buffer rotation logic for etr-buf - Do not use perf ring buffer. (Add support later) - Fix handling of sink, preventing mixed modes of operation.
[0] - TMC ETR perf support - http://lists.infradead.org/pipermail/linux-arm-kernel/2018-May/574875.html
Suzuki K Poulose (6): coresight: Fix handling of sinks coresight: tmc-etr: Handle driver mode specific ETR buffers coresight: tmc-etr: Relax collection of trace from sysfs mode coresight: Convert driver messages to dev_dbg coresight: perf: Remove reset_buffer call back for sinks coresight: etm-perf: Add support for ETR backend
.../coresight/coresight-dynamic-replicator.c | 4 +- drivers/hwtracing/coresight/coresight-etb10.c | 62 +--- drivers/hwtracing/coresight/coresight-etm-perf.c | 9 +- drivers/hwtracing/coresight/coresight-etm3x.c | 4 +- drivers/hwtracing/coresight/coresight-etm4x.c | 4 +- drivers/hwtracing/coresight/coresight-funnel.c | 4 +- drivers/hwtracing/coresight/coresight-replicator.c | 4 +- drivers/hwtracing/coresight/coresight-stm.c | 4 +- drivers/hwtracing/coresight/coresight-tmc-etf.c | 66 +--- drivers/hwtracing/coresight/coresight-tmc-etr.c | 337 +++++++++++++++++++-- drivers/hwtracing/coresight/coresight-tmc.c | 4 +- drivers/hwtracing/coresight/coresight-tmc.h | 4 + drivers/hwtracing/coresight/coresight-tpiu.c | 4 +- drivers/hwtracing/coresight/coresight.c | 22 +- include/linux/coresight.h | 5 +- 15 files changed, 375 insertions(+), 162 deletions(-)
The coresight components could be operated either in sysfs mode or in perf mode. For some of the components, the mode of operation doesn't matter as they simply relay the data to the next component in the trace path. But for sinks, they need to be able to provide the trace data back to the user. Thus we need to make sure that "mode" is handled appropriately. e.g, the sysfs mode could have multiple sources driving the trace data, while perf mode doesn't allow sharing the sink.
The coresight_enable_sink() however doesn't really allow this check to trigger as it skips the "enable_sink" callback if the component is already enabled, irrespective of the mode. This could cause mixing of data from different modes or even same mode (in perf), if the sources are different. Also, if we fail to enable the sink while enabling a path (where sink is the first component enabled), we could end up in disabling the components in the "entire" path which were not enabled in this trial, causing disruptions in the existing trace paths.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight.c | 22 +++++++++++++++------- 1 file changed, 15 insertions(+), 7 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight.c b/drivers/hwtracing/coresight/coresight.c index 3e07fd3..c0dabbd 100644 --- a/drivers/hwtracing/coresight/coresight.c +++ b/drivers/hwtracing/coresight/coresight.c @@ -132,12 +132,14 @@ static int coresight_enable_sink(struct coresight_device *csdev, u32 mode) { int ret;
- if (!csdev->enable) { - if (sink_ops(csdev)->enable) { - ret = sink_ops(csdev)->enable(csdev, mode); - if (ret) - return ret; - } + /* + * We need to make sure the "new" session is compatible with the + * existing "mode" of operation. + */ + if (sink_ops(csdev)->enable) { + ret = sink_ops(csdev)->enable(csdev, mode); + if (ret) + return ret; csdev->enable = true; }
@@ -339,8 +341,14 @@ int coresight_enable_path(struct list_head *path, u32 mode) switch (type) { case CORESIGHT_DEV_TYPE_SINK: ret = coresight_enable_sink(csdev, mode); + /* + * Sink is the first component turned on. If we + * failed to enable the sink, there are no components + * that need disabling. Disabling the path here + * would mean we could disrupt an existing session. + */ if (ret) - goto err; + goto out; break; case CORESIGHT_DEV_TYPE_SOURCE: /* sources are enabled from either sysFS or Perf */
Since the ETR could be driven either by SYSFS or by perf, it becomes complicated how we deal with the buffers used for each of these modes. The ETR driver cannot simply free the current attached buffer without knowing the provider (i.e, sysfs vs perf).
To solve this issue, we provide: 1) the driver-mode specific etr buffer to be retained in the drvdata 2) the etr_buf for a session should be passed on when enabling the hardware, which will be stored in drvdata->etr_buf. This will be replaced (not free'd) as soon as the hardware is disabled, after necessary sync operation.
The advantages of this are :
1) The common code path doesn't need to worry about how to dispose an existing buffer, if it is about to start a new session with a different buffer, possibly in a different mode. 2) The driver mode can control its buffers and can get access to the saved session even when the hardware is operating in a different mode. (e.g, we can still access a trace buffer from a sysfs mode even if the etr is now used in perf mode, without disrupting the current session.)
Towards this, we introduce a sysfs specific data which will hold the etr_buf used for sysfs mode of operation, controlled solely by the sysfs mode handling code.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-tmc-etr.c | 58 ++++++++++++++++--------- drivers/hwtracing/coresight/coresight-tmc.h | 2 + 2 files changed, 40 insertions(+), 20 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 2eda5de..edbfeaa 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -895,10 +895,15 @@ static void tmc_sync_etr_buf(struct tmc_drvdata *drvdata) tmc_etr_buf_insert_barrier_packet(etr_buf, etr_buf->offset); }
-static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata) +static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata, + struct etr_buf *etr_buf) { u32 axictl, sts; - struct etr_buf *etr_buf = drvdata->etr_buf; + + /* Callers should provide an appropriate buffer for use */ + if (WARN_ON(!etr_buf || drvdata->etr_buf)) + return; + drvdata->etr_buf = etr_buf;
/* * If this ETR is connected to a CATU, enable it before we turn @@ -960,13 +965,16 @@ static void tmc_etr_enable_hw(struct tmc_drvdata *drvdata) * also updating the @bufpp on where to find it. Since the trace data * starts at anywhere in the buffer, depending on the RRP, we adjust the * @len returned to handle buffer wrapping around. + * + * We are protected here by drvdata->reading != 0, which ensures the + * sysfs_buf stays alive. */ ssize_t tmc_etr_get_sysfs_trace(struct tmc_drvdata *drvdata, loff_t pos, size_t len, char **bufpp) { s64 offset; ssize_t actual = len; - struct etr_buf *etr_buf = drvdata->etr_buf; + struct etr_buf *etr_buf = drvdata->sysfs_buf;
if (pos + actual > etr_buf->len) actual = etr_buf->len - pos; @@ -996,7 +1004,14 @@ tmc_etr_free_sysfs_buf(struct etr_buf *buf)
static void tmc_etr_sync_sysfs_buf(struct tmc_drvdata *drvdata) { - tmc_sync_etr_buf(drvdata); + struct etr_buf *etr_buf = drvdata->etr_buf; + + if (WARN_ON(drvdata->sysfs_buf != etr_buf)) { + tmc_etr_free_sysfs_buf(drvdata->sysfs_buf); + drvdata->sysfs_buf = NULL; + } else { + tmc_sync_etr_buf(drvdata); + } }
static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata) @@ -1017,6 +1032,8 @@ static void tmc_etr_disable_hw(struct tmc_drvdata *drvdata)
/* Disable CATU device if this ETR is connected to one */ tmc_etr_disable_catu(drvdata); + /* Reset the ETR buf used by hardware */ + drvdata->etr_buf = NULL; }
static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) @@ -1024,7 +1041,7 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) int ret = 0; unsigned long flags; struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); - struct etr_buf *new_buf = NULL, *free_buf = NULL; + struct etr_buf *sysfs_buf = NULL, *new_buf = NULL, *free_buf = NULL;
/* * If we are enabling the ETR from disabled state, we need to make @@ -1035,7 +1052,8 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) * with the lock released. */ spin_lock_irqsave(&drvdata->spinlock, flags); - if (!drvdata->etr_buf || (drvdata->etr_buf->size != drvdata->size)) { + sysfs_buf = READ_ONCE(drvdata->sysfs_buf); + if (!sysfs_buf || (sysfs_buf->size != drvdata->size)) { spin_unlock_irqrestore(&drvdata->spinlock, flags);
/* Allocate memory with the locks released */ @@ -1064,14 +1082,14 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) * If we don't have a buffer or it doesn't match the requested size, * use the buffer allocated above. Otherwise reuse the existing buffer. */ - if (!drvdata->etr_buf || - (new_buf && drvdata->etr_buf->size != new_buf->size)) { - free_buf = drvdata->etr_buf; - drvdata->etr_buf = new_buf; + sysfs_buf = READ_ONCE(drvdata->etr_buf); + if (!sysfs_buf || (new_buf && sysfs_buf->size != new_buf->size)) { + free_buf = sysfs_buf; + drvdata->sysfs_buf = new_buf; }
drvdata->mode = CS_MODE_SYSFS; - tmc_etr_enable_hw(drvdata); + tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf); out: spin_unlock_irqrestore(&drvdata->spinlock, flags);
@@ -1156,13 +1174,13 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata) goto out; }
- /* If drvdata::etr_buf is NULL the trace data has been read already */ - if (drvdata->etr_buf == NULL) { + /* If sysfs_buf is NULL the trace data has been read already */ + if (!drvdata->sysfs_buf) { ret = -EINVAL; goto out; }
- /* Disable the TMC if need be */ + /* Disable the TMC if we are trying to read from a running session */ if (drvdata->mode == CS_MODE_SYSFS) tmc_etr_disable_hw(drvdata);
@@ -1176,7 +1194,7 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata) int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata) { unsigned long flags; - struct etr_buf *etr_buf = NULL; + struct etr_buf *sysfs_buf = NULL;
/* config types are set a boot time and never change */ if (WARN_ON_ONCE(drvdata->config_type != TMC_CONFIG_TYPE_ETR)) @@ -1191,22 +1209,22 @@ int tmc_read_unprepare_etr(struct tmc_drvdata *drvdata) * buffer. Since the tracer is still enabled drvdata::buf can't * be NULL. */ - tmc_etr_enable_hw(drvdata); + tmc_etr_enable_hw(drvdata, drvdata->sysfs_buf); } else { /* * The ETR is not tracing and the buffer was just read. * As such prepare to free the trace buffer. */ - etr_buf = drvdata->etr_buf; - drvdata->etr_buf = NULL; + sysfs_buf = drvdata->sysfs_buf; + drvdata->sysfs_buf = NULL; }
drvdata->reading = false; spin_unlock_irqrestore(&drvdata->spinlock, flags);
/* Free allocated memory out side of the spinlock */ - if (etr_buf) - tmc_free_etr_buf(etr_buf); + if (sysfs_buf) + tmc_etr_free_sysfs_buf(sysfs_buf);
return 0; } diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h index 7027bd6..872f63e 100644 --- a/drivers/hwtracing/coresight/coresight-tmc.h +++ b/drivers/hwtracing/coresight/coresight-tmc.h @@ -170,6 +170,7 @@ struct etr_buf { * @trigger_cntr: amount of words to store after a trigger. * @etr_caps: Bitmask of capabilities of the TMC ETR, inferred from the * device configuration register (DEVID) + * @sysfs_data: SYSFS buffer for ETR. */ struct tmc_drvdata { void __iomem *base; @@ -189,6 +190,7 @@ struct tmc_drvdata { enum tmc_mem_intf_width memwidth; u32 trigger_cntr; u32 etr_caps; + struct etr_buf *sysfs_buf; };
struct etr_buf_operations {
Since the ETR now uses mode specific buffers, we can reliably provide the trace data captured in sysfs mode, even when the ETR is operating in PERF mode.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-tmc-etr.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index edbfeaa..e54c8a4 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1168,19 +1168,17 @@ int tmc_read_prepare_etr(struct tmc_drvdata *drvdata) goto out; }
- /* Don't interfere if operated from Perf */ - if (drvdata->mode == CS_MODE_PERF) { - ret = -EINVAL; - goto out; - } - - /* If sysfs_buf is NULL the trace data has been read already */ + /* + * We can safely allow reads even if the ETR is operating in PERF mode, + * since the sysfs session is captured in mode specific data. + * If drvdata::sysfs_data is NULL the trace data has been read already. + */ if (!drvdata->sysfs_buf) { ret = -EINVAL; goto out; }
- /* Disable the TMC if we are trying to read from a running session */ + /* Disable the TMC if we are trying to read from a running session. */ if (drvdata->mode == CS_MODE_SYSFS) tmc_etr_disable_hw(drvdata);
Convert component enable/disable messages from dev_info to dev_dbg. When used with perf, the components in the paths are enabled/disabled during each schedule of the run, which can flood the dmesg with these messages. Moreover, they are only useful for debug purposes. So, convert such messages to dev_dbg() which can be turned on as needed.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-dynamic-replicator.c | 4 ++-- drivers/hwtracing/coresight/coresight-etb10.c | 6 +++--- drivers/hwtracing/coresight/coresight-etm3x.c | 4 ++-- drivers/hwtracing/coresight/coresight-etm4x.c | 4 ++-- drivers/hwtracing/coresight/coresight-funnel.c | 4 ++-- drivers/hwtracing/coresight/coresight-replicator.c | 4 ++-- drivers/hwtracing/coresight/coresight-stm.c | 4 ++-- drivers/hwtracing/coresight/coresight-tmc-etf.c | 8 ++++---- drivers/hwtracing/coresight/coresight-tmc-etr.c | 4 ++-- drivers/hwtracing/coresight/coresight-tmc.c | 4 ++-- drivers/hwtracing/coresight/coresight-tpiu.c | 4 ++-- 11 files changed, 25 insertions(+), 25 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-dynamic-replicator.c b/drivers/hwtracing/coresight/coresight-dynamic-replicator.c index f6d0571..ebb8043 100644 --- a/drivers/hwtracing/coresight/coresight-dynamic-replicator.c +++ b/drivers/hwtracing/coresight/coresight-dynamic-replicator.c @@ -56,7 +56,7 @@ static int replicator_enable(struct coresight_device *csdev, int inport,
CS_LOCK(drvdata->base);
- dev_info(drvdata->dev, "REPLICATOR enabled\n"); + dev_dbg(drvdata->dev, "REPLICATOR enabled\n"); return 0; }
@@ -75,7 +75,7 @@ static void replicator_disable(struct coresight_device *csdev, int inport,
CS_LOCK(drvdata->base);
- dev_info(drvdata->dev, "REPLICATOR disabled\n"); + dev_dbg(drvdata->dev, "REPLICATOR disabled\n"); }
static const struct coresight_ops_link replicator_link_ops = { diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 306119e..73878a6 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -156,7 +156,7 @@ static int etb_enable(struct coresight_device *csdev, u32 mode) spin_unlock_irqrestore(&drvdata->spinlock, flags);
out: - dev_info(drvdata->dev, "ETB enabled\n"); + dev_dbg(drvdata->dev, "ETB enabled\n"); return 0; }
@@ -262,7 +262,7 @@ static void etb_disable(struct coresight_device *csdev)
local_set(&drvdata->mode, CS_MODE_DISABLED);
- dev_info(drvdata->dev, "ETB disabled\n"); + dev_dbg(drvdata->dev, "ETB disabled\n"); }
static void *etb_alloc_buffer(struct coresight_device *csdev, int cpu, @@ -505,7 +505,7 @@ static void etb_dump(struct etb_drvdata *drvdata) } spin_unlock_irqrestore(&drvdata->spinlock, flags);
- dev_info(drvdata->dev, "ETB dumped\n"); + dev_dbg(drvdata->dev, "ETB dumped\n"); }
static int etb_open(struct inode *inode, struct file *file) diff --git a/drivers/hwtracing/coresight/coresight-etm3x.c b/drivers/hwtracing/coresight/coresight-etm3x.c index 7c74263..9ce8fba 100644 --- a/drivers/hwtracing/coresight/coresight-etm3x.c +++ b/drivers/hwtracing/coresight/coresight-etm3x.c @@ -501,7 +501,7 @@ static int etm_enable_sysfs(struct coresight_device *csdev) drvdata->sticky_enable = true; spin_unlock(&drvdata->spinlock);
- dev_info(drvdata->dev, "ETM tracing enabled\n"); + dev_dbg(drvdata->dev, "ETM tracing enabled\n"); return 0;
err: @@ -604,7 +604,7 @@ static void etm_disable_sysfs(struct coresight_device *csdev) spin_unlock(&drvdata->spinlock); cpus_read_unlock();
- dev_info(drvdata->dev, "ETM tracing disabled\n"); + dev_dbg(drvdata->dev, "ETM tracing disabled\n"); }
static void etm_disable(struct coresight_device *csdev, diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c index 1d94ebe..c1dcc7c 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x.c +++ b/drivers/hwtracing/coresight/coresight-etm4x.c @@ -267,7 +267,7 @@ static int etm4_enable_sysfs(struct coresight_device *csdev) drvdata->sticky_enable = true; spin_unlock(&drvdata->spinlock);
- dev_info(drvdata->dev, "ETM tracing enabled\n"); + dev_dbg(drvdata->dev, "ETM tracing enabled\n"); return 0;
err: @@ -380,7 +380,7 @@ static void etm4_disable_sysfs(struct coresight_device *csdev) spin_unlock(&drvdata->spinlock); cpus_read_unlock();
- dev_info(drvdata->dev, "ETM tracing disabled\n"); + dev_dbg(drvdata->dev, "ETM tracing disabled\n"); }
static void etm4_disable(struct coresight_device *csdev, diff --git a/drivers/hwtracing/coresight/coresight-funnel.c b/drivers/hwtracing/coresight/coresight-funnel.c index 448145a..ee7a30b 100644 --- a/drivers/hwtracing/coresight/coresight-funnel.c +++ b/drivers/hwtracing/coresight/coresight-funnel.c @@ -65,7 +65,7 @@ static int funnel_enable(struct coresight_device *csdev, int inport,
funnel_enable_hw(drvdata, inport);
- dev_info(drvdata->dev, "FUNNEL inport %d enabled\n", inport); + dev_dbg(drvdata->dev, "FUNNEL inport %d enabled\n", inport); return 0; }
@@ -89,7 +89,7 @@ static void funnel_disable(struct coresight_device *csdev, int inport,
funnel_disable_hw(drvdata, inport);
- dev_info(drvdata->dev, "FUNNEL inport %d disabled\n", inport); + dev_dbg(drvdata->dev, "FUNNEL inport %d disabled\n", inport); }
static const struct coresight_ops_link funnel_link_ops = { diff --git a/drivers/hwtracing/coresight/coresight-replicator.c b/drivers/hwtracing/coresight/coresight-replicator.c index 8d2eaaa..feac983 100644 --- a/drivers/hwtracing/coresight/coresight-replicator.c +++ b/drivers/hwtracing/coresight/coresight-replicator.c @@ -35,7 +35,7 @@ static int replicator_enable(struct coresight_device *csdev, int inport, { struct replicator_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
- dev_info(drvdata->dev, "REPLICATOR enabled\n"); + dev_dbg(drvdata->dev, "REPLICATOR enabled\n"); return 0; }
@@ -44,7 +44,7 @@ static void replicator_disable(struct coresight_device *csdev, int inport, { struct replicator_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
- dev_info(drvdata->dev, "REPLICATOR disabled\n"); + dev_dbg(drvdata->dev, "REPLICATOR disabled\n"); }
static const struct coresight_ops_link replicator_link_ops = { diff --git a/drivers/hwtracing/coresight/coresight-stm.c b/drivers/hwtracing/coresight/coresight-stm.c index c46c70a..35d6f97 100644 --- a/drivers/hwtracing/coresight/coresight-stm.c +++ b/drivers/hwtracing/coresight/coresight-stm.c @@ -211,7 +211,7 @@ static int stm_enable(struct coresight_device *csdev, stm_enable_hw(drvdata); spin_unlock(&drvdata->spinlock);
- dev_info(drvdata->dev, "STM tracing enabled\n"); + dev_dbg(drvdata->dev, "STM tracing enabled\n"); return 0; }
@@ -274,7 +274,7 @@ static void stm_disable(struct coresight_device *csdev, pm_runtime_put(drvdata->dev);
local_set(&drvdata->mode, CS_MODE_DISABLED); - dev_info(drvdata->dev, "STM tracing disabled\n"); + dev_dbg(drvdata->dev, "STM tracing disabled\n"); } }
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 0549249..434003a 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -233,7 +233,7 @@ static int tmc_enable_etf_sink(struct coresight_device *csdev, u32 mode) if (ret) return ret;
- dev_info(drvdata->dev, "TMC-ETB/ETF enabled\n"); + dev_dbg(drvdata->dev, "TMC-ETB/ETF enabled\n"); return 0; }
@@ -256,7 +256,7 @@ static void tmc_disable_etf_sink(struct coresight_device *csdev)
spin_unlock_irqrestore(&drvdata->spinlock, flags);
- dev_info(drvdata->dev, "TMC-ETB/ETF disabled\n"); + dev_dbg(drvdata->dev, "TMC-ETB/ETF disabled\n"); }
static int tmc_enable_etf_link(struct coresight_device *csdev, @@ -275,7 +275,7 @@ static int tmc_enable_etf_link(struct coresight_device *csdev, drvdata->mode = CS_MODE_SYSFS; spin_unlock_irqrestore(&drvdata->spinlock, flags);
- dev_info(drvdata->dev, "TMC-ETF enabled\n"); + dev_dbg(drvdata->dev, "TMC-ETF enabled\n"); return 0; }
@@ -295,7 +295,7 @@ static void tmc_disable_etf_link(struct coresight_device *csdev, drvdata->mode = CS_MODE_DISABLED; spin_unlock_irqrestore(&drvdata->spinlock, flags);
- dev_info(drvdata->dev, "TMC-ETF disabled\n"); + dev_dbg(drvdata->dev, "TMC-ETF disabled\n"); }
static void *tmc_alloc_etf_buffer(struct coresight_device *csdev, int cpu, diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index e54c8a4..20a0beb 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -1098,7 +1098,7 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) tmc_etr_free_sysfs_buf(free_buf);
if (!ret) - dev_info(drvdata->dev, "TMC-ETR enabled\n"); + dev_dbg(drvdata->dev, "TMC-ETR enabled\n");
return ret; } @@ -1141,7 +1141,7 @@ static void tmc_disable_etr_sink(struct coresight_device *csdev)
spin_unlock_irqrestore(&drvdata->spinlock, flags);
- dev_info(drvdata->dev, "TMC-ETR disabled\n"); + dev_dbg(drvdata->dev, "TMC-ETR disabled\n"); }
static const struct coresight_ops_sink tmc_etr_sink_ops = { diff --git a/drivers/hwtracing/coresight/coresight-tmc.c b/drivers/hwtracing/coresight/coresight-tmc.c index 1b817ec..ea249f0 100644 --- a/drivers/hwtracing/coresight/coresight-tmc.c +++ b/drivers/hwtracing/coresight/coresight-tmc.c @@ -81,7 +81,7 @@ static int tmc_read_prepare(struct tmc_drvdata *drvdata) }
if (!ret) - dev_info(drvdata->dev, "TMC read start\n"); + dev_dbg(drvdata->dev, "TMC read start\n");
return ret; } @@ -103,7 +103,7 @@ static int tmc_read_unprepare(struct tmc_drvdata *drvdata) }
if (!ret) - dev_info(drvdata->dev, "TMC read end\n"); + dev_dbg(drvdata->dev, "TMC read end\n");
return ret; } diff --git a/drivers/hwtracing/coresight/coresight-tpiu.c b/drivers/hwtracing/coresight/coresight-tpiu.c index 01b7457..7daeb084 100644 --- a/drivers/hwtracing/coresight/coresight-tpiu.c +++ b/drivers/hwtracing/coresight/coresight-tpiu.c @@ -73,7 +73,7 @@ static int tpiu_enable(struct coresight_device *csdev, u32 mode)
tpiu_enable_hw(drvdata);
- dev_info(drvdata->dev, "TPIU enabled\n"); + dev_dbg(drvdata->dev, "TPIU enabled\n"); return 0; }
@@ -99,7 +99,7 @@ static void tpiu_disable(struct coresight_device *csdev)
tpiu_disable_hw(drvdata);
- dev_info(drvdata->dev, "TPIU disabled\n"); + dev_dbg(drvdata->dev, "TPIU disabled\n"); }
static const struct coresight_ops_sink tpiu_sink_ops = {
Right now we issue an update_buffer() and reset_buffer() call backs in succession when we stop tracing an event. The update_buffer is supposed to check the status of the buffer and make sure the ring buffer is updated with the trace data. And we store information about the size of the data collected only to be consumed by the reset_buffer callback which always follows the update_buffer. This was originally designed for handling future IPs which could trigger a buffer overflow interrupt. This patch gets rid of the reset_buffer callback altogether and performs the actions in update_buffer, making it return the size collected. We can always add the support for handling the overflow interrupt case later.
This removes some not-so pretty hack (storing the new head in the size field for snapshot mode) and cleans it up a little bit.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-etb10.c | 56 +++++------------------ drivers/hwtracing/coresight/coresight-etm-perf.c | 9 +--- drivers/hwtracing/coresight/coresight-tmc-etf.c | 58 +++++------------------- include/linux/coresight.h | 5 +- 4 files changed, 26 insertions(+), 102 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-etb10.c b/drivers/hwtracing/coresight/coresight-etb10.c index 73878a6..a729a7b 100644 --- a/drivers/hwtracing/coresight/coresight-etb10.c +++ b/drivers/hwtracing/coresight/coresight-etb10.c @@ -315,37 +315,7 @@ static int etb_set_buffer(struct coresight_device *csdev, return ret; }
-static unsigned long etb_reset_buffer(struct coresight_device *csdev, - struct perf_output_handle *handle, - void *sink_config) -{ - unsigned long size = 0; - struct cs_buffers *buf = sink_config; - - if (buf) { - /* - * In snapshot mode ->data_size holds the new address of the - * ring buffer's head. The size itself is the whole address - * range since we want the latest information. - */ - if (buf->snapshot) - handle->head = local_xchg(&buf->data_size, - buf->nr_pages << PAGE_SHIFT); - - /* - * Tell the tracer PMU how much we got in this run and if - * something went wrong along the way. Nobody else can use - * this cs_buffers instance until we are done. As such - * resetting parameters here and squaring off with the ring - * buffer API in the tracer PMU is fine. - */ - size = local_xchg(&buf->data_size, 0); - } - - return size; -} - -static void etb_update_buffer(struct coresight_device *csdev, +static unsigned long etb_update_buffer(struct coresight_device *csdev, struct perf_output_handle *handle, void *sink_config) { @@ -354,13 +324,13 @@ static void etb_update_buffer(struct coresight_device *csdev, u8 *buf_ptr; const u32 *barrier; u32 read_ptr, write_ptr, capacity; - u32 status, read_data, to_read; - unsigned long offset; + u32 status, read_data; + unsigned long offset, to_read; struct cs_buffers *buf = sink_config; struct etb_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
if (!buf) - return; + return 0;
capacity = drvdata->buffer_depth * ETB_FRAME_SIZE_WORDS;
@@ -465,18 +435,17 @@ static void etb_update_buffer(struct coresight_device *csdev, writel_relaxed(0x0, drvdata->base + ETB_RAM_WRITE_POINTER);
/* - * In snapshot mode all we have to do is communicate to - * perf_aux_output_end() the address of the current head. In full - * trace mode the same function expects a size to move rb->aux_head - * forward. + * In snapshot mode we have to update the handle->head to point + * to the new location. */ - if (buf->snapshot) - local_set(&buf->data_size, (cur * PAGE_SIZE) + offset); - else - local_add(to_read, &buf->data_size); - + if (buf->snapshot) { + handle->head = (cur * PAGE_SIZE) + offset; + to_read = buf->nr_pages << PAGE_SHIFT; + } etb_enable_hw(drvdata); CS_LOCK(drvdata->base); + + return to_read; }
static const struct coresight_ops_sink etb_sink_ops = { @@ -485,7 +454,6 @@ static const struct coresight_ops_sink etb_sink_ops = { .alloc_buffer = etb_alloc_buffer, .free_buffer = etb_free_buffer, .set_buffer = etb_set_buffer, - .reset_buffer = etb_reset_buffer, .update_buffer = etb_update_buffer, };
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c index 6776956..c305228 100644 --- a/drivers/hwtracing/coresight/coresight-etm-perf.c +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c @@ -331,15 +331,8 @@ static void etm_event_stop(struct perf_event *event, int mode) if (!sink_ops(sink)->update_buffer) return;
- sink_ops(sink)->update_buffer(sink, handle, + size = sink_ops(sink)->update_buffer(sink, handle, event_data->snk_config); - - if (!sink_ops(sink)->reset_buffer) - return; - - size = sink_ops(sink)->reset_buffer(sink, handle, - event_data->snk_config); - perf_aux_output_end(handle, size); }
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c index 434003a..31a98f9 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etf.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c @@ -349,36 +349,7 @@ static int tmc_set_etf_buffer(struct coresight_device *csdev, return ret; }
-static unsigned long tmc_reset_etf_buffer(struct coresight_device *csdev, - struct perf_output_handle *handle, - void *sink_config) -{ - long size = 0; - struct cs_buffers *buf = sink_config; - - if (buf) { - /* - * In snapshot mode ->data_size holds the new address of the - * ring buffer's head. The size itself is the whole address - * range since we want the latest information. - */ - if (buf->snapshot) - handle->head = local_xchg(&buf->data_size, - buf->nr_pages << PAGE_SHIFT); - /* - * Tell the tracer PMU how much we got in this run and if - * something went wrong along the way. Nobody else can use - * this cs_buffers instance until we are done. As such - * resetting parameters here and squaring off with the ring - * buffer API in the tracer PMU is fine. - */ - size = local_xchg(&buf->data_size, 0); - } - - return size; -} - -static void tmc_update_etf_buffer(struct coresight_device *csdev, +static unsigned long tmc_update_etf_buffer(struct coresight_device *csdev, struct perf_output_handle *handle, void *sink_config) { @@ -387,17 +358,17 @@ static void tmc_update_etf_buffer(struct coresight_device *csdev, const u32 *barrier; u32 *buf_ptr; u64 read_ptr, write_ptr; - u32 status, to_read; - unsigned long offset; + u32 status; + unsigned long offset, to_read; struct cs_buffers *buf = sink_config; struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
if (!buf) - return; + return 0;
/* This shouldn't happen */ if (WARN_ON_ONCE(drvdata->mode != CS_MODE_PERF)) - return; + return 0;
CS_UNLOCK(drvdata->base);
@@ -486,18 +457,14 @@ static void tmc_update_etf_buffer(struct coresight_device *csdev, } }
- /* - * In snapshot mode all we have to do is communicate to - * perf_aux_output_end() the address of the current head. In full - * trace mode the same function expects a size to move rb->aux_head - * forward. - */ - if (buf->snapshot) - local_set(&buf->data_size, (cur * PAGE_SIZE) + offset); - else - local_add(to_read, &buf->data_size); - + /* In snapshot mode we have to update the head */ + if (buf->snapshot) { + handle->head = (cur * PAGE_SIZE) + offset; + to_read = buf->nr_pages << PAGE_SHIFT; + } CS_LOCK(drvdata->base); + + return to_read; }
static const struct coresight_ops_sink tmc_etf_sink_ops = { @@ -506,7 +473,6 @@ static const struct coresight_ops_sink tmc_etf_sink_ops = { .alloc_buffer = tmc_alloc_etf_buffer, .free_buffer = tmc_free_etf_buffer, .set_buffer = tmc_set_etf_buffer, - .reset_buffer = tmc_reset_etf_buffer, .update_buffer = tmc_update_etf_buffer, };
diff --git a/include/linux/coresight.h b/include/linux/coresight.h index d828a6e..47bb0ba 100644 --- a/include/linux/coresight.h +++ b/include/linux/coresight.h @@ -203,10 +203,7 @@ struct coresight_ops_sink { int (*set_buffer)(struct coresight_device *csdev, struct perf_output_handle *handle, void *sink_config); - unsigned long (*reset_buffer)(struct coresight_device *csdev, - struct perf_output_handle *handle, - void *sink_config); - void (*update_buffer)(struct coresight_device *csdev, + unsigned long (*update_buffer)(struct coresight_device *csdev, struct perf_output_handle *handle, void *sink_config); };
Add support for using TMC-ETR as backend for ETM perf tracing. We use software double buffering at the moment. i.e, the TMC-ETR uses a separate buffer than the perf ring buffer. The data is copied to the perf ring buffer once a session completes.
The TMC-ETR would try to match the larger of perf ring buffer or the ETR buffer size configured via sysfs, scaling down to a minimum limit of 1MB.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com --- drivers/hwtracing/coresight/coresight-tmc-etr.c | 265 +++++++++++++++++++++++- drivers/hwtracing/coresight/coresight-tmc.h | 2 + 2 files changed, 265 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 20a0beb..af921da 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -21,6 +21,28 @@ struct etr_flat_buf { };
/* + * etr_perf_buffer - Perf buffer used for ETR + * @etr_buf - Actual buffer used by the ETR + * @snaphost - Perf session mode + * @head - handle->head at the beginning of the session. + * @nr_pages - Number of pages in the ring buffer. + * @pages - Array of Pages in the ring buffer. + */ +struct etr_perf_buffer { + struct etr_buf *etr_buf; + bool snapshot; + unsigned long head; + int nr_pages; + void **pages; +}; + +/* Convert the perf index to an offset within the ETR buffer */ +#define PERF_IDX2OFF(idx, buf) ((idx) % ((buf)->nr_pages << PAGE_SHIFT)) + +/* Lower limit for ETR hardware buffer */ +#define TMC_ETR_PERF_MIN_BUF_SIZE SZ_1M + +/* * The TMC ETR SG has a page size of 4K. The SG table contains pointers * to 4KB buffers. However, the OS may use a PAGE_SIZE different from * 4K (i.e, 16KB or 64KB). This implies that a single OS page could @@ -1103,10 +1125,245 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) return ret; }
+/* + * tmc_etr_setup_perf_buf: Allocate ETR buffer for use by perf. + * The size of the hardware buffer is dependent on the size configured + * via sysfs and the perf ring buffer size. We prefer to allocate the + * largest possible size, scaling down the size by half until it + * reaches a minimum limit (1M), beyond which we give up. + */ +static struct etr_perf_buffer * +tmc_etr_setup_perf_buf(struct tmc_drvdata *drvdata, int node, int nr_pages, + void **pages, bool snapshot) +{ + struct etr_buf *etr_buf; + struct etr_perf_buffer *etr_perf; + unsigned long size; + + etr_perf = kzalloc_node(sizeof(*etr_perf), GFP_KERNEL, node); + if (!etr_perf) + return ERR_PTR(-ENOMEM); + + /* + * Try to match the perf ring buffer size if it is larger + * than the size requested via sysfs. + */ + if ((nr_pages << PAGE_SHIFT) > drvdata->size) { + etr_buf = tmc_alloc_etr_buf(drvdata, (nr_pages << PAGE_SHIFT), + 0, node, NULL); + if (!IS_ERR(etr_buf)) + goto done; + } + + /* + * Else switch to configured size for this ETR + * and scale down until we hit the minimum limit. + */ + size = drvdata->size; + do { + etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL); + if (!IS_ERR(etr_buf)) + goto done; + size /= 2; + } while (size >= TMC_ETR_PERF_MIN_BUF_SIZE); + + kfree(etr_perf); + return ERR_PTR(-ENOMEM); + +done: + etr_perf->etr_buf = etr_buf; + return etr_perf; +} + + +static void *tmc_etr_alloc_perf_buffer(struct coresight_device *csdev, + int cpu, void **pages, int nr_pages, + bool snapshot) +{ + struct etr_perf_buffer *etr_perf; + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + + if (cpu == -1) + cpu = smp_processor_id(); + + etr_perf = tmc_etr_setup_perf_buf(drvdata, cpu_to_node(cpu), + nr_pages, pages, snapshot); + if (IS_ERR(etr_perf)) { + dev_dbg(drvdata->dev, "Unable to allocate ETR buffer\n"); + return NULL; + } + + etr_perf->snapshot = snapshot; + etr_perf->nr_pages = nr_pages; + etr_perf->pages = pages; + + return etr_perf; +} + +static void tmc_etr_free_perf_buffer(void *config) +{ + struct etr_perf_buffer *etr_perf = config; + + if (etr_perf->etr_buf) + tmc_free_etr_buf(etr_perf->etr_buf); + kfree(etr_perf); +} + +static int tmc_etr_set_perf_buffer(struct coresight_device *csdev, + struct perf_output_handle *handle, + void *config) +{ + int rc = 0; + unsigned long flags; + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + struct etr_perf_buffer *etr_perf = config; + + etr_perf->head = handle->head; + spin_lock_irqsave(&drvdata->spinlock, flags); + if (WARN_ON(drvdata->perf_data)) + rc = -EBUSY; + else + drvdata->perf_data = etr_perf; + spin_unlock_irqrestore(&drvdata->spinlock, flags); + return rc; +} + +/* + * tmc_etr_sync_perf_buffer: Copy the actual trace data from the hardware + * buffer to the perf ring buffer. + */ +static void tmc_etr_sync_perf_buffer(struct etr_perf_buffer *etr_perf) +{ + long bytes, to_copy; + long pg_idx, pg_offset, src_offset; + unsigned long head = etr_perf->head; + char **dst_pages, *src_buf; + struct etr_buf *etr_buf = etr_perf->etr_buf; + + head = PERF_IDX2OFF(etr_perf->head, etr_perf); + pg_idx = head >> PAGE_SHIFT; + pg_offset = head & (PAGE_SIZE - 1); + dst_pages = (char **)etr_perf->pages; + src_offset = etr_buf->offset; + to_copy = etr_buf->len; + + while (to_copy > 0) { + /* + * We can copy minimum of : + * 1) what is available in the source buffer, + * 2) what is available in the source buffer, before it + * wraps around. + * 3) what is available in the destination page. + * in one iteration. + */ + bytes = tmc_etr_buf_get_data(etr_buf, src_offset, to_copy, + &src_buf); + if (WARN_ON_ONCE(bytes <= 0)) + break; + bytes = min(bytes, (long)(PAGE_SIZE - pg_offset)); + + memcpy(dst_pages[pg_idx] + pg_offset, src_buf, bytes); + + to_copy -= bytes; + + /* Move destination pointers */ + pg_offset += bytes; + if (pg_offset == PAGE_SIZE) { + pg_offset = 0; + if (++pg_idx == etr_perf->nr_pages) + pg_idx = 0; + } + + /* Move source pointers */ + src_offset += bytes; + if (src_offset >= etr_buf->size) + src_offset -= etr_buf->size; + } +} + +/* + * tmc_etr_update_perf_buffer : Update the perf ring buffer with the + * available trace data. We use software double buffering at the moment. + * + * TODO: Add support for reusing the perf ring buffer. + */ +static unsigned long +tmc_etr_update_perf_buffer(struct coresight_device *csdev, + struct perf_output_handle *handle, + void *config) +{ + bool lost = false; + unsigned long flags, size = 0; + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + struct etr_perf_buffer *etr_perf = config; + struct etr_buf *etr_buf = etr_perf->etr_buf; + + spin_lock_irqsave(&drvdata->spinlock, flags); + if (WARN_ON(drvdata->perf_data != etr_perf)) { + lost = true; + spin_unlock_irqrestore(&drvdata->spinlock, flags); + goto out; + } + + CS_UNLOCK(drvdata->base); + + tmc_flush_and_stop(drvdata); + tmc_sync_etr_buf(drvdata); + + CS_UNLOCK(drvdata->base); + /* Reset perf specific data */ + drvdata->perf_data = NULL; + spin_unlock_irqrestore(&drvdata->spinlock, flags); + + size = etr_buf->len; + tmc_etr_sync_perf_buffer(etr_perf); + + /* + * Update handle->head in snapshot mode. Also update the size to the + * hardware buffer size if there was an overflow. + */ + if (etr_perf->snapshot) { + handle->head += size; + if (etr_buf->full) + size = etr_buf->size; + } + + lost |= etr_buf->full; +out: + if (lost) + perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED); + return size; +} + static int tmc_enable_etr_sink_perf(struct coresight_device *csdev) { - /* We don't support perf mode yet ! */ - return -EINVAL; + int rc = 0; + unsigned long flags; + struct etr_perf_buffer *etr_perf; + struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent); + + spin_lock_irqsave(&drvdata->spinlock, flags); + /* + * There can be only one writer per sink in perf mode. If the sink + * is already open in SYSFS mode, we can't use it. + */ + if (drvdata->mode != CS_MODE_DISABLED) { + rc = -EBUSY; + goto unlock_out; + } + + etr_perf = drvdata->perf_data; + if (WARN_ON(!etr_perf || !etr_perf->etr_buf)) { + rc = -EINVAL; + goto unlock_out; + } + + drvdata->mode = CS_MODE_PERF; + tmc_etr_enable_hw(drvdata, etr_perf->etr_buf); + +unlock_out: + spin_unlock_irqrestore(&drvdata->spinlock, flags); + return rc; }
static int tmc_enable_etr_sink(struct coresight_device *csdev, u32 mode) @@ -1147,6 +1404,10 @@ static void tmc_disable_etr_sink(struct coresight_device *csdev) static const struct coresight_ops_sink tmc_etr_sink_ops = { .enable = tmc_enable_etr_sink, .disable = tmc_disable_etr_sink, + .alloc_buffer = tmc_etr_alloc_perf_buffer, + .update_buffer = tmc_etr_update_perf_buffer, + .set_buffer = tmc_etr_set_perf_buffer, + .free_buffer = tmc_etr_free_perf_buffer, };
const struct coresight_ops tmc_etr_cs_ops = { diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h index 872f63e..487c537 100644 --- a/drivers/hwtracing/coresight/coresight-tmc.h +++ b/drivers/hwtracing/coresight/coresight-tmc.h @@ -170,6 +170,7 @@ struct etr_buf { * @trigger_cntr: amount of words to store after a trigger. * @etr_caps: Bitmask of capabilities of the TMC ETR, inferred from the * device configuration register (DEVID) + * @perf_data: PERF buffer for ETR. * @sysfs_data: SYSFS buffer for ETR. */ struct tmc_drvdata { @@ -191,6 +192,7 @@ struct tmc_drvdata { u32 trigger_cntr; u32 etr_caps; struct etr_buf *sysfs_buf; + void *perf_data; };
struct etr_buf_operations {
Hi Suzuki,
On Wed, Jul 11, 2018 at 03:16:39PM +0100, Suzuki K Poulose wrote:
Add support for using TMC-ETR as backend for ETM perf tracing. We use software double buffering at the moment. i.e, the TMC-ETR uses a separate buffer than the perf ring buffer. The data is copied to the perf ring buffer once a session completes.
The TMC-ETR would try to match the larger of perf ring buffer or the ETR buffer size configured via sysfs, scaling down to a minimum limit of 1MB.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
drivers/hwtracing/coresight/coresight-tmc-etr.c | 265 +++++++++++++++++++++++- drivers/hwtracing/coresight/coresight-tmc.h | 2 + 2 files changed, 265 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c index 20a0beb..af921da 100644 --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c @@ -21,6 +21,28 @@ struct etr_flat_buf { }; /*
- etr_perf_buffer - Perf buffer used for ETR
- @etr_buf - Actual buffer used by the ETR
- @snaphost - Perf session mode
- @head - handle->head at the beginning of the session.
- @nr_pages - Number of pages in the ring buffer.
- @pages - Array of Pages in the ring buffer.
- */
+struct etr_perf_buffer {
- struct etr_buf *etr_buf;
- bool snapshot;
- unsigned long head;
- int nr_pages;
- void **pages;
+};
+/* Convert the perf index to an offset within the ETR buffer */ +#define PERF_IDX2OFF(idx, buf) ((idx) % ((buf)->nr_pages << PAGE_SHIFT))
+/* Lower limit for ETR hardware buffer */ +#define TMC_ETR_PERF_MIN_BUF_SIZE SZ_1M
+/*
- The TMC ETR SG has a page size of 4K. The SG table contains pointers
- to 4KB buffers. However, the OS may use a PAGE_SIZE different from
- 4K (i.e, 16KB or 64KB). This implies that a single OS page could
@@ -1103,10 +1125,245 @@ static int tmc_enable_etr_sink_sysfs(struct coresight_device *csdev) return ret; } +/*
- tmc_etr_setup_perf_buf: Allocate ETR buffer for use by perf.
- The size of the hardware buffer is dependent on the size configured
- via sysfs and the perf ring buffer size. We prefer to allocate the
- largest possible size, scaling down the size by half until it
- reaches a minimum limit (1M), beyond which we give up.
- */
+static struct etr_perf_buffer * +tmc_etr_setup_perf_buf(struct tmc_drvdata *drvdata, int node, int nr_pages,
void **pages, bool snapshot)
+{
- struct etr_buf *etr_buf;
- struct etr_perf_buffer *etr_perf;
- unsigned long size;
- etr_perf = kzalloc_node(sizeof(*etr_perf), GFP_KERNEL, node);
- if (!etr_perf)
return ERR_PTR(-ENOMEM);
- /*
* Try to match the perf ring buffer size if it is larger
* than the size requested via sysfs.
*/
- if ((nr_pages << PAGE_SHIFT) > drvdata->size) {
etr_buf = tmc_alloc_etr_buf(drvdata, (nr_pages << PAGE_SHIFT),
0, node, NULL);
if (!IS_ERR(etr_buf))
goto done;
- }
- /*
* Else switch to configured size for this ETR
* and scale down until we hit the minimum limit.
*/
- size = drvdata->size;
- do {
etr_buf = tmc_alloc_etr_buf(drvdata, size, 0, node, NULL);
if (!IS_ERR(etr_buf))
goto done;
size /= 2;
- } while (size >= TMC_ETR_PERF_MIN_BUF_SIZE);
- kfree(etr_perf);
- return ERR_PTR(-ENOMEM);
+done:
- etr_perf->etr_buf = etr_buf;
- return etr_perf;
+}
+static void *tmc_etr_alloc_perf_buffer(struct coresight_device *csdev,
int cpu, void **pages, int nr_pages,
bool snapshot)
+{
- struct etr_perf_buffer *etr_perf;
- struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
- if (cpu == -1)
cpu = smp_processor_id();
- etr_perf = tmc_etr_setup_perf_buf(drvdata, cpu_to_node(cpu),
nr_pages, pages, snapshot);
- if (IS_ERR(etr_perf)) {
dev_dbg(drvdata->dev, "Unable to allocate ETR buffer\n");
return NULL;
- }
- etr_perf->snapshot = snapshot;
- etr_perf->nr_pages = nr_pages;
- etr_perf->pages = pages;
- return etr_perf;
+}
+static void tmc_etr_free_perf_buffer(void *config) +{
- struct etr_perf_buffer *etr_perf = config;
- if (etr_perf->etr_buf)
tmc_free_etr_buf(etr_perf->etr_buf);
- kfree(etr_perf);
+}
+static int tmc_etr_set_perf_buffer(struct coresight_device *csdev,
struct perf_output_handle *handle,
void *config)
+{
- int rc = 0;
- unsigned long flags;
- struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
- struct etr_perf_buffer *etr_perf = config;
- etr_perf->head = handle->head;
- spin_lock_irqsave(&drvdata->spinlock, flags);
- if (WARN_ON(drvdata->perf_data))
rc = -EBUSY;
- else
drvdata->perf_data = etr_perf;
- spin_unlock_irqrestore(&drvdata->spinlock, flags);
- return rc;
+}
+/*
- tmc_etr_sync_perf_buffer: Copy the actual trace data from the hardware
- buffer to the perf ring buffer.
- */
+static void tmc_etr_sync_perf_buffer(struct etr_perf_buffer *etr_perf) +{
- long bytes, to_copy;
- long pg_idx, pg_offset, src_offset;
- unsigned long head = etr_perf->head;
- char **dst_pages, *src_buf;
- struct etr_buf *etr_buf = etr_perf->etr_buf;
- head = PERF_IDX2OFF(etr_perf->head, etr_perf);
I'm puzzled here. Since etr_perf->head is set in tmc_etr_set_perf_buffer() to the value of handle->head, its value is always within the range of the perf ring buffer. In tmc_etr_alloc_perf_buffer() etr_perf->nr_pages is set to the number of pages present in that ring buffer. As such I'm not sure as to why we need the PERF_IDX2OFF() macro.
It seems to me that "head = etr_perf->head;" above is sufficient.
- pg_idx = head >> PAGE_SHIFT;
- pg_offset = head & (PAGE_SIZE - 1);
- dst_pages = (char **)etr_perf->pages;
- src_offset = etr_buf->offset;
- to_copy = etr_buf->len;
- while (to_copy > 0) {
/*
* We can copy minimum of :
s/We can copy minimum of :/In one iteration we can copy a minimum of:/
If I'm wrong about the PERF_IDX2OFF(), don't bother respinning just for that.
Thanks, Mathieu
* 1) what is available in the source buffer,
* 2) what is available in the source buffer, before it
* wraps around.
* 3) what is available in the destination page.
* in one iteration.
*/
bytes = tmc_etr_buf_get_data(etr_buf, src_offset, to_copy,
&src_buf);
if (WARN_ON_ONCE(bytes <= 0))
break;
bytes = min(bytes, (long)(PAGE_SIZE - pg_offset));
memcpy(dst_pages[pg_idx] + pg_offset, src_buf, bytes);
to_copy -= bytes;
/* Move destination pointers */
pg_offset += bytes;
if (pg_offset == PAGE_SIZE) {
pg_offset = 0;
if (++pg_idx == etr_perf->nr_pages)
pg_idx = 0;
}
/* Move source pointers */
src_offset += bytes;
if (src_offset >= etr_buf->size)
src_offset -= etr_buf->size;
- }
+}
+/*
- tmc_etr_update_perf_buffer : Update the perf ring buffer with the
- available trace data. We use software double buffering at the moment.
- TODO: Add support for reusing the perf ring buffer.
- */
+static unsigned long +tmc_etr_update_perf_buffer(struct coresight_device *csdev,
struct perf_output_handle *handle,
void *config)
+{
- bool lost = false;
- unsigned long flags, size = 0;
- struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
- struct etr_perf_buffer *etr_perf = config;
- struct etr_buf *etr_buf = etr_perf->etr_buf;
- spin_lock_irqsave(&drvdata->spinlock, flags);
- if (WARN_ON(drvdata->perf_data != etr_perf)) {
lost = true;
spin_unlock_irqrestore(&drvdata->spinlock, flags);
goto out;
- }
- CS_UNLOCK(drvdata->base);
- tmc_flush_and_stop(drvdata);
- tmc_sync_etr_buf(drvdata);
- CS_UNLOCK(drvdata->base);
- /* Reset perf specific data */
- drvdata->perf_data = NULL;
- spin_unlock_irqrestore(&drvdata->spinlock, flags);
- size = etr_buf->len;
- tmc_etr_sync_perf_buffer(etr_perf);
- /*
* Update handle->head in snapshot mode. Also update the size to the
* hardware buffer size if there was an overflow.
*/
- if (etr_perf->snapshot) {
handle->head += size;
if (etr_buf->full)
size = etr_buf->size;
- }
- lost |= etr_buf->full;
+out:
- if (lost)
perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
- return size;
+}
static int tmc_enable_etr_sink_perf(struct coresight_device *csdev) {
- /* We don't support perf mode yet ! */
- return -EINVAL;
- int rc = 0;
- unsigned long flags;
- struct etr_perf_buffer *etr_perf;
- struct tmc_drvdata *drvdata = dev_get_drvdata(csdev->dev.parent);
- spin_lock_irqsave(&drvdata->spinlock, flags);
- /*
* There can be only one writer per sink in perf mode. If the sink
* is already open in SYSFS mode, we can't use it.
*/
- if (drvdata->mode != CS_MODE_DISABLED) {
rc = -EBUSY;
goto unlock_out;
- }
- etr_perf = drvdata->perf_data;
- if (WARN_ON(!etr_perf || !etr_perf->etr_buf)) {
rc = -EINVAL;
goto unlock_out;
- }
- drvdata->mode = CS_MODE_PERF;
- tmc_etr_enable_hw(drvdata, etr_perf->etr_buf);
+unlock_out:
- spin_unlock_irqrestore(&drvdata->spinlock, flags);
- return rc;
} static int tmc_enable_etr_sink(struct coresight_device *csdev, u32 mode) @@ -1147,6 +1404,10 @@ static void tmc_disable_etr_sink(struct coresight_device *csdev) static const struct coresight_ops_sink tmc_etr_sink_ops = { .enable = tmc_enable_etr_sink, .disable = tmc_disable_etr_sink,
- .alloc_buffer = tmc_etr_alloc_perf_buffer,
- .update_buffer = tmc_etr_update_perf_buffer,
- .set_buffer = tmc_etr_set_perf_buffer,
- .free_buffer = tmc_etr_free_perf_buffer,
}; const struct coresight_ops tmc_etr_cs_ops = { diff --git a/drivers/hwtracing/coresight/coresight-tmc.h b/drivers/hwtracing/coresight/coresight-tmc.h index 872f63e..487c537 100644 --- a/drivers/hwtracing/coresight/coresight-tmc.h +++ b/drivers/hwtracing/coresight/coresight-tmc.h @@ -170,6 +170,7 @@ struct etr_buf {
- @trigger_cntr: amount of words to store after a trigger.
- @etr_caps: Bitmask of capabilities of the TMC ETR, inferred from the
device configuration register (DEVID)
*/
- @perf_data: PERF buffer for ETR.
- @sysfs_data: SYSFS buffer for ETR.
struct tmc_drvdata { @@ -191,6 +192,7 @@ struct tmc_drvdata { u32 trigger_cntr; u32 etr_caps; struct etr_buf *sysfs_buf;
- void *perf_data;
}; struct etr_buf_operations { -- 2.7.4
On 12/07/18 21:57, Mathieu Poirier wrote:
Hi Suzuki,
On Wed, Jul 11, 2018 at 03:16:39PM +0100, Suzuki K Poulose wrote:
Add support for using TMC-ETR as backend for ETM perf tracing. We use software double buffering at the moment. i.e, the TMC-ETR uses a separate buffer than the perf ring buffer. The data is copied to the perf ring buffer once a session completes.
The TMC-ETR would try to match the larger of perf ring buffer or the ETR buffer size configured via sysfs, scaling down to a minimum limit of 1MB.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
drivers/hwtracing/coresight/coresight-tmc-etr.c | 265 +++++++++++++++++++++++- drivers/hwtracing/coresight/coresight-tmc.h | 2 + 2 files changed, 265 insertions(+), 2 deletions(-)
+/*
- tmc_etr_sync_perf_buffer: Copy the actual trace data from the hardware
- buffer to the perf ring buffer.
- */
+static void tmc_etr_sync_perf_buffer(struct etr_perf_buffer *etr_perf) +{
- long bytes, to_copy;
- long pg_idx, pg_offset, src_offset;
- unsigned long head = etr_perf->head;
- char **dst_pages, *src_buf;
- struct etr_buf *etr_buf = etr_perf->etr_buf;
- head = PERF_IDX2OFF(etr_perf->head, etr_perf);
I'm puzzled here. Since etr_perf->head is set in tmc_etr_set_perf_buffer() to the value of handle->head, its value is always within the range of the perf ring
IIUC, the handle->head is not necessarily guaranteed to be within the aux-ring buffer.
buffer. In tmc_etr_alloc_perf_buffer() etr_perf->nr_pages is set to the number of pages present in that ring buffer. As such I'm not sure as to why we need the PERF_IDX2OFF() macro.
It seems to me that "head = etr_perf->head;" above is sufficient.
We need to do the step at some point, which could be moved to set_perf_buffer.
- pg_idx = head >> PAGE_SHIFT;
- pg_offset = head & (PAGE_SIZE - 1);
- dst_pages = (char **)etr_perf->pages;
- src_offset = etr_buf->offset;
- to_copy = etr_buf->len;
- while (to_copy > 0) {
/*
* We can copy minimum of :
s/We can copy minimum of :/In one iteration we can copy a minimum of:/
If I'm wrong about the PERF_IDX2OFF(), don't bother respinning just for that.
I have some more fixes for handling the different modes (sysfs vs perf), which I can include in the v2.
Suzuki
On Fri, 13 Jul 2018 at 05:23, Suzuki K Poulose Suzuki.Poulose@arm.com wrote:
On 12/07/18 21:57, Mathieu Poirier wrote:
Hi Suzuki,
On Wed, Jul 11, 2018 at 03:16:39PM +0100, Suzuki K Poulose wrote:
Add support for using TMC-ETR as backend for ETM perf tracing. We use software double buffering at the moment. i.e, the TMC-ETR uses a separate buffer than the perf ring buffer. The data is copied to the perf ring buffer once a session completes.
The TMC-ETR would try to match the larger of perf ring buffer or the ETR buffer size configured via sysfs, scaling down to a minimum limit of 1MB.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
drivers/hwtracing/coresight/coresight-tmc-etr.c | 265 +++++++++++++++++++++++- drivers/hwtracing/coresight/coresight-tmc.h | 2 + 2 files changed, 265 insertions(+), 2 deletions(-)
+/*
- tmc_etr_sync_perf_buffer: Copy the actual trace data from the hardware
- buffer to the perf ring buffer.
- */
+static void tmc_etr_sync_perf_buffer(struct etr_perf_buffer *etr_perf) +{
- long bytes, to_copy;
- long pg_idx, pg_offset, src_offset;
- unsigned long head = etr_perf->head;
- char **dst_pages, *src_buf;
- struct etr_buf *etr_buf = etr_perf->etr_buf;
- head = PERF_IDX2OFF(etr_perf->head, etr_perf);
I'm puzzled here. Since etr_perf->head is set in tmc_etr_set_perf_buffer() to the value of handle->head, its value is always within the range of the perf ring
IIUC, the handle->head is not necessarily guaranteed to be within the aux-ring buffer.
I just looked at the ETB code and you are correct.
buffer. In tmc_etr_alloc_perf_buffer() etr_perf->nr_pages is set to the number of pages present in that ring buffer. As such I'm not sure as to why we need the PERF_IDX2OFF() macro.
It seems to me that "head = etr_perf->head;" above is sufficient.
We need to do the step at some point, which could be moved to set_perf_buffer.
That's a good idea.
- pg_idx = head >> PAGE_SHIFT;
- pg_offset = head & (PAGE_SIZE - 1);
- dst_pages = (char **)etr_perf->pages;
- src_offset = etr_buf->offset;
- to_copy = etr_buf->len;
- while (to_copy > 0) {
/*
* We can copy minimum of :
s/We can copy minimum of :/In one iteration we can copy a minimum of:/
If I'm wrong about the PERF_IDX2OFF(), don't bother respinning just for that.
I have some more fixes for handling the different modes (sysfs vs perf), which I can include in the v2.
Very well.
Suzuki