On Wed, Feb 10, 2021 at 09:42:29AM +0530, Anshuman Khandual wrote:
On 2/9/21 11:09 PM, Mathieu Poirier wrote:
On Fri, Feb 05, 2021 at 10:53:30AM -0700, Mathieu Poirier wrote:
On Wed, Jan 27, 2021 at 02:25:35PM +0530, Anshuman Khandual wrote:
Trace Buffer Extension (TRBE) implements a trace buffer per CPU which is accessible via the system registers. The TRBE supports different addressing modes including CPU virtual address and buffer modes including the circular buffer mode. The TRBE buffer is addressed by a base pointer (TRBBASER_EL1), an write pointer (TRBPTR_EL1) and a limit pointer (TRBLIMITR_EL1). But the access to the trace buffer could be prohibited by a higher exception level (EL3 or EL2), indicated by TRBIDR_EL1.P. The TRBE can also generate a CPU private interrupt (PPI) on address translation errors and when the buffer is full. Overall implementation here is inspired from the Arm SPE driver.
I got this message when applying the patch:
Applying: coresight: sink: Add TRBE driver .git/rebase-apply/patch:76: new blank line at EOF.
warning: 1 line adds whitespace errors.
Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Mike Leach mike.leach@linaro.org Cc: Suzuki K Poulose suzuki.poulose@arm.com Signed-off-by: Anshuman Khandual anshuman.khandual@arm.com
Changes in V3:
- Added new DT bindings document TRBE.yaml
- Changed TRBLIMITR_TRIG_MODE_SHIFT from 2 to 3
- Dropped isb() from trbe_reset_local()
- Dropped gap between (void *) and buf->trbe_base
- Changed 'int' to 'unsigned int' in is_trbe_available()
- Dropped unused function set_trbe_running(), set_trbe_virtual_mode(), set_trbe_enabled() and set_trbe_limit_pointer()
- Changed get_trbe_flag_update(), is_trbe_programmable() and get_trbe_address_align() to accept TRBIDR value
- Changed is_trbe_running(), is_trbe_abort(), is_trbe_wrap(), is_trbe_trg(), is_trbe_irq(), get_trbe_bsc() and get_trbe_ec() to accept TRBSR value
- Dropped snapshot mode condition in arm_trbe_alloc_buffer()
- Exit arm_trbe_init() when arm64_kernel_unmapped_at_el0() is enabled
- Compute trbe_limit before trbe_write to get the updated handle
- Added trbe_stop_and_truncate_event()
- Dropped trbe_handle_fatal()
Documentation/trace/coresight/coresight-trbe.rst | 39 + arch/arm64/include/asm/sysreg.h | 1 + drivers/hwtracing/coresight/Kconfig | 11 + drivers/hwtracing/coresight/Makefile | 1 + drivers/hwtracing/coresight/coresight-trbe.c | 1023 ++++++++++++++++++++++ drivers/hwtracing/coresight/coresight-trbe.h | 160 ++++ 6 files changed, 1235 insertions(+) create mode 100644 Documentation/trace/coresight/coresight-trbe.rst create mode 100644 drivers/hwtracing/coresight/coresight-trbe.c create mode 100644 drivers/hwtracing/coresight/coresight-trbe.h
[...]
+static irqreturn_t arm_trbe_irq_handler(int irq, void *dev) +{
- struct perf_output_handle **handle_ptr = dev;
- struct perf_output_handle *handle = *handle_ptr;
- enum trbe_fault_action act;
- WARN_ON(!is_trbe_irq(read_sysreg_s(SYS_TRBSR_EL1)));
- clr_trbe_irq();
- /*
* Ensure the trace is visible to the CPUs and
* any external aborts have been resolved.
*/
- trbe_drain_buffer();
- isb();
- if (!perf_get_aux(handle))
return IRQ_NONE;
- if (!is_perf_trbe(handle))
return IRQ_NONE;
- irq_work_run();
There is a comment in the SPE driver about this. Since this driver closely follows that implementation it would be nice to have the comments as well. Otherwise the reader has to constantly go back to the original driver.
Sure, will add the following comment before irq_work_run().
/* * Ensure perf callbacks have completed, which may disable the * profiling buffer in response to a TRUNCATION flag. */
I will come back to this function later.
Okay.
- act = trbe_get_fault_act(handle);
- switch (act) {
- case TRBE_FAULT_ACT_WRAP:
trbe_handle_overflow(handle);
break;
- case TRBE_FAULT_ACT_SPURIOUS:
trbe_handle_spurious(handle);
break;
- case TRBE_FAULT_ACT_FATAL:
trbe_stop_and_truncate_event(handle);
break;
- }
- return IRQ_HANDLED;
+}
+static const struct coresight_ops_sink arm_trbe_sink_ops = {
- .enable = arm_trbe_enable,
- .disable = arm_trbe_disable,
- .alloc_buffer = arm_trbe_alloc_buffer,
- .free_buffer = arm_trbe_free_buffer,
- .update_buffer = arm_trbe_update_buffer,
+};
+static const struct coresight_ops arm_trbe_cs_ops = {
- .sink_ops = &arm_trbe_sink_ops,
+};
+static ssize_t align_show(struct device *dev, struct device_attribute *attr, char *buf) +{
- struct trbe_cpudata *cpudata = dev_get_drvdata(dev);
- return sprintf(buf, "%llx\n", cpudata->trbe_align);
+} +static DEVICE_ATTR_RO(align);
+static ssize_t dbm_show(struct device *dev, struct device_attribute *attr, char *buf) +{
- struct trbe_cpudata *cpudata = dev_get_drvdata(dev);
- return sprintf(buf, "%d\n", cpudata->trbe_dbm);
+} +static DEVICE_ATTR_RO(dbm);
+static struct attribute *arm_trbe_attrs[] = {
- &dev_attr_align.attr,
- &dev_attr_dbm.attr,
- NULL,
+};
+static const struct attribute_group arm_trbe_group = {
- .attrs = arm_trbe_attrs,
+};
+static const struct attribute_group *arm_trbe_groups[] = {
- &arm_trbe_group,
- NULL,
+};
+static void arm_trbe_probe_coresight_cpu(void *info) +{
- struct trbe_drvdata *drvdata = info;
- struct coresight_desc desc = { 0 };
- int cpu = smp_processor_id();
- struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
- struct coresight_device *trbe_csdev = per_cpu(csdev_sink, cpu);
- u64 trbidr = read_sysreg_s(SYS_TRBIDR_EL1);
- struct device *dev;
- if (WARN_ON(!cpudata))
goto cpu_clear;
Where was the memory for cpudata allocated? As far as I can tell, at this time it is just a pointer that was not allocated and as such it should be NULL.
cpudata gets allocated in arm_trbe_probe_coresight() just before calling individual CPU based probes i.e arm_trbe_probe_coresight_cpu() directly and via smp_call_function_many().
arm_trbe_device_probe() arm_trbe_probe_coresight() arm_trbe_probe_coresight_cpu()
Ah yes, my apologies here. Looking at the code I realised I skipped arm_trbe_probe_coresight() and went straight to arm_trbe_probe_coresight_cpu(). No wonder things didn't make sense. I will take another look at this function.
- if (trbe_csdev)
return;
- cpudata->cpu = smp_processor_id();
Why call this again when you already did above? And how is
Right, this is redundant. Will just assign it as cpu which has already been computed.
arm_trbe_probe_coresight_cpu() is called for every CPU in the system?
During boot in arm_trbe_probe_coresight(), it is called once directly on the executing cpu and on all other via smp_call_function_many().
- cpudata->drvdata = drvdata;
- dev = &cpudata->drvdata->pdev->dev;
- if (!is_trbe_available()) {
pr_err("TRBE is not implemented on cpu %d\n", cpudata->cpu);
goto cpu_clear;
- }
- if (!is_trbe_programmable(trbidr)) {
pr_err("TRBE is owned in higher exception level on cpu %d\n", cpudata->cpu);
goto cpu_clear;
- }
- desc.name = devm_kasprintf(dev, GFP_KERNEL, "%s%d", DRVNAME, smp_processor_id());
- if (IS_ERR(desc.name))
goto cpu_clear;
- desc.type = CORESIGHT_DEV_TYPE_SINK;
- desc.subtype.sink_subtype = CORESIGHT_DEV_SUBTYPE_SINK_PERCPU_SYSMEM;
- desc.ops = &arm_trbe_cs_ops;
- desc.pdata = dev_get_platdata(dev);
- desc.groups = arm_trbe_groups;
- desc.dev = dev;
- trbe_csdev = coresight_register(&desc);
- if (IS_ERR(trbe_csdev))
goto cpu_clear;
- dev_set_drvdata(&trbe_csdev->dev, cpudata);
- cpudata->trbe_dbm = get_trbe_flag_update(trbidr);
- cpudata->trbe_align = 1ULL << get_trbe_address_align(trbidr);
- if (cpudata->trbe_align > SZ_2K) {
pr_err("Unsupported alignment on cpu %d\n", cpudata->cpu);
goto cpu_clear;
- }
- per_cpu(csdev_sink, cpu) = trbe_csdev;
- trbe_reset_local();
- enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
- return;
+cpu_clear:
- cpumask_clear_cpu(cpudata->cpu, &cpudata->drvdata->supported_cpus);
+}
+static void arm_trbe_remove_coresight_cpu(void *info) +{
- int cpu = smp_processor_id();
- struct trbe_drvdata *drvdata = info;
- struct trbe_cpudata *cpudata = per_cpu_ptr(drvdata->cpudata, cpu);
- struct coresight_device *trbe_csdev = per_cpu(csdev_sink, cpu);
- if (trbe_csdev) {
coresight_unregister(trbe_csdev);
cpudata->drvdata = NULL;
per_cpu(csdev_sink, cpu) = NULL;
- }
- disable_percpu_irq(drvdata->irq);
- trbe_reset_local();
+}
+static int arm_trbe_probe_coresight(struct trbe_drvdata *drvdata) +{
- drvdata->cpudata = alloc_percpu(typeof(*drvdata->cpudata));
- if (IS_ERR(drvdata->cpudata))
return PTR_ERR(drvdata->cpudata);
- arm_trbe_probe_coresight_cpu(drvdata);
- smp_call_function_many(&drvdata->supported_cpus, arm_trbe_probe_coresight_cpu, drvdata, 1);
- return 0;
+}
+static int arm_trbe_remove_coresight(struct trbe_drvdata *drvdata) +{
- arm_trbe_remove_coresight_cpu(drvdata);
- smp_call_function_many(&drvdata->supported_cpus, arm_trbe_remove_coresight_cpu, drvdata, 1);
- free_percpu(drvdata->cpudata);
- return 0;
+}
+static int arm_trbe_cpu_startup(unsigned int cpu, struct hlist_node *node) +{
- struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
- if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
if (!per_cpu(csdev_sink, cpu)) {
arm_trbe_probe_coresight_cpu(drvdata);
} else {
trbe_reset_local();
enable_percpu_irq(drvdata->irq, IRQ_TYPE_NONE);
}
- }
- return 0;
+}
+static int arm_trbe_cpu_teardown(unsigned int cpu, struct hlist_node *node) +{
- struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
- if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
disable_percpu_irq(drvdata->irq);
trbe_reset_local();
- }
- return 0;
+}
+static int arm_trbe_probe_cpuhp(struct trbe_drvdata *drvdata) +{
- enum cpuhp_state trbe_online;
- trbe_online = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, DRVNAME,
arm_trbe_cpu_startup, arm_trbe_cpu_teardown);
- if (trbe_online < 0)
return -EINVAL;
- if (cpuhp_state_add_instance(trbe_online, &drvdata->hotplug_node))
return -EINVAL;
- drvdata->trbe_online = trbe_online;
- return 0;
+}
+static void arm_trbe_remove_cpuhp(struct trbe_drvdata *drvdata) +{
- cpuhp_remove_multi_state(drvdata->trbe_online);
+}
+static int arm_trbe_probe_irq(struct platform_device *pdev,
struct trbe_drvdata *drvdata)
+{
- drvdata->irq = platform_get_irq(pdev, 0);
- if (!drvdata->irq) {
Please use function platform_get_irq() properly - there is even an example on how to do so in the documentation section of the function.
The documentation says, the format should be.
int irq = platform_get_irq(pdev, 0); if (irq < 0) return irq;
Will change the conditional check above.
pr_err("IRQ not found for the platform device\n");
return -ENXIO;
Why use a different error code?
We could return the irq (which is < 0) but followed the SPE driver which returns ENXIO here. Happy to change either way.
Please use the right error code.
- }
- if (!irq_is_percpu(drvdata->irq)) {
pr_err("IRQ is not a PPI\n");
return -EINVAL;
- }
- if (irq_get_percpu_devid_partition(drvdata->irq, &drvdata->supported_cpus))
return -EINVAL;
- drvdata->handle = alloc_percpu(typeof(*drvdata->handle));
- if (!drvdata->handle)
return -ENOMEM;
- if (request_percpu_irq(drvdata->irq, arm_trbe_irq_handler, DRVNAME, drvdata->handle)) {
free_percpu(drvdata->handle);
return -EINVAL;
Here too you need to use the error code from the calling function rather than making your own. Please revise for the entire patch.
Okay, will capture the return value from request_percpu_irq() and return the same when it is an error case i.e being positive.
- }
- return 0;
+}
+static void arm_trbe_remove_irq(struct trbe_drvdata *drvdata) +{
- free_percpu_irq(drvdata->irq, drvdata->handle);
- free_percpu(drvdata->handle);
+}
+static int arm_trbe_device_probe(struct platform_device *pdev) +{
- struct coresight_platform_data *pdata;
- struct trbe_drvdata *drvdata;
- struct device *dev = &pdev->dev;
- int ret;
- drvdata = devm_kzalloc(dev, sizeof(*drvdata), GFP_KERNEL);
- if (IS_ERR(drvdata))
return -ENOMEM;
if (!drvdata)
Changed.
- pdata = coresight_get_platform_data(dev);
- if (IS_ERR(pdata)) {
kfree(drvdata);
No need to do this since devm_kzalloc() was used above.
Suzuki had pointed out these issues, have already incorporated them i.e dropped kfree() here.
To avoid getting tunel vision I don't look at other comments before reviewing a patchset. As such it is possible to get redundant comments.
More to come shortly.
return -ENOMEM;
Why not using the error from coresight_get_platform_data() instead of masking it?
Okay, will return PTR_ERR(pdata) instead.
- }
- dev_set_drvdata(dev, drvdata);
- dev->platform_data = pdata;
- drvdata->pdev = pdev;
- ret = arm_trbe_probe_irq(pdev, drvdata);
- if (ret)
goto irq_failed;
- ret = arm_trbe_probe_coresight(drvdata);
- if (ret)
goto probe_failed;
- ret = arm_trbe_probe_cpuhp(drvdata);
- if (ret)
goto cpuhp_failed;
- return 0;
+cpuhp_failed:
- arm_trbe_remove_coresight(drvdata);
+probe_failed:
- arm_trbe_remove_irq(drvdata);
+irq_failed:
- kfree(pdata);
- kfree(drvdata);
Same here - both @pdata and @drvdata have been allocated by devm_kzalloc(). devm_kzalloc().
Dropped these kfree() statements.
- Anshuman