This patchset add support for SMB(System Memory Buffer) device, SMB
obtains CPU instructions from Coresight ETM device and stores these
messages in system memory.
SMB is developed by Ultrasoc technology, which is acquired by Siemens,
and we still use "Ultrasoc" to name document and driver.
Change since RFC:
- Move ultrasoc driver to drivers/hwtracing/coresight.
- Remove ultrasoc-axi-com.c, as AXI-COM doesn't need to be configured in
basic tracing function.
- Remove ultrasoc.c as SMB does not need to register with the ultrasoc core.
- Address the comments from Mathieu and Suzuki.
- Link: https://lists.linaro.org/pipermail/coresight/2021-June/006535.html
Qi Liu (2):
Documentation: tracing: Documentation for ultrasoc SMB drivers
coresight: ultrasoc: Add System Memory Buffer driver
.../trace/coresight/ultrasoc-trace.rst | 193 +++++
MAINTAINERS | 7 +
drivers/hwtracing/coresight/Kconfig | 3 +
drivers/hwtracing/coresight/Makefile | 2 +
drivers/hwtracing/coresight/ultrasoc/Kconfig | 12 +
drivers/hwtracing/coresight/ultrasoc/Makefile | 6 +
.../coresight/ultrasoc/ultrasoc-smb.c | 722 ++++++++++++++++++
.../coresight/ultrasoc/ultrasoc-smb.h | 142 ++++
8 files changed, 1087 insertions(+)
create mode 100644 Documentation/trace/coresight/ultrasoc-trace.rst
create mode 100644 drivers/hwtracing/coresight/ultrasoc/Kconfig
create mode 100644 drivers/hwtracing/coresight/ultrasoc/Makefile
create mode 100644 drivers/hwtracing/coresight/ultrasoc/ultrasoc-smb.c
create mode 100644 drivers/hwtracing/coresight/ultrasoc/ultrasoc-smb.h
--
2.17.1
This series adds CPU erratum work arounds related to the self-hosted
tracing. The list of affected errata handled in this series are :
* TRBE may overwrite trace in FILL mode
- Arm Neoverse-N2 #2139208
- Cortex-A710 #2119858
* A TSB instruction may not flush the trace completely when executed
in trace prohibited region.
- Arm Neoverse-N2 #2067961
- Cortex-A710 #2054223
The series applies on the self-hosted/trbe fixes posted here [0].
A tree containing both the series is available here [1].
[0] https://lkml.kernel.org/r/20210723124456.3828769-1-suzuki.poulose@arm.com
[1] git@git.gitlab.arm.com:linux-arm/linux-skp.git coresight/errata/trbe-tsb-n2-a710/v1
Suzuki K Poulose (10):
coresight: trbe: Add infrastructure for Errata handling
coresight: trbe: Add a helper to calculate the trace generated
coresight: trbe: Add a helper to pad a given buffer area
coresight: trbe: Decouple buffer base from the hardware base
coresight: trbe: Allow driver to choose a different alignment
arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
arm64: Add erratum detection for TRBE overwrite in FILL mode
coresight: trbe: Workaround TRBE errat overwrite in FILL mode
arm64: Enable workaround for TRBE overwrite in FILL mode
arm64: errata: Add workaround for TSB flush failures
Documentation/arm64/silicon-errata.rst | 8 +
arch/arm64/Kconfig | 70 ++++++
arch/arm64/include/asm/barrier.h | 17 +-
arch/arm64/include/asm/cputype.h | 4 +
arch/arm64/kernel/cpu_errata.c | 44 ++++
arch/arm64/tools/cpucaps | 2 +
drivers/hwtracing/coresight/coresight-trbe.c | 227 ++++++++++++++++---
7 files changed, 341 insertions(+), 31 deletions(-)
--
2.24.1
I'm submitting this as an RFC because there are a few changes I'd
like to get feedback on. The two changes I wanted to make were the
last two for Coresight warnings, but I ended up making some perf-wide
changes along the way.
For #1 (perf tools: Add WARN_ONCE equivalent for UI warnings)
* Does it make sense to add warn once equivalents at all, or
should the once be re-done for each usage?
* Or should there be some kind of generic 'once' wrapper?
For #3 (perf tools: Add disassembly warnings for annotate --stdio)
* If the output is interpreted by any other tools, then adding
these warnings could be an issue, so maybe this change could
be dropped, but no error output at all isn't ideal.
For #4 (perf tools: Add flag for tracking warnings of missing DSOs)
* In theory I could re-use 'annotate_warned', but it might make sense
to rename it in that case, or just leave the new auxtrace_warned and
remove any confusion.
This set applies to perf/core e73f0f0ee7541
Thanks
James
James Clark (6):
perf tools: Add WARN_ONCE equivalent for UI warnings
perf tools: Re-add annotate_warned functionality
perf tools: Add disassembly warnings for annotate --stdio
perf tools: Add flag for tracking warnings of missing DSOs
perf cs-etm: Improve Coresight zero timestamp warning
perf cs-etm: Add warnings for missing DSOs
tools/perf/ui/browsers/annotate.c | 1 +
tools/perf/ui/gtk/annotate.c | 1 +
tools/perf/util/annotate.c | 20 +++++++++++++++++--
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 7 +++++--
tools/perf/util/cs-etm.c | 10 +++++++++-
tools/perf/util/debug.h | 14 +++++++++++++
tools/perf/util/dso.h | 1 +
7 files changed, 49 insertions(+), 5 deletions(-)
--
2.28.0
Decoding ETE already works because it is a superset of
ETMv4, but if any new packet types are found then they will be
ignored by the decoder. This patchset creates an ETE decoder
which can output the new packets and saves a new register that
is required. No new packet types are handled by perf yet, as this
can be added in the future.
This set applies on top of "perf cs-etm: Support TRBE
(unformatted decoding)" on perf/core.
James Clark (6):
perf cs-etm: Refactor initialisation of decoder params.
perf cs-etm: Initialise architecture based on TRCIDR1
perf cs-etm: Save TRCDEVARCH register
perf cs-etm: Update OpenCSD decoder for ETE
perf cs-etm: Create ETE decoder
perf cs-etm: Print the decoder name
tools/build/feature/test-libopencsd.c | 4 +-
tools/perf/arch/arm/util/cs-etm.c | 13 +-
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 151 ++++++++----------
.../perf/util/cs-etm-decoder/cs-etm-decoder.h | 8 +
tools/perf/util/cs-etm.c | 54 ++++++-
tools/perf/util/cs-etm.h | 6 +-
6 files changed, 147 insertions(+), 89 deletions(-)
--
2.28.0
On 30/07/2021 12:26, Anshuman Khandual wrote:
>
>
> On 7/28/21 7:22 PM, Suzuki K Poulose wrote:
>> Add the CPU Partnumbers for the new Arm designs.
>>
>> Cc: Catalin Marinas <catalin.marinas(a)arm.com>
>> Cc: Mark Rutland <mark.rutland(a)arm.com>
>> Cc: Will Deacon <will(a)kernel.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
>> ---
>> arch/arm64/include/asm/cputype.h | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
>> index 6231e1f0abe7..b71bd6c176c2 100644
>> --- a/arch/arm64/include/asm/cputype.h
>> +++ b/arch/arm64/include/asm/cputype.h
>> @@ -73,6 +73,8 @@
>> #define ARM_CPU_PART_CORTEX_A76 0xD0B
>> #define ARM_CPU_PART_NEOVERSE_N1 0xD0C
>> #define ARM_CPU_PART_CORTEX_A77 0xD0D
>> +#define ARM_CPU_PART_CORTEX_A710 0xD47
>> +#define ARM_CPU_PART_NEOVERSE_N2 0xD49
>>
>> #define APM_CPU_PART_POTENZA 0x000
>>
>> @@ -112,6 +114,8 @@
>> #define MIDR_CORTEX_A55 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A55)
>> #define MIDR_CORTEX_A76 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76)
>> #define MIDR_NEOVERSE_N1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N1)
>> +#define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710)
>> +#define MIDR_NEOVERSE_N2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N2)
>
> Should not the new ones be added after MIDR_CORTEX_A77 below to preserve the order ?
>
TBH. The order doesn't matter much as they are part numbers for
different CPUs, without any dependencies between them.
But for the sake of keeping the order, I could fix this.
Thanks
Suzuki
On 30/07/2021 12:02, Anshuman Khandual wrote:
>
>
> On 7/28/21 7:22 PM, Suzuki K Poulose wrote:
>> The TRBE hardware mandates a minimum alignment for the TRBPTR_EL1,
>> advertised via the TRBIDR_EL1. This is used by the driver to
>> align the buffer write head. This patch allows the driver to
>> choose a different alignment from that of the hardware, by
>> decoupling the alignment tracking. This will be useful for
>> working around errata.
>>
>> Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
>> Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
>> Cc: Mike Leach <mike.leach(a)linaro.org>
>> Cc: Leo Yan <leo.yan(a)linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
>> ---
>> drivers/hwtracing/coresight/coresight-trbe.c | 12 ++++++++----
>> 1 file changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
>> index 9735d514c5e1..9ea28813182b 100644
>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>> @@ -92,7 +92,8 @@ static unsigned long trbe_errata_cpucaps[TRBE_ERRATA_MAX] = {
>> /*
>> * struct trbe_cpudata: TRBE instance specific data
>> * @trbe_flag - TRBE dirty/access flag support
>> - * @tbre_align - Actual TRBE alignment required for TRBPTR_EL1.
>> + * @trbe_hw_align - Actual TRBE alignment required for TRBPTR_EL1.
>> + * @trbe_align - Software alignment used for the TRBPTR_EL1,
>> * @cpu - CPU this TRBE belongs to.
>> * @mode - Mode of current operation. (perf/disabled)
>> * @drvdata - TRBE specific drvdata
>> @@ -100,6 +101,7 @@ static unsigned long trbe_errata_cpucaps[TRBE_ERRATA_MAX] = {
>> */
>> struct trbe_cpudata {
>> bool trbe_flag;
>> + u64 trbe_hw_align;
>> u64 trbe_align;
>> int cpu;
>> enum cs_mode mode;
>> @@ -906,7 +908,7 @@ static ssize_t align_show(struct device *dev, struct device_attribute *attr, cha
>> {
>> struct trbe_cpudata *cpudata = dev_get_drvdata(dev);
>>
>> - return sprintf(buf, "%llx\n", cpudata->trbe_align);
>> + return sprintf(buf, "%llx\n", cpudata->trbe_hw_align);
>> }
>> static DEVICE_ATTR_RO(align);
>>
>> @@ -995,11 +997,13 @@ static void arm_trbe_probe_cpu(void *info)
>> }
>>
>> trbe_check_errata(cpudata);
>> - cpudata->trbe_align = 1ULL << get_trbe_address_align(trbidr);
>> - if (cpudata->trbe_align > SZ_2K) {
>> +
>> + cpudata->trbe_hw_align = 1ULL << get_trbe_address_align(trbidr);
>> + if (cpudata->trbe_hw_align > SZ_2K) {
>> pr_err("Unsupported alignment on cpu %d\n", cpu);
>> goto cpu_clear;
>> }
>> + cpudata->trbe_align = cpudata->trbe_hw_align;
>
> When it changes, it must be asserted that trbe_align would be a multiple
> of trbe_hw_align before existing from arm_trbe_probe_cpu().
We only set it to PAGE_SIZE, which is one of 4K, 16K, 64K all of which
are aligned to 2K or any of the smaller alignment supported by TRBE.
>
>> cpudata->trbe_flag = get_trbe_flag_update(trbidr);
>> cpudata->cpu = cpu;
>> cpudata->drvdata = drvdata;
>>
>
> Reviewed-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
>
Thanks
Suzuki
On 30/07/2021 05:47, Anshuman Khandual wrote:
>
>
> On 7/23/21 6:16 PM, Suzuki K Poulose wrote:
>> We mark the buffer as TRUNCATED when there is no space left
>> in the buffer. But we do it at different points.
>> __trbe_normal_offset()
>> and also, at all the callers of the above function via
>> compute_trbe_buffer_limit(), when the limit == base (i.e
>> offset = 0 as returned by the __trbe_normal_offset()).
>>
>> So, given that the callers already mark the buffer as TRUNCATED
>> drop the caller inside the __trbe_normal_offset().
>>
>> This is in preparation to moving the handling of TRUNCATED
>> into a central place.
>>
>> Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
>> Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
>> Cc: Mike Leach <mike.leach(a)linaro.org>
>> Cc: Leo Yan <leo.yan(a)linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
>> ---
>> drivers/hwtracing/coresight/coresight-trbe.c | 7 +------
>> 1 file changed, 1 insertion(+), 6 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
>> index 446f080f8320..62e1a08f73ff 100644
>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>> @@ -253,13 +253,9 @@ static unsigned long __trbe_normal_offset(struct perf_output_handle *handle)
>> * trbe_base trbe_base + nr_pages
>> *
>> * Perf aux buffer does not have any space for the driver to write into.
>> - * Just communicate trace truncation event to the user space by marking
>> - * it with PERF_AUX_FLAG_TRUNCATED.
>> */
>> - if (!handle->size) {
>> - perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
>> + if (!handle->size)
>> return 0;
>> - }
>>
>> /* Compute the tail and wakeup indices now that we've aligned head */
>> tail = PERF_IDX2OFF(handle->head + handle->size, buf);
>> @@ -361,7 +357,6 @@ static unsigned long __trbe_normal_offset(struct perf_output_handle *handle)
>> return limit;
>>
>> trbe_pad_buf(handle, handle->size);
>> - perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
>> return 0;
>> }
>
> What about in trbe_handle_spurious() path which used to set the flag via
> compute_trbe_buffer_limit(), but would not any more after this change. I
> guess following additional change would be required to preserve the past
> behaviour.
>
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -685,6 +685,7 @@ static void trbe_handle_spurious(struct perf_output_handle *handle)
> buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf);
> if (buf->trbe_limit == buf->trbe_base) {
> trbe_drain_and_disable_local();
> + perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
> return;
> }
> trbe_enable_hw(buf);
>
Agreed, I will add this.
Thanks
Suzuki
On 30/07/2021 06:15, Anshuman Khandual wrote:
>
>
> On 7/23/21 6:16 PM, Suzuki K Poulose wrote:
>> On a spurious IRQ, right now we disable the TRBE and then re-enable
>> it back, resetting the "buffer" pointers(i.e BASE, LIMIT and more
>> importantly WRITE) to the original pointers from the AUX handle.
>> This implies that we overwrite any trace that was written so far,
>> (by overwriting TRBPTR) while we should have ignored the IRQ.
>
> The ideas was that a state (pointers) reset would improve the chances
> of not getting the spurious IRQ once again. This is assuming that some
> thing during this current state machine, had caused the spurious IRQ.
> Hence just restart it back from the beginning. Yes, it does lose some
> trace data but whats the real possibility of such spurious IRQs in the
> first place ?
>
>>
>> This patch cleans the behavior, by only stopping the TRBE if the
>> IRQ was indeed raised, as we can read the TRBSR without stopping
>> the TRBE (Only writes to the TRBSR requires the TRBE disabled).
>> And also, on detecting a spurious IRQ after examining the TRBSR,
>> we simply re-enable the TRBE without touching the other parameters.
>
> This makes sense. I was not sure if TRBSR could be safely read without
> actually stopping the TRBE.
>
>>
>> Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
>> Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
>> Cc: Mike Leach <mike.leach(a)linaro.org>
>> Cc: Leo Yan <leo.yan(a)linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
>> ---
>> drivers/hwtracing/coresight/coresight-trbe.c | 29 ++++++++++----------
>> 1 file changed, 15 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
>> index 62e1a08f73ff..503bea0137ae 100644
>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>> @@ -679,15 +679,16 @@ static int arm_trbe_disable(struct coresight_device *csdev)
>>
>> static void trbe_handle_spurious(struct perf_output_handle *handle)
>> {
>> - struct trbe_buf *buf = etm_perf_sink_config(handle);
>> + u64 limitr = read_sysreg_s(SYS_TRBLIMITR_EL1);
>>
>> - buf->trbe_limit = compute_trbe_buffer_limit(handle);
>> - buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf);
>> - if (buf->trbe_limit == buf->trbe_base) {
>> - trbe_drain_and_disable_local();
>> - return;
>> - }
>> - trbe_enable_hw(buf);
>> + /*
>> + * If the IRQ was spurious, simply re-enable the TRBE
>> + * back without modifiying the buffer parameters to
>
> Typo here ^^^^^^ s/modifiying/modifying
>
>> + * retain the trace collected so far.
>> + */
>> + limitr |= TRBLIMITR_ENABLE;
>> + write_sysreg_s(limitr, SYS_TRBLIMITR_EL1);
>> + isb();
>> }
>>
>> static void trbe_handle_overflow(struct perf_output_handle *handle)
>> @@ -760,12 +761,7 @@ static irqreturn_t arm_trbe_irq_handler(int irq, void *dev)
>> enum trbe_fault_action act;
>> u64 status;
>>
>> - /*
>> - * Ensure the trace is visible to the CPUs and
>> - * any external aborts have been resolved.
>> - */
>> - trbe_drain_and_disable_local();
>> -
>> + /* Reads to TRBSR_EL1 is fine when TRBE is active */
>> status = read_sysreg_s(SYS_TRBSR_EL1);
>> /*
>> * If the pending IRQ was handled by update_buffer callback
>> @@ -774,6 +770,11 @@ static irqreturn_t arm_trbe_irq_handler(int irq, void *dev)
>> if (!is_trbe_irq(status))
>
> Warn here that a non-related IRQ has been delivered to this handler ?
> But moving the trbe_drain_and_disable_local() later, enables it to
> return back immediately after detecting an unrelated IRQ.
Not really. There could be race with the update_buffer(), see the
comment right above that. When that happens, we have disabled the
TRBE in the update_buffer(). Either case, we have nothing to do.
>
>> return IRQ_NONE;
>>
>> + /*
>> + * Ensure the trace is visible to the CPUs and
>> + * any external aborts have been resolved.
>> + */
>> + trbe_drain_and_disable_local();
>> clr_trbe_irq();
>> isb();
>>
>>
>
> Actually there are two types of spurious interrupts here.
>
> 1. Non-TRBE spurious interrupt
>
> Fails is_trbe_irq() test and needs to be returned immediately from
> arm_trbe_irq_handler(), after an warning for the platform IRQ
> delivery wiring.
Not necessarily warrant a WARNING. See above.
>
> 2. TRBE spurious interrupt
>
> Clears is_trbe_irq() and get handled in trbe_handle_spurious(). I
> still think leaving this unchanged might be better as it reduces
> the chance of getting further spurious TRBE interrupts.
How does it reduce the chances of getting another spurious interrupt ?
If the TRBE gets a spurious IRQ, that we cannot decode, I would rather
leave it as NOP.
Suzuki
On 30/07/2021 05:26, Anshuman Khandual wrote:
>
>
> On 7/23/21 6:16 PM, Suzuki K Poulose wrote:
>> When the TRBE is stopped on truncating an event, we may not
>> set the FORMAT flag, even though the size of the record is 0.
>> Let us be consistent and not confuse the user. Always set the
>> format flag for TRBE generated records.
>>
>> Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
>> Cc: Leo Yan <leo.yan(a)linaro.org>
>> Cc: Mike Leach <mike.leach(a)linaro.org>
>> Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
>> ---
>> drivers/hwtracing/coresight/coresight-trbe.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
>> index 176868496879..446f080f8320 100644
>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>> @@ -132,7 +132,8 @@ static void trbe_stop_and_truncate_event(struct perf_output_handle *handle)
>> * the update_buffer() to return a 0 size.
>> */
>> trbe_drain_and_disable_local();
>> - perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED);
>> + perf_aux_output_flag(handle, PERF_AUX_FLAG_TRUNCATED |
>> + PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
>> *this_cpu_ptr(buf->cpudata->drvdata->handle) = NULL;
>> }
>
> But why should not PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW also be set on
> zero sized records as well ? Otherwise there are two instances during
> TRBE buffer management, where PERF_AUX_FLAG_TRUNCATED is marked alone
> without PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW. Those could be changed as
> well.
All records (irrespective of the size) generated by the TRBE must
contain the "RAW" flag. Did I miss another instance where we don't
do this ?
Suzuki
>