On 19/10/2021 12:04, Will Deacon wrote:
> On Thu, Oct 14, 2021 at 11:31:12PM +0100, Suzuki K Poulose wrote:
>> Arm Neoverse-N2 and the Cortex-A710 cores are affected
>> by a CPU erratum where the TRBE will overwrite the trace buffer
>> in FILL mode. The TRBE doesn't stop (as expected in FILL mode)
>> when it reaches the limit and wraps to the base to continue
>> writing upto 3 cache lines. This will overwrite any trace that
>> was written previously.
>>
>> Add the Neoverse-N2 erratumi(#2139208) and Cortex-A710 erratum
>> (#2119858) to the detection logic.
>
> Weird typo and double space in this sentence.
I have fixed this now.
Thanks
Suzuki
On 19/10/2021 05:36, Anshuman Khandual wrote:
>
>
> On 10/19/21 2:45 AM, Suzuki K Poulose wrote:
>> On 18/10/2021 16:51, Mathieu Poirier wrote:
>>> On Thu, Oct 14, 2021 at 11:31:20PM +0100, Suzuki K Poulose wrote:
>>>> ARM Neoverse-N2 (#2139208) and Cortex-A710(##2119858) suffers from
>>>> an erratum, which when triggered, might cause the TRBE to overwrite
>>>> the trace data already collected in FILL mode, in the event of a WRAP.
>>>> i.e, the TRBE doesn't stop writing the data, instead wraps to the base
>>>> and could write upto 3 cache line size worth trace. Thus, this could
>>>> corrupt the trace at the "BASE" pointer.
>>>>
>>>> The workaround is to program the write pointer 256bytes from the
>>>> base, such that if the erratum is triggered, it doesn't overwrite
>>>> the trace data that was captured. This skipped region could be
>>>> padded with ignore packets at the end of the session, so that
>>>> the decoder sees a continuous buffer with some padding at the
>>>> beginning. The trace data written at the base is considered
>>>> lost as the limit could have been in the middle of the perf
>>>> ring buffer, and jumping to the "base" is not acceptable.
>>>> We set the flags already to indicate that some amount of trace
>>>> was lost during the FILL event IRQ. So this is fine.
>>>>
>>>> One important change with the work around is, we program the
>>>> TRBBASER_EL1 to current page where we are allowed to write.
>>>> Otherwise, it could overwrite a region that may be consumed
>>>> by the perf. Towards this, we always make sure that the
>>>> "handle->head" and thus the trbe_write is PAGE_SIZE aligned,
>>>> so that we can set the BASE to the PAGE base and move the
>>>> TRBPTR to the 256bytes offset.
>>>>
>>>> Cc: Mike Leach <mike.leach(a)linaro.org>
>>>> Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
>>>> Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
>>>> Cc: Leo Yan <leo.yan(a)linaro.org>
>>>> Reviewed-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
>>>> ---
>>>> Changes since v2:
>>>> - Updated the ASCII art to include better description of
>>>> all the steps in the work around
>>>> Change since v1:
>>>> - Updated comment with ASCII art
>>>> - Add _BYTES suffix for the space to skip for the work around.
>>>> ---
>>>> drivers/hwtracing/coresight/coresight-trbe.c | 169 +++++++++++++++++--
>>>> 1 file changed, 158 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
>>>> index 314e5e7374c7..b56b166b2dec 100644
>>>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>>>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>>>> @@ -16,6 +16,7 @@
>>>> #define pr_fmt(fmt) DRVNAME ": " fmt
>>>> #include <asm/barrier.h>
>>>> +#include <asm/cpufeature.h>
>>>
>>> Here too I get a checkpatch warning...
>>>
>>
>> That is a false alarm. I guess that warns for including
>> linux/cpufeature.h? It is a bit odd, we include this
>> for the arm64 cpucaps, not the generic linux feature
>
> It is a bit odd, I saw that too.
>
>> checks. (They are used for "loading modules" based
>> on "features" which are more like ELF HWCAPs).
>
> Should <asm/cpufeature.h> be renamed as <asm/arm64_cpufeature.h>
> or something similar instead to differentiate it from the generic
> <linux/cpufeature.h> as they are not related. Also, probably this
> warning could be avoided.
It is not that simple. asm/cpufeature.h on arm64 provides :
* arch backend for linux/cpufeature.h
* CPU feature sanity check infrastructure
* CPU capability infrastructure ( features & errata )
So ideally it should be split into 3 for better cleanup and
is something that could be pursued outside this series.
Suzuki
This patchset adds support for stopping trace on all enabled coresight
sources upon execution of a programmed instruction address.
ETM stop event is triggered by an instruction address match on a comparator.
This address match event is then used to stop the tracing across all ETM
source devices with the help of external input/output signals and ECT.
The instruction address for the stop event can be programmed using sysfs
interface.
The event flow diagram is like this,
Address comparator --> EXT OUT --> ECT---> EXT IN --> Counter-->ViewInst
The original intention of this patch is to stop all ETM sources at the
time of kernel panic without software intervention so that it can be
used as one of the building block while enabling panic/kdump support in
coresight drivers [1].
But there can be other use cases like stopping trace on assertions or
error functions etc. as well.
I am sending this patch as an RFC so as to get an early feedback
on the approach taken for implementation and other inputs if any.
Few caveats:
- Testing was done only with sysfs interface using a arbitrary kernel
symbol address.
Perf support will be added later based on initial feedback.
- CTI hook for enabling trace event connection need to be rewritten
to use existing support APIs.
[1] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1652258.html
Linu Cherian (2):
coresight: Add support to setup Trace event signals
coresight: etm4x: Add support to generate and synchronize stop event
drivers/hwtracing/coresight/coresight-core.c | 37 ++++
.../hwtracing/coresight/coresight-cti-core.c | 21 +++
drivers/hwtracing/coresight/coresight-cti.h | 3 +
.../coresight/coresight-etm4x-core.c | 167 ++++++++++++++++++
.../coresight/coresight-etm4x-sysfs.c | 64 +++++++
drivers/hwtracing/coresight/coresight-etm4x.h | 16 ++
include/linux/coresight.h | 6 +
7 files changed, 314 insertions(+)
--
2.31.1
Hi
Thanks for the report. I have fixed all of them.
Suzuki
On 12/10/2021 16:31, Randy Dunlap wrote:
> Hi,
>
> On 10/12/21 6:17 AM, Suzuki K Poulose wrote:
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 077f2ec4eeb2..404f56e87e93 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -666,6 +666,47 @@ config ARM64_ERRATUM_1508412
>> If unsure, say Y.
>> +config ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
>> + bool
>> +
>> +config ARM64_ERRATUM_2119858
>> + bool "Cortex-A710: 2119858: workaround TRBE overwriting trace
>> data in FILL mode"
>> + default y
>> + depends on COMPILE_TEST # Until the CoreSight TRBE driver changes
>> are in
>> + depends on CORESIGHT_TRBE
>> + select ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
>> + help
>> + This option adds the workaround for ARM Cortex-A710 erratum
>> 2119858.
>> +
>> + Affected Cortex-A710 cores could overwrite upto 3 cache lines
>> of trace
>
> up to
>
>> + data at the base of the buffer (ponited by TRBASER_EL1) in FILL
>> mode in
>
> pointed to by
>
>> + the event of a WRAP event.
>> +
>> + Work around the issue by always making sure we move the
>> TRBPTR_EL1 by
>> + 256bytes before enabling the buffer and filling the first
>> 256bytes of
>
> 256 bytes 256 bytes
>
>> + the buffer with ETM ignore packets upon disabling.
>> +
>> + If unsure, say Y.
>> +
>> +config ARM64_ERRATUM_2139208
>> + bool "Neoverse-N2: 2139208: workaround TRBE overwriting trace
>> data in FILL mode"
>> + default y
>> + depends on COMPILE_TEST # Until the CoreSight TRBE driver changes
>> are in
>> + depends on CORESIGHT_TRBE
>> + select ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
>> + help
>> + This option adds the workaround for ARM Neoverse-N2 erratum
>> 2139208.
>> +
>> + Affected Neoverse-N2 cores could overwrite upto 3 cache lines
>> of trace
>
> up to
>
>> + data at the base of the buffer (ponited by TRBASER_EL1) in FILL
>> mode in
>
> pointed to by
>
>> + the event of a WRAP event.
>> +
>> + Work around the issue by always making sure we move the
>> TRBPTR_EL1 by
>> + 256bytes before enabling the buffer and filling the first
>> 256bytes of
>
> 256 bytes 256 bytes
>
>> + the buffer with ETM ignore packets upon disabling.
>> +
>> + If unsure, say Y.
>
>
This series adds CPU erratum work arounds related to the self-hosted
tracing. The list of affected errata handled in this series are :
* TRBE may overwrite trace in FILL mode
- Arm Neoverse-N2 #2139208
- Cortex-A710 #211985
* A TSB instruction may not flush the trace completely when executed
in trace prohibited region.
- Arm Neoverse-N2 #2067961
- Cortex-A710 #2054223
* TRBE may write to out-of-range address
- Arm Neoverse-N2 #2253138
- Cortex-A710 #2224489
The series applies on coresight/next. The series has been reordered
to make it easier to merge the patches via arm64 tree and the coresight
tree.
Patches 1-4 are could be picked up via arm64 tree. The rest can go via
the coresight tree. The Kconfig items for the TRBE errata are initially
dropped in with dependency on COMPILE_TEST. These are dropped only after
the driver is equipped with the work around in later patches.
A tree is available here :
git@git.gitlab.arm.com:linux-arm/linux-skp.git coresight/errata/trbe-tsb-n2-a710/v4
Changes since v3:
- Fix missing Kconfig selection for TSB flush failure erratum (Will)
Merged the Kconfig changes to the core patch for TSB.
- Use COMPILE_TEST dependency for the TRBE work arounds instead of
delaying the Kconfig entries.
Changes since v2:
* https://lkml.kernel.org/r/20210921134121.2423546-1-suzuki.poulose@arm.com
- Dropped patch adding a helper to reach cpudata from perf handle
- Split the TSB erratum work around patch to split the Kconfig/erratum
list update changes(pushed to the end of the series).
- Added wrappers to check the erratum :
trbe_has_erratum(cpudata, TRBE_ERRATUM_<TITLE>) -> trbe_may_<title>
- More ASCII art explanation on workaround.
Changes since v1:
* https://lkml.kernel.org/r/20210728135217.591173-1-suzuki.poulose@arm.com
- Added a fix to the TRBE driver handling of sink_specific data
- Added more description and ASCII art for overwrite in FILL mode
work around
- Added another TRBE erratum to the list.
"TRBE may write to out-of-range address"
Patches from 12-17
- Added comment to list the expectations around TSB erratum workaround.
Suzuki K Poulose (15):
arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
arm64: errata: Add detection for TRBE overwrite in FILL mode
arm64: errata: Add workaround for TSB flush failures
arm64: errata: Add detection for TRBE write to out-of-range
coresight: trbe: Add a helper to calculate the trace generated
coresight: trbe: Add a helper to pad a given buffer area
coresight: trbe: Decouple buffer base from the hardware base
coresight: trbe: Allow driver to choose a different alignment
coresight: trbe: Add infrastructure for Errata handling
coresight: trbe: Workaround TRBE errata overwrite in FILL mode
coresight: trbe: Add a helper to determine the minimum buffer size
coresight: trbe: Make sure we have enough space
coresight: trbe: Work around write to out of range
arm64: errata: Enable workaround for TRBE overwrite in FILL mode
arm64: errata: Enable TRBE workaround for write to out-of-range
address
Documentation/arm64/silicon-errata.rst | 12 +
arch/arm64/Kconfig | 111 ++++++
arch/arm64/include/asm/barrier.h | 16 +-
arch/arm64/include/asm/cputype.h | 4 +
arch/arm64/kernel/cpu_errata.c | 64 ++++
arch/arm64/tools/cpucaps | 3 +
drivers/hwtracing/coresight/coresight-trbe.c | 382 +++++++++++++++++--
7 files changed, 556 insertions(+), 36 deletions(-)
--
2.25.4
On 12/10/2021 09:19, Will Deacon wrote:
> On Mon, Oct 11, 2021 at 04:55:37PM +0100, German Gomez wrote:
>> Hi Leo,
>>
>> On 06/10/2021 10:51, Leo Yan wrote:
>>> On Wed, Oct 06, 2021 at 10:35:20AM +0100, German Gomez wrote:
>>>
>>> [...]
>>>
>>>>> So simply say, I think the head pointer monotonically increasing is
>>>>> the right thing to do in Arm SPE driver.
>>>> I will talk to James about how we can proceed on this.
>>> Thanks!
>>
>> I took this offline with James and, though it looks possible to patch
>> the SPE driver to have a monotonically increasing head pointer in order
>> to simplify the handling in the perf tool, it could be a breaking change
>> for users of the perf_event_open syscall that currently rely on the way
>> it works now.
>>
>> An alternative way we considered to simplify the patch is to change the
>> logic inside the find_snapshot callback so that it records the entire
>> contents of the aux buffer every time.
>>
>> What do you think?
>
> What does intel-pt do?
Intel-pt has a wrapped head, which is why it has the intel_pt_find_snapshot()
function in perf to try to not save any zeros from the buffer that haven't
been written yet. (With a wrapped head pointer it's impossible to tell).
Coresight has a monotonically increasing head pointer so it is possible to
tell. Recently, Leo removed the Coresight version of find_snapshot() for this
reason.
It would be nice to do the same for SPE because that function has a heuristic
and is also slow, but I imagine that not returning wrapped head pointers could
break anything that expects them.
James
>
> Will
>