On 30/07/2021 04:48, Anshuman Khandual wrote:
>
> On 7/23/21 6:16 PM, Suzuki K Poulose wrote:
>> The Trace Filtering support (FEAT_TRF) ensures that the ETM
>> can be prohibited from generating any trace for a given EL.
>> This is much stricter knob, than the TRCVICTLR exception level
>
> Could you please explain 'stricter' ? Are you suggesting that TRCVICTLR
> based exception filtering some times might not implement the filtering
> even if configured ?
>
Sure. TRCVICTLR only ensures that the ETM doesn't generate
any "branch" trace packets. It doesn't prevent the ETM from
generating "Context" packets, which may contain kernel
addresses if they are generated while in the kernel.
FEAT_TRF, on the other hand, strictly prevents the trace unit from
generating any packets while it is "prohibited". Thus it is a much
better control for preventing kernel address leaks via the trace.
>> masks. At the moment, we do a onetime enable trace at user and
>> kernel and leave it untouched for the kernel life time.
>>
>> This patch makes the switch dynamic, by honoring the filters
>> set by the user and enforcing them in the TRFCR controls.
>
> TRFCR actually helps in making the exception level filtering dynamic
> which was not possible earlier with TRCVICTLR.
>
>> We also rename the cpu_enable_tracing() appropriately to
>> cpu_detect_trace_filtering() and the drvdata member
>> trfc => trfcr to indicate the "value" of the TRFCR_EL1.
>
> Makes sense.
>
>>
>> Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
>> Cc: Al Grant <al.grant(a)arm.com>
>> Cc: Mike Leach <mike.leach(a)linaro.org>
>> Cc: Leo Yan <leo.yan(a)linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
>> ---
>> .../coresight/coresight-etm4x-core.c | 61 ++++++++++++++-----
>> drivers/hwtracing/coresight/coresight-etm4x.h | 5 +-
>> .../coresight/coresight-self-hosted-trace.h | 7 +++
>> 3 files changed, 55 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> index 3e548dac9b05..adba84b29455 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>> @@ -237,6 +237,43 @@ struct etm4_enable_arg {
>> int rc;
>> };
>>
>> +/*
>> + * etm4x_prohibit_trace - Prohibit the CPU from tracing at all ELs.
>> + * When the CPU supports FEAT_TRF, we could move the ETM to a trace
>> + * prohibited state by filtering the Exception levels via TRFCR_EL1.
>> + */
>> +static void etm4x_prohibit_trace(struct etmv4_drvdata *drvdata)
>> +{
>> + if (drvdata->trfcr)
>> + cpu_prohibit_trace();
>
> Should it be as etm4x_allow_trace() instead, where drvdata->trfcr
> indicates the presence of FEAT_TRF - just to be clear ?
>
> /* If the CPU doesn't support FEAT_TRF, nothing to do */
> if (!drvdata->trfcr)
> return;
>
> cpu_prohibit_trace();
>
OK
>> +}
>> +
>> +/*
>> + * etm4x_allow_trace - Allow CPU tracing in the respective ELs,
>> + * as configured by the drvdata->config.mode for the current
>> + * session. Even though we have TRCVICTLR bits to filter the
>> + * trace in the ELs, it doesn't prevent the ETM from generating
>> + * a packet (e.g, TraceInfo) that might contain the addresses from
>> + * the excluded levels. Thus we use the additional controls provided
>> + * via the Trace Filtering controls (FEAT_TRF) to make sure no trace
>> + * is generated for the excluded ELs.
>> + */
>> +static void etm4x_allow_trace(struct etmv4_drvdata *drvdata)
>> +{
>> + u64 trfcr = drvdata->trfcr;
>> +
>> + /* If the CPU doesn't support FEAT_TRF, nothing to do */
>> + if (!trfcr)
>> + return;
>> +
>> + if (drvdata->config.mode & ETM_MODE_EXCL_KERN)
>> + trfcr &= ~TRFCR_ELx_ExTRE;
>> + if (drvdata->config.mode & ETM_MODE_EXCL_USER)
>> + trfcr &= ~TRFCR_ELx_E0TRE;
>> +
>> + write_trfcr(trfcr);
>> +}
>> +
>> #ifdef CONFIG_ETM4X_IMPDEF_FEATURE
>>
>> #define HISI_HIP08_AMBA_ID 0x000b6d01
>> @@ -441,6 +478,7 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
>> if (etm4x_is_ete(drvdata))
>> etm4x_relaxed_write32(csa, TRCRSR_TA, TRCRSR);
>>
>> + etm4x_allow_trace(drvdata);
>> /* Enable the trace unit */
>> etm4x_relaxed_write32(csa, 1, TRCPRGCTLR);
>>
>> @@ -719,7 +757,6 @@ static int etm4_enable(struct coresight_device *csdev,
>> static void etm4_disable_hw(void *info)
>> {
>> u32 control;
>> - u64 trfcr;
>> struct etmv4_drvdata *drvdata = info;
>> struct etmv4_config *config = &drvdata->config;
>> struct coresight_device *csdev = drvdata->csdev;
>> @@ -746,12 +783,7 @@ static void etm4_disable_hw(void *info)
>> * If the CPU supports v8.4 Trace filter Control,
>> * set the ETM to trace prohibited region.
>> */
>> - if (drvdata->trfc) {
>> - trfcr = read_sysreg_s(SYS_TRFCR_EL1);
>> - write_sysreg_s(trfcr & ~(TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE),
>> - SYS_TRFCR_EL1);
>> - isb();
>> - }
>> + etm4x_prohibit_trace(drvdata);
>> /*
>> * Make sure everything completes before disabling, as recommended
>> * by section 7.3.77 ("TRCVICTLR, ViewInst Main Control Register,
>> @@ -767,9 +799,6 @@ static void etm4_disable_hw(void *info)
>> if (coresight_timeout(csa, TRCSTATR, TRCSTATR_PMSTABLE_BIT, 1))
>> dev_err(etm_dev,
>> "timeout while waiting for PM stable Trace Status\n");
>> - if (drvdata->trfc)
>> - write_sysreg_s(trfcr, SYS_TRFCR_EL1);
>> -
>> /* read the status of the single shot comparators */
>> for (i = 0; i < drvdata->nr_ss_cmp; i++) {
>> config->ss_status[i] =
>> @@ -964,15 +993,15 @@ static bool etm4_init_csdev_access(struct etmv4_drvdata *drvdata,
>> return false;
>> }
>>
>> -static void cpu_enable_tracing(struct etmv4_drvdata *drvdata)
>> +static void cpu_detect_trace_filtering(struct etmv4_drvdata *drvdata)
>> {
>> u64 dfr0 = read_sysreg(id_aa64dfr0_el1);
>> u64 trfcr;
>>
>> + drvdata->trfcr = 0;
>> if (!cpuid_feature_extract_unsigned_field(dfr0, ID_AA64DFR0_TRACE_FILT_SHIFT))
>> return;
>>
>> - drvdata->trfc = true;
>> /*
>> * If the CPU supports v8.4 SelfHosted Tracing, enable
>> * tracing at the kernel EL and EL0, forcing to use the
>> @@ -986,7 +1015,7 @@ static void cpu_enable_tracing(struct etmv4_drvdata *drvdata)
>> if (is_kernel_in_hyp_mode())
>> trfcr |= TRFCR_EL2_CX;
>>
>> - write_trfcr(trfcr);
>> + drvdata->trfcr = trfcr;
>> }
>>
>> static void etm4_init_arch_data(void *info)
>> @@ -1177,7 +1206,7 @@ static void etm4_init_arch_data(void *info)
>> /* NUMCNTR, bits[30:28] number of counters available for tracing */
>> drvdata->nr_cntr = BMVAL(etmidr5, 28, 30);
>> etm4_cs_lock(drvdata, csa);
>> - cpu_enable_tracing(drvdata);
>> + cpu_detect_trace_filtering(drvdata);
>> }
>>
>> static inline u32 etm4_get_victlr_access_type(struct etmv4_config *config)
>> @@ -1673,7 +1702,7 @@ static int etm4_cpu_save(struct etmv4_drvdata *drvdata)
>> int ret = 0;
>>
>> /* Save the TRFCR irrespective of whether the ETM is ON */
>> - if (drvdata->trfc)
>> + if (drvdata->trfcr)
>> drvdata->save_trfcr = read_trfcr();
>> /*
>> * Save and restore the ETM Trace registers only if
>> @@ -1782,7 +1811,7 @@ static void __etm4_cpu_restore(struct etmv4_drvdata *drvdata)
>>
>> static void etm4_cpu_restore(struct etmv4_drvdata *drvdata)
>> {
>> - if (drvdata->trfc)
>> + if (drvdata->trfcr)
>> write_trfcr(drvdata->save_trfcr);
>> if (drvdata->state_needs_restore)
>> __etm4_cpu_restore(drvdata);
>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
>> index 82cba16b73a6..724819592c2e 100644
>> --- a/drivers/hwtracing/coresight/coresight-etm4x.h
>> +++ b/drivers/hwtracing/coresight/coresight-etm4x.h
>> @@ -919,7 +919,8 @@ struct etmv4_save_state {
>> * @nooverflow: Indicate if overflow prevention is supported.
>> * @atbtrig: If the implementation can support ATB triggers
>> * @lpoverride: If the implementation can support low-power state over.
>> - * @trfc: If the implementation supports Arm v8.4 trace filter controls.
>> + * @trfcr: If the CPU supportfs FEAT_TRF, value of the TRFCR_ELx with
>
> Typo here. ^^^^^^ s/supportfs/supports
>
>> + * trace allowed at user and kernel ELs. Otherwise, 0.
>
> The sentence here does not make sense. Is not the exception level ELx and EL0
> can be filtered out independently ? Should this be something like ...
The value holds a superset of the possible "allowed" configurations.
We do this to avoid setting TRFCR_EL2_CX every time depending on the
kernel EL (only possible from EL2). So we initialize the field
with the value of TRFCR_ELx with all the ELs enabled. This can be
filtered later by the driver accordingly. It also serves as a marker
to check the availability of the feature.
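
To illustrate, here is a standalone model of that scheme in plain C. It is
only a sketch, not the driver code: the bit positions are assumed for
illustration, MODE_EXCL_KERN/MODE_EXCL_USER are hypothetical stand-ins for
ETM_MODE_EXCL_{KERN,USER}, detect_trfcr() and session_trfcr() are made-up
helper names, and the timestamp control is left out.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define TRFCR_ELx_E0TRE   (UINT64_C(1) << 0)  /* allow trace at EL0 */
#define TRFCR_ELx_ExTRE   (UINT64_C(1) << 1)  /* allow trace at the kernel EL */
#define TRFCR_EL2_CX      (UINT64_C(1) << 3)  /* only set when running at EL2 */

/* Hypothetical session mode flags, standing in for ETM_MODE_EXCL_{KERN,USER}. */
#define MODE_EXCL_KERN    (1u << 30)
#define MODE_EXCL_USER    (1u << 31)

/*
 * Probe-time value: 0 when FEAT_TRF is absent, otherwise the superset
 * with both ELs allowed (plus CX when the kernel runs at EL2).
 */
static uint64_t detect_trfcr(int has_feat_trf, int kernel_in_hyp)
{
	uint64_t trfcr;

	if (!has_feat_trf)
		return 0;

	trfcr = TRFCR_ELx_E0TRE | TRFCR_ELx_ExTRE;
	if (kernel_in_hyp)
		trfcr |= TRFCR_EL2_CX;
	return trfcr;
}

/* Per-session value: narrow the superset according to the session mode. */
static uint64_t session_trfcr(uint64_t superset, uint32_t mode)
{
	if (!superset)		/* no FEAT_TRF, nothing to program */
		return 0;
	if (mode & MODE_EXCL_KERN)
		superset &= ~TRFCR_ELx_ExTRE;
	if (mode & MODE_EXCL_USER)
		superset &= ~TRFCR_ELx_E0TRE;
	return superset;
}
```

A non-zero superset doubles as the "feature present" marker, so no separate
bool is needed, and the CX decision is taken once at probe time rather than
on every enable.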
Thanks
Suzuki
>
> "If the CPU supports FEAT_TRF, value of the TRFCR_ELx - indicating whether
> trace is allowed at user [and/or] kernel ELs. Otherwise, 0."
>
>> * @config: structure holding configuration parameters.
>> * @save_trfcr: Saved TRFCR_EL1 register during a CPU PM event.
>> * @save_state: State to be preserved across power loss
>> @@ -972,7 +973,7 @@ struct etmv4_drvdata {
>> bool nooverflow;
>> bool atbtrig;
>> bool lpoverride;
>> - bool trfc;
>> + u64 trfcr;
>> struct etmv4_config config;
>> u64 save_trfcr;
>> struct etmv4_save_state *save_state;
>> diff --git a/drivers/hwtracing/coresight/coresight-self-hosted-trace.h b/drivers/hwtracing/coresight/coresight-self-hosted-trace.h
>> index 53b35a28075e..586d26e0cba3 100644
>> --- a/drivers/hwtracing/coresight/coresight-self-hosted-trace.h
>> +++ b/drivers/hwtracing/coresight/coresight-self-hosted-trace.h
>> @@ -22,4 +22,11 @@ static inline void write_trfcr(u64 val)
>> isb();
>> }
>>
>> +static inline void cpu_prohibit_trace(void)
>> +{
>> + u64 trfcr = read_trfcr();
>> +
>> + /* Prohibit tracing at EL0 & the kernel EL */
>> + write_trfcr(trfcr & ~(TRFCR_ELx_ExTRE | TRFCR_ELx_E0TRE));
>> +}
>> #endif /* __CORESIGHT_SELF_HOSTED_TRACE_H */
>>
This patchset introduces initial concepts in CoreSight system
configuration management support, to allow more detailed and complex
programming to be applied to CoreSight systems during trace capture.
Configurations consist of 2 elements:-
1) Features - programming combinations for devices, applied to a class of
device on the system (all ETMv4), or individual devices.
2) Configurations - a set of programmed features used when the named
configuration is selected.
Features and configurations are declared as a data table, a set of register,
resource and parameter requirements. Features and configurations are loaded
into the system by the virtual cs_syscfg device. This then matches features
to any registered devices and loads the feature into them.
Individual device classes that support features and configurations register
with cs_syscfg.
Once loaded a configuration can be enabled for a specific trace run.
Configurations are registered with the perf cs_etm event as entries in
cs_etm/events. These can be selected on the perf command line as follows:-
perf record -e cs_etm/<config_name>/ ...
This patch set has one pre-loaded configuration and feature.
A named "strobing" feature is provided for ETMv4.
A named "autofdo" configuration is provided. This configuration enables
strobing on any ETM in use.
Thus the command:
perf record -e cs_etm/autofdo/ ...
will trace the supplied application while enabling the "autofdo" configuration
on each ETM as it is enabled by perf. This in turn will enable strobing for
the ETM - with default parameters. Parameters can be adjusted using configfs.
The sink used in the trace run will be automatically selected.
A configuration can supply up to 15 sets of preset parameter values, which will
be substituted for the parameter values of any feature used in the configuration.
Preset values are selected as follows:
perf record -e cs_etm/autofdo,preset=1/ ...
(valid presets 1-N, where N is the number supplied in the configuration, not
exceeding 15. preset=0 is the same as not selecting a preset.)
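
The preset numbering rule above can be sketched as a small range check. This
is an illustration only, not code from the series: CS_CFG_MAX_PRESETS,
preset_is_valid() and preset_to_index() are hypothetical names, and the
zero-based table index is an assumption about how presets might be stored.

```c
#include <stdbool.h>

#define CS_CFG_MAX_PRESETS 15	/* hypothetical name for the limit of 15 */

/*
 * Returns true if 'preset' is acceptable for a configuration that
 * supplies 'nr_presets' preset sets. 0 means "no preset selected".
 */
static bool preset_is_valid(int preset, int nr_presets)
{
	if (nr_presets < 0 || nr_presets > CS_CFG_MAX_PRESETS)
		return false;
	return preset >= 0 && preset <= nr_presets;
}

/*
 * Zero-based index into an assumed preset table, or -1 when the
 * defaults apply (preset 0) or the selection is out of range.
 */
static int preset_to_index(int preset, int nr_presets)
{
	if (preset == 0 || !preset_is_valid(preset, nr_presets))
		return -1;
	return preset - 1;
}
```

With a configuration that supplies three presets, preset=1..3 map to table
entries 0..2, preset=0 leaves the defaults in place, and preset=4 is rejected.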
Applies to & tested against coresight/next (5.13-rc6 base)
Changes since v8:
Patch 0003 - altered spinlock to use irq versions. Moved use of config enabled flag.
Patch 0005 - dropped in_enable flag. Use config enabled flag within spinlock guards.
Changes since v7:
Fixed kernel test robot issue - config with CORESIGHT=y & CONFIGFS_FS=m causes
build error. Altered CORESIGHT config to select CONFIGFS_FS.
Reported-by: kernel test robot <lkp(a)intel.com>
Replaced mutex use to protect loaded config lists in coresight devices with per
device spinlock to remove issue when disable called in interrupt context.
Reported-by: Branislav Rankov <branislav.rankov(a)arm.com>
Changes since v6:
Fixed kernel test robot issues-
Reported-by: kernel test robot <lkp(a)intel.com>
Changes since v5:
1) Fix code style issues from auto-build reports, as
Reported-by: kernel test robot <lkp(a)intel.com>
2) Update comments to get consistent docs for API functions.
3) remove unused #define from autofdo example.
4) fix perf code style issues from patch 4 (Mathieu)
5) fix configfs code style issues from patch 9. (Mathieu)
Changes since v4: (based on comments from Mathieu and Suzuki).
No large functional changes - primarily code improvements and naming schema.
1) Updated entire set to ensure a consistent naming scheme was used for
variables and struct members that refer to the key objects in the system.
Suffix _desc is used for all references to feature and configuration descriptors,
suffix _csdev for all references to loaded features and configs in the csdev
instances. (Mathieu & Suzuki).
2) Dropped the 'configurations' sub dir in cs_etm perf directories as superfluous
with the configfs containing the same information. (Mathieu).
3) Simplified perf handling code (suzuki)
4) Multiple simplifications and improvements for code readability (Mathieu
and Suzuki)
Changes since v3: (Primarily based on comments from Mathieu)
1) Locking mechanisms simplified.
2) Removed the possibility to enable features independently from
configurations. Only configurations can be enabled now. Simplifies programming
logic.
3) Configuration now uses an activate->enable mechanism. This means that perf
will activate a selected configuration at the start of a session (during
setup_aux), and disable at the end of a session (around free_aux)
The active configuration and associated features will be programmed into the
CoreSight device instances when they are enabled. This locks the configuration
into the system while in use. Parameters cannot be altered while this is
in place. This mechanism will be extended in future for dynamic load / unload
of configurations to prevent removal while in use.
4) Removed the custom bus / driver as un-necessary. A single device is
registered to own perf fs elements and configfs.
5) Various other minor issues addressed.
Changes since v2:
1) Added documentation file.
2) Altered cs_syscfg driver to no longer be coresight_device based, and moved
to its own custom bus to remove it from the main coresight bus. (Mathieu)
3) Added configfs support to inspect and control loaded configurations and
features. Allows listing of preset values (Yabin Cui)
4) Dropped sysfs support for adjusting feature parameters on the per device
basis, in favour of a single point adjustment in configfs that is pushed to all
device instances.
5) Altered how the config and preset command line options are handled in perf
and the drivers. (Mathieu and Suzuki).
6) Fixes for various issues and technical points (Mathieu, Yabin)
Changes since v1:
1) Moved preloaded configurations and features out of individual drivers.
2) Added cs_syscfg driver to manage configurations and features. Individual
drivers register with cs_syscfg indicating support for config, and provide
matching information that the system uses to load features into the drivers.
This allows individual drivers to be updated on an as needed basis - and
removes the need to consider devices that cannot benefit from configuration -
static replicators, funnels, tpiu.
3) Added perf selection of configurations.
4) Rebased onto the coresight module loading set.
To follow in future revisions / sets:-
a) load of additional config and features by loadable module.
b) load of additional config and features by configfs
c) enhanced resource management for ETMv4 and checking features have sufficient
resources to be enabled.
d) ECT and CTI support for configuration and features.
Mike Leach (10):
coresight: syscfg: Initial coresight system configuration
coresight: syscfg: Add registration and feature loading for cs devices
coresight: config: Add configuration and feature generic functions
coresight: etm-perf: update to handle configuration selection
coresight: syscfg: Add API to activate and enable configurations
coresight: etm-perf: Update to activate selected configuration
coresight: etm4x: Add complex configuration handlers to etmv4
coresight: config: Add preloaded configurations
coresight: syscfg: Add initial configfs support
Documentation: coresight: Add documentation for CoreSight config
.../trace/coresight/coresight-config.rst | 244 +++++
Documentation/trace/coresight/coresight.rst | 16 +
drivers/hwtracing/coresight/Kconfig | 1 +
drivers/hwtracing/coresight/Makefile | 7 +-
.../hwtracing/coresight/coresight-cfg-afdo.c | 153 ++++
.../coresight/coresight-cfg-preload.c | 31 +
.../coresight/coresight-cfg-preload.h | 13 +
.../hwtracing/coresight/coresight-config.c | 272 ++++++
.../hwtracing/coresight/coresight-config.h | 253 ++++++
drivers/hwtracing/coresight/coresight-core.c | 12 +-
.../hwtracing/coresight/coresight-etm-perf.c | 150 +++-
.../hwtracing/coresight/coresight-etm-perf.h | 12 +-
.../hwtracing/coresight/coresight-etm4x-cfg.c | 182 ++++
.../hwtracing/coresight/coresight-etm4x-cfg.h | 30 +
.../coresight/coresight-etm4x-core.c | 38 +-
.../coresight/coresight-etm4x-sysfs.c | 3 +
.../coresight/coresight-syscfg-configfs.c | 396 ++++++++
.../coresight/coresight-syscfg-configfs.h | 45 +
.../hwtracing/coresight/coresight-syscfg.c | 844 ++++++++++++++++++
.../hwtracing/coresight/coresight-syscfg.h | 81 ++
include/linux/coresight.h | 9 +
21 files changed, 2756 insertions(+), 36 deletions(-)
create mode 100644 Documentation/trace/coresight/coresight-config.rst
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-afdo.c
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-preload.c
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-preload.h
create mode 100644 drivers/hwtracing/coresight/coresight-config.c
create mode 100644 drivers/hwtracing/coresight/coresight-config.h
create mode 100644 drivers/hwtracing/coresight/coresight-etm4x-cfg.c
create mode 100644 drivers/hwtracing/coresight/coresight-etm4x-cfg.h
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg-configfs.c
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg-configfs.h
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg.c
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg.h
--
2.17.1
This printf is the only occurrence outside of the tests. When using
OpenCSD with perf in TUI mode, printfs can corrupt the UI and make
perf's own error messages hard to see. The proper way to print would be
pr_warning() or ui__warning() which aren't available here.
I don't see an easy way of making this work, and I don't think this
printf is that useful either, so remove it.
Signed-off-by: James Clark <james.clark(a)arm.com>
---
decoder/source/trc_frame_deformatter.cpp | 1 -
1 file changed, 1 deletion(-)
diff --git a/decoder/source/trc_frame_deformatter.cpp b/decoder/source/trc_frame_deformatter.cpp
index 4d46854..affc324 100644
--- a/decoder/source/trc_frame_deformatter.cpp
+++ b/decoder/source/trc_frame_deformatter.cpp
@@ -436,7 +436,6 @@ int TraceFmtDcdImpl::checkForResetFSyncPatterns()
if (num_fsyncs)
{
- printf("Frame deformatter: Found %d FSYNCS\n",num_fsyncs);
if ((num_fsyncs % 4) == 0)
{
// reset the upstream decoders
--
2.28.0
On 29/07/2021 10:55, Marc Zyngier wrote:
> On Wed, 28 Jul 2021 14:52:17 +0100,
> Suzuki K Poulose <suzuki.poulose(a)arm.com> wrote:
>>
>> Arm Neoverse-N2 (#2067961) and Cortex-A710 (#2054223) suffers
>> from errata, where a TSB (trace synchronization barrier)
>> fails to flush the trace data completely, when executed from
>> a trace prohibited region. In Linux we always execute it
>> after we have moved the PE to trace prohibited region. So,
>> we can apply the workaround everytime a TSB is executed.
>>
>> The work around is to issue two TSB consecutively.
>>
>> NOTE: This errata is defined as LOCAL_CPU_ERRATUM, implying
>> that a late CPU could be blocked from booting if it is the
>> first CPU that requires the workaround. This is because we
>> do not allow setting a cpu_hwcaps after the SMP boot. The
>> other alternative is to use "this_cpu_has_cap()" instead
>> of the faster system wide check, which may be a bit of an
>> overhead, given we may have to do this in nvhe KVM host
>> before a guest entry.
>>
>> Cc: Will Deacon <will(a)kernel.org>
>> Cc: Catalin Marinas <catalin.marinas(a)arm.com>
>> Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
>> Cc: Mike Leach <mike.leach(a)linaro.org>
>> Cc: Mark Rutland <mark.rutland(a)arm.com>
>> Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
>> Cc: Marc Zyngier <maz(a)kernel.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
>> ---
>> Documentation/arm64/silicon-errata.rst | 4 ++++
>> arch/arm64/Kconfig | 31 ++++++++++++++++++++++++++
>> arch/arm64/include/asm/barrier.h | 17 +++++++++++++-
>> arch/arm64/kernel/cpu_errata.c | 19 ++++++++++++++++
>> arch/arm64/tools/cpucaps | 1 +
>> 5 files changed, 71 insertions(+), 1 deletion(-)
>
> [...]
>
>> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
>> index 451e11e5fd23..3bc1ed436e04 100644
>> --- a/arch/arm64/include/asm/barrier.h
>> +++ b/arch/arm64/include/asm/barrier.h
>> @@ -23,7 +23,7 @@
>> #define dsb(opt) asm volatile("dsb " #opt : : : "memory")
>>
>> #define psb_csync() asm volatile("hint #17" : : : "memory")
>> -#define tsb_csync() asm volatile("hint #18" : : : "memory")
>> +#define __tsb_csync() asm volatile("hint #18" : : : "memory")
>> #define csdb() asm volatile("hint #20" : : : "memory")
>>
>> #ifdef CONFIG_ARM64_PSEUDO_NMI
>> @@ -46,6 +46,21 @@
>> #define dma_rmb() dmb(oshld)
>> #define dma_wmb() dmb(oshst)
>>
>> +
>> +#define tsb_csync() \
>> + do { \
>> + /* \
>> + * CPUs affected by Arm Erratum 2054223 or 2067961 needs \
>> + * another TSB to ensure the trace is flushed. \
>> + */ \
>> + if (cpus_have_const_cap(ARM64_WORKAROUND_TSB_FLUSH_FAILURE)) { \
>
> Could this be made a final cap instead? Or do you expect this to be
> usable before caps have been finalised?
Good point. This can be a final cap.
>
>> + __tsb_csync(); \
>> + __tsb_csync(); \
>> + } else { \
>> + __tsb_csync(); \
>> + } \
>
> nit: You could keep one unconditional __tsb_csync().
I thought about that, but I was worried that the CPU might expect them back
to back, without any other instructions in between. Thinking about it a bit
more, that doesn't look like the case. I will confirm this and
change it accordingly.
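
For reference, the shape being suggested would look roughly like the model
below. This is a userspace sketch, not the kernel macro: the counter and the
flag are hypothetical stand-ins for the real "hint #18" and the capability
check, and it assumes the two barriers do not need to be strictly back to
back, which is still to be confirmed.

```c
#include <stdbool.h>

static unsigned int tsb_issued;		/* counts modelled TSB CSYNC instructions */
static bool cpu_needs_tsb_workaround;	/* stands in for the errata capability check */

/* Model of a single TSB CSYNC ("hint #18" on arm64). */
static void model_tsb_csync_once(void)
{
	tsb_issued++;
}

/* One unconditional barrier, plus a second one only on affected CPUs. */
static void model_tsb_csync(void)
{
	model_tsb_csync_once();
	if (cpu_needs_tsb_workaround)
		model_tsb_csync_once();
}
```

Unaffected CPUs issue exactly one barrier and affected CPUs issue two, the
same as the if/else form in the patch, but with a single unconditional call.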
Thanks
Suzuki
>
> Thanks,
>
> M.
>
This series fixes the following issues with the TRBE and Self-Hosted
trace for CoreSight.
The Self-hosted trace filter control registers are now saved and restored
across CPU PM events. More importantly, Trace Filtering is now
applied per ETM session (rather than allowing trace
throughout the lifetime of the system), i.e., the ETM configuration of
the given run is used to enforce trace filtering (TRFCR) along with the
Trace Exclusion controls in TRCVICTLR.
For the TRBE, we were using the TRUNCATED flag in the AUX buffer on
every IRQ to indicate that we may have lost a few bytes of trace. But
this causes the event to be disabled until the userspace re-enables
it back, even when there is space left in the ring buffer. To make
things worse, we were restarting the AUX handle, which would soon
be disabled, potentially creating 0 sized records (without truncation),
which the perf tool tends to ignore. This might cause the event to be
disabled permanently. Also, sometimes we leave the buffer TRUNCATED,
but delay the closing of the handle to event schedule out, which could
cause significant black out in the trace capture. This was reported
by Tamas Zsoldos.
This series removes the use of the TRUNCATED flag on every IRQ. Instead,
it is only used if we really run out of space in the buffer. We also
make sure the "handle" is closed immediately in the TRUNCATED case,
which triggers the userspace to take action. The core perf layer has
been hardened to handle this case where a "handle" is closed out.
Finally, we make sure that the CPU trace is prohibited, when the TRBE
is left disabled. The ETE/ETM driver will program the Trace Filtering
appropriately since we do this dynamically now with the first half
of the series.
Changes since v1 [0]:
- Moved TRFCR related accessors to a new header file
- Following a discussion, dropped the TRUNCATED flag from
the TRBE IRQ handler on WRAP. Instead mark COLLISION.
- Added new patches to harden the ETM perf layer to handle
an error in the sink driver.
- Fix TRBE spurious IRQ handling
- Cleanup TRBE driver to make the "TRUNCATE" cases managed
at a central place.
[0] https://lkml.kernel.org/r/20210712113830.2803257-1-suzuki.poulose@arm.com
Suzuki K Poulose (10):
coresight: etm4x: Save restore TRFCR_EL1
coresight: etm4x: Use Trace Filtering controls dynamically
coresight: etm-pmu: Ensure the AUX handle is valid
coresight: trbe: Ensure the format flag is set on truncation
coresight: trbe: Drop duplicate TRUNCATE flags
coresight: trbe: Fix handling of spurious interrupts
coresight: trbe: Do not truncate buffer on IRQ
coresight: trbe: Unify the enabling sequence
coresight: trbe: End the AUX handle on truncation
coresight: trbe: Prohibit trace before disabling TRBE
.../hwtracing/coresight/coresight-etm-perf.c | 27 ++++-
.../coresight/coresight-etm4x-core.c | 98 ++++++++++++----
drivers/hwtracing/coresight/coresight-etm4x.h | 7 +-
.../coresight/coresight-self-hosted-trace.h | 34 ++++++
drivers/hwtracing/coresight/coresight-trbe.c | 109 ++++++++++--------
5 files changed, 197 insertions(+), 78 deletions(-)
create mode 100644 drivers/hwtracing/coresight/coresight-self-hosted-trace.h
--
2.24.1
This patchset introduces initial concepts in CoreSight system
configuration management support, to allow more detailed and complex
programming to be applied to CoreSight systems during trace capture.
Configurations consist of 2 elements:-
1) Features - programming combinations for devices, applied to a class of
device on the system (all ETMv4), or individual devices.
2) Configurations - a set of programmed features used when the named
configuration is selected.
Features and configurations are declared as a data table, a set of register,
resource and parameter requirements. Features and configurations are loaded
into the system by the virtual cs_syscfg device. This then matches features
to any registered devices and loads the feature into them.
Individual device classes that support features and configurations register
with cs_syscfg.
Once loaded a configuration can be enabled for a specific trace run.
Configurations are registered with the perf cs_etm event as entries in
cs_etm/events. These can be selected on the perf command line as follows:-
perf record -e cs_etm/<config_name>/ ...
This patch set has one pre-loaded configuration and feature.
A named "strobing" feature is provided for ETMv4.
A named "autofdo" configuration is provided. This configuration enables
strobing on any ETM in use.
Thus the command:
perf record -e cs_etm/autofdo/ ...
will trace the supplied application while enabling the "autofdo" configuration
on each ETM as it is enabled by perf. This in turn will enable strobing for
the ETM - with default parameters. Parameters can be adjusted using configfs.
The sink used in the trace run will be automatically selected.
A configuration can supply up to 15 sets of preset parameter values, which will
be substituted for the parameter values of any feature used in the configuration.
Preset values are selected as follows:
perf record -e cs_etm/autofdo,preset=1/ ...
(valid presets 1-N, where N is the number supplied in the configuration, not
exceeding 15. preset=0 is the same as not selecting a preset.)
Applies to & tested against coresight/next (5.13-rc6 base)
Changes since v7:
Fixed kernel test robot issue - config with CORESIGHT=y & CONFIGFS_FS=m causes
build error. Altered CORESIGHT config to select CONFIGFS_FS.
Reported-by: kernel test robot <lkp(a)intel.com>
Replaced mutex use to protect loaded config lists in coresight devices with per
device spinlock to remove issue when disable called in interrupt context.
Reported-by: Branislav Rankov <branislav.rankov(a)arm.com>
Changes since v6:
Fixed kernel test robot issues-
Reported-by: kernel test robot <lkp(a)intel.com>
Changes since v5:
1) Fix code style issues from auto-build reports, as
Reported-by: kernel test robot <lkp(a)intel.com>
2) Update comments to get consistent docs for API functions.
3) remove unused #define from autofdo example.
4) fix perf code style issues from patch 4 (Mathieu)
5) fix configfs code style issues from patch 9. (Mathieu)
Changes since v4: (based on comments from Mathieu and Suzuki).
No large functional changes - primarily code improvements and naming schema.
1) Updated entire set to ensure a consistent naming scheme was used for
variables and struct members that refer to the key objects in the system.
Suffix _desc is used for all references to feature and configuration descriptors,
suffix _csdev for all references to loaded features and configs in the csdev
instances. (Mathieu & Suzuki).
2) Dropped the 'configurations' sub dir in cs_etm perf directories as superfluous
with the configfs containing the same information. (Mathieu).
3) Simplified perf handling code (suzuki)
4) Multiple simplifications and improvements for code readability (Mathieu
and Suzuki)
Changes since v3: (Primarily based on comments from Mathieu)
1) Locking mechanisms simplified.
2) Removed the possibility to enable features independently from
configurations. Only configurations can be enabled now. Simplifies programming
logic.
3) Configuration now uses an activate->enable mechanism. This means that perf
will activate a selected configuration at the start of a session (during
setup_aux), and disable at the end of a session (around free_aux)
The active configuration and associated features will be programmed into the
CoreSight device instances when they are enabled. This locks the configuration
into the system while in use. Parameters cannot be altered while this is
in place. This mechanism will be extended in future for dynamic load / unload
of configurations to prevent removal while in use.
4) Removed the custom bus / driver as un-necessary. A single device is
registered to own perf fs elements and configfs.
5) Various other minor issues addressed.
Changes since v2:
1) Added documentation file.
2) Altered cs_syscfg driver to no longer be coresight_device based, and moved
to its own custom bus to remove it from the main coresight bus. (Mathieu)
3) Added configfs support to inspect and control loaded configurations and
features. Allows listing of preset values (Yabin Cui)
4) Dropped sysfs support for adjusting feature parameters on the per device
basis, in favour of a single point adjustment in configfs that is pushed to all
device instances.
5) Altered how the config and preset command line options are handled in perf
and the drivers. (Mathieu and Suzuki).
6) Fixes for various issues and technical points (Mathieu, Yabin)
Changes since v1:
1) Moved preloaded configurations and features out of individual drivers.
2) Added cs_syscfg driver to manage configurations and features. Individual
drivers register with cs_syscfg indicating support for config, and provide
matching information that the system uses to load features into the drivers.
This allows individual drivers to be updated on an as needed basis - and
removes the need to consider devices that cannot benefit from configuration -
static replicators, funnels, tpiu.
3) Added perf selection of configurations.
4) Rebased onto the coresight module loading set.
To follow in future revisions / sets:-
a) load of additional config and features by loadable module.
b) load of additional config and features by configfs.
c) enhanced resource management for ETMv4 and checking features have sufficient
resources to be enabled.
d) ECT and CTI support for configuration and features.
Mike Leach (10):
coresight: syscfg: Initial coresight system configuration
coresight: syscfg: Add registration and feature loading for cs devices
coresight: config: Add configuration and feature generic functions
coresight: etm-perf: update to handle configuration selection
coresight: syscfg: Add API to activate and enable configurations
coresight: etm-perf: Update to activate selected configuration
coresight: etm4x: Add complex configuration handlers to etmv4
coresight: config: Add preloaded configurations
coresight: syscfg: Add initial configfs support
Documentation: coresight: Add documentation for CoreSight config
.../trace/coresight/coresight-config.rst | 244 ++++++
Documentation/trace/coresight/coresight.rst | 16 +
drivers/hwtracing/coresight/Kconfig | 1 +
drivers/hwtracing/coresight/Makefile | 7 +-
.../hwtracing/coresight/coresight-cfg-afdo.c | 153 ++++
.../coresight/coresight-cfg-preload.c | 31 +
.../coresight/coresight-cfg-preload.h | 13 +
.../hwtracing/coresight/coresight-config.c | 275 ++++++
.../hwtracing/coresight/coresight-config.h | 253 ++++++
drivers/hwtracing/coresight/coresight-core.c | 12 +-
.../hwtracing/coresight/coresight-etm-perf.c | 150 +++-
.../hwtracing/coresight/coresight-etm-perf.h | 12 +-
.../hwtracing/coresight/coresight-etm4x-cfg.c | 182 ++++
.../hwtracing/coresight/coresight-etm4x-cfg.h | 30 +
.../coresight/coresight-etm4x-core.c | 38 +-
.../coresight/coresight-etm4x-sysfs.c | 3 +
.../coresight/coresight-syscfg-configfs.c | 396 +++++++++
.../coresight/coresight-syscfg-configfs.h | 45 +
.../hwtracing/coresight/coresight-syscfg.c | 829 ++++++++++++++++++
.../hwtracing/coresight/coresight-syscfg.h | 81 ++
include/linux/coresight.h | 11 +
21 files changed, 2746 insertions(+), 36 deletions(-)
create mode 100644 Documentation/trace/coresight/coresight-config.rst
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-afdo.c
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-preload.c
create mode 100644 drivers/hwtracing/coresight/coresight-cfg-preload.h
create mode 100644 drivers/hwtracing/coresight/coresight-config.c
create mode 100644 drivers/hwtracing/coresight/coresight-config.h
create mode 100644 drivers/hwtracing/coresight/coresight-etm4x-cfg.c
create mode 100644 drivers/hwtracing/coresight/coresight-etm4x-cfg.h
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg-configfs.c
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg-configfs.h
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg.c
create mode 100644 drivers/hwtracing/coresight/coresight-syscfg.h
--
2.17.1
On Wed, Jul 14, 2021 at 09:40:15AM +0100, Russell King (Oracle) wrote:
> On Tue, Jul 13, 2021 at 07:13:02PM +0100, Catalin Marinas wrote:
> > We could try to clarify E2.2.1 to simply state that naturally aligned
> > LDRD/STRD are single-copy atomic without any subsequent statement on the
> > translation table.
>
> I think that clarification would be most helpful. Thanks.
Thanks for the suggestion and confirmation, Russell & Catalin.
If so, I will implement the weak functions for
compat_auxtrace_mmap__{read_head|write_tail}, and write the arm/arm64
specific functions using LDRD/STRD instructions.
For better patch organization, I will send a separate patch set for
enabling the compat functions (in particular patches 10/11 and 11/11) in
the next spin.
Thanks,
Leo
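A minimal sketch of the weak-function arrangement described above, for
illustration only. The names sketch_aux_page and sketch_compat_read_head
are hypothetical and are not the perf identifiers. The retry loop in the
fallback body is one possible generic implementation and is not taken
from this thread; it assumes the head value only ever increases and that
each 32-bit half is read without tearing. An arm-specific object file
could provide a strong definition of the same symbol built around LDRD,
and the linker would pick it over the weak one.

```c
#include <stdint.h>

/* Hypothetical stand-in for the mmap'ed control page; not the perf struct. */
struct sketch_aux_page {
	volatile uint64_t aux_head;
	volatile uint64_t aux_tail;
};

/*
 * Generic fallback, marked weak so that an architecture-specific
 * definition of the same symbol overrides it at link time.
 *
 * A 32-bit build may split the 64-bit load into two 32-bit loads.
 * The head is sampled three times; if the upper halves of the first
 * and last samples agree, the upper half did not change in between,
 * so the middle sample is consistent (assuming the head only grows).
 */
__attribute__((weak))
uint64_t sketch_compat_read_head(const struct sketch_aux_page *pg)
{
	const uint64_t hi_mask = (uint64_t)UINT32_MAX << 32;
	uint64_t first, second, last;

	do {
		first = pg->aux_head;
		/* Keep the three samples ordered with respect to each other. */
		__atomic_thread_fence(__ATOMIC_ACQUIRE);
		second = pg->aux_head;
		__atomic_thread_fence(__ATOMIC_ACQUIRE);
		last = pg->aux_head;
	} while ((first & hi_mask) != (last & hi_mask));

	return second;
}
```

With no strong definition linked in, callers get the fallback above. The
weak attribute and the __atomic builtin are GCC/Clang extensions.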
Em Tue, Jul 13, 2021 at 05:31:03PM +0000, Hunter, Adrian escreveu:
> > On Mon, Jul 12, 2021 at 03:14:35PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Sun, Jul 11, 2021 at 06:41:04PM +0800, Leo Yan escreveu:
> > > > +++ b/tools/perf/util/env.c
> > > > @@ -11,6 +11,7 @@
> > > > #include <stdlib.h>
> > > > #include <string.h>
> > > > +int kernel_is_64_bit;
> > > > struct perf_env perf_env;
> > > Why can't this be in 'struct perf_env'?
> > Good question. I considered adding it to struct perf_env but in the end went this
> > way; the reason is that the variable "kernel_is_64_bit" is only used during the
> > recording phase for the AUX ring buffer, and is not used for report. So it seems to
> > me overly complex to add a new field, and I wonder whether it's necessary to
> > save this field as a new feature in the perf header.
> I think we store the arch, so if the "kernel_is_64_bit" calculation depends only on arch
> then I guess we don't need a new feature at the moment.
So, I wasn't suggesting to add this info to the perf.data file header,
just to the in-memory 'struct perf_env'.
And also we should avoid unconditionally initializing things that we may
never need, please structure it as:
static void perf_env__init_kernel_mode(struct perf_env *env)
{
	const char *arch = perf_env__raw_arch(env);

	/* Cache the result in the env itself, not in a global. */
	if (!strncmp(arch, "x86_64", 6) || !strncmp(arch, "aarch64", 7) ||
	    !strncmp(arch, "arm64", 5) || !strncmp(arch, "mips64", 6) ||
	    !strncmp(arch, "parisc64", 8) || !strncmp(arch, "riscv64", 7) ||
	    !strncmp(arch, "s390x", 5) || !strncmp(arch, "sparc64", 7))
		env->kernel_is_64_bit = 1;
	else
		env->kernel_is_64_bit = 0;
}

void perf_env__init(struct perf_env *env)
{
	...
	/* -1 means "not determined yet". */
	env->kernel_is_64_bit = -1;
	...
}

bool perf_env__kernel_is_64_bit(struct perf_env *env)
{
	/* Compute on first use only. */
	if (env->kernel_is_64_bit == -1)
		perf_env__init_kernel_mode(env);

	return env->kernel_is_64_bit;
}
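For reference, a self-contained sketch of the same lazy-initialization
pattern that can be compiled outside the perf tree. Everything prefixed
with sketch_ is hypothetical: struct sketch_env stands in for struct
perf_env, and the arch string is passed in directly instead of coming
from perf_env__raw_arch(). The list of 64-bit arch prefixes is the one
used above.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct perf_env; only the fields needed here. */
struct sketch_env {
	const char *arch;      /* e.g. "aarch64" */
	int kernel_is_64_bit;  /* -1: not computed yet, 0: 32-bit, 1: 64-bit */
};

/* Arch name prefixes treated as 64-bit, as in the snippet above. */
static const char *const sketch_64bit_arches[] = {
	"x86_64", "aarch64", "arm64", "mips64",
	"parisc64", "riscv64", "s390x", "sparc64",
};

/* Cheap init: only record that the value has not been computed. */
void sketch_env_init(struct sketch_env *env, const char *arch)
{
	env->arch = arch;
	env->kernel_is_64_bit = -1;
}

/* The actual computation, run at most once per env. */
static void sketch_env_init_kernel_mode(struct sketch_env *env)
{
	size_t i;

	env->kernel_is_64_bit = 0;
	for (i = 0; i < sizeof(sketch_64bit_arches) / sizeof(sketch_64bit_arches[0]); i++) {
		const char *name = sketch_64bit_arches[i];

		/* Prefix match, mirroring the strncmp() calls above. */
		if (!strncmp(env->arch, name, strlen(name))) {
			env->kernel_is_64_bit = 1;
			break;
		}
	}
}

/* Accessor: computes the cached value on first use. */
bool sketch_env_kernel_is_64_bit(struct sketch_env *env)
{
	if (env->kernel_is_64_bit == -1)
		sketch_env_init_kernel_mode(env);

	return env->kernel_is_64_bit;
}
```

Callers only ever go through the accessor, so a tool that never asks the
question never pays for the string comparisons.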
One thing in my TODO is to crack down on the tons of initializations
perf does unconditionally, last time I looked there are lots :-\
- Arnaldo
> > Combining the comment from Adrian in another email, I think it's good to add
> > a new field "compat_mode" in the struct perf_env, and this field will be
> > initialized in builtin-record.c. Currently we don't need to save this value into
> > the perf file, if later we need to use this value for decoding phase, then we
> > can add a new feature item to save "compat_mode"
> > into the perf file's header.
> > If you have any different idea, please let me know. Thanks!
This patchset consists of refactoring to allow the decoder to be
created in advance when the AUX records are iterated over. The
AUX record flags are used to communicate whether the data is
formatted or not which is the reason this refactoring is required.
These changes result in some simplifications, removal of early exit
conditions etc.
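As a rough illustration of how the AUX record flags can carry the
formatted/unformatted status, a small sketch follows. The SKETCH_*
constants are local copies that are assumed to match the
PERF_AUX_FLAG_PMU_FORMAT_TYPE_MASK and PERF_AUX_FLAG_CORESIGHT_FORMAT_*
values in the uapi perf_event.h; the helper name is hypothetical and is
not a function from this patchset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies, assumed to mirror the uapi PERF_AUX_FLAG_* definitions. */
#define SKETCH_AUX_FLAG_PMU_FORMAT_TYPE_MASK        0xff00
#define SKETCH_AUX_FLAG_CORESIGHT_FORMAT_CORESIGHT  0x0000
#define SKETCH_AUX_FLAG_CORESIGHT_FORMAT_RAW        0x0100

/*
 * Return true when the AUX record describes frame-formatted CoreSight
 * data, false when the PMU format field marks it as raw (unformatted).
 * Bits outside the format field (truncated, overwrite, ...) are ignored.
 */
bool sketch_aux_is_formatted(uint64_t aux_flags)
{
	uint64_t fmt = aux_flags & SKETCH_AUX_FLAG_PMU_FORMAT_TYPE_MASK;

	return fmt != SKETCH_AUX_FLAG_CORESIGHT_FORMAT_RAW;
}
```

Because the answer is known per AUX record, a decoder of the right kind
can be created before any trace data is actually decoded.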
A change was also made to --dump-raw-trace code to allow the
formatted/unformatted status to persist and for the decoder to
not be continually deleted and recreated.
The changes apply on top of the previous patchset "[PATCH v7 0/2] perf
cs-etm: Split Coresight decode by aux records".
Changes since v1:
* Change 'decoders_per_cpu' variable name to 'decoders' and add a comment
* Add a warning that piped mode is best effort, suggested by Suzuki
James Clark (6):
perf cs-etm: Refactor initialisation of kernel start address
perf cs-etm: Split setup and timestamp search functions
perf cs-etm: Only setup queues when they are modified
perf cs-etm: Suppress printing when resetting decoder
perf cs-etm: Use existing decoder instead of resetting it
perf cs-etm: Pass unformatted flag to decoder
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 14 +-
tools/perf/util/cs-etm.c | 185 +++++++++---------
2 files changed, 97 insertions(+), 102 deletions(-)
--
2.28.0
A patch release, OpenCSD v1.1.1, has been made to address C-API include file
issues for the ETE decoder, raised by work on perf tools.
Regards
Mike
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK