CTIs are defined in the device tree and associated with other CoreSight
devices. The core CoreSight code has been modified to enable the registration
of the CTI devices on the same bus as the other CoreSight components,
but as these are not actually trace generation / capture devices, they
are not part of the CoreSight path when generating trace.
However, the definition of the standard CoreSight device has been extended
to include a reference to an associated CTI device, and the enable / disable
trace path operations will auto enable/disable any associated CTI devices at
the same time.
Programming is at present via sysfs - a full API is provided to utilise the
hardware capabilities. As CTI devices are unprogrammed by default, the auto
enable described above will have no effect until explicit programming takes
place.
A set of device tree bindings specific to the CTI topology has been defined.
The driver accesses these in a platform agnostic manner, so ACPI bindings
can be added later, once they have been agreed and defined for the CTI device.
Documentation has been updated to describe both the CTI hardware, its use and
programming in sysfs, and the new dts bindings required.
Tested on DB410 board, applies on coresight/next tree - 5.3-rc1 based.
Changes since v3:
* After discussion on CS mailing list, each CTI connection has a triggers<N>
sysfs directory with name and trigger signals listed for the connection.
* Initial code for creating sysfs links between CoreSight components is
introduced and implementation for CTI provided. This allows exploration
of the CoreSight topology within the sysfs infrastructure. Patches for
links between other CoreSight components to follow.
* Power management - CPU hotplug and idle omitted from this set as ongoing
developments may define required direction. Additional patch set to follow.
* Multiple fixes applied as requested by reviewers esp. Mathieu, Suzuki
and Leo.
Changes since v2:
Updates to allow for new features on coresight/next and feedback from
Mathieu and Leo.
1) Rebase and restructuring to apply on top of ACPI support patch set,
currently on coresight/next. of_coresight_cti has been renamed to
coresight-cti-platform and device tree bindings added to this but accessed
in a platform agnostic manner using fwnode for later ACPI support
to be added.
2) Split the sysfs patch into a series of functional patches.
3) Revised the refcount and enabling support.
4) Adopted the generic naming protocol - CTIs are either cti_cpuN or
cti_sysM
5) Various minor presentation / checkpatch issues highlighted in feedback.
6) Revised CPU hotplug to cover missing cases needed by ETM.
Changes since v1:
1) Significant restructuring of the source code. Adds cti-sysfs file and
cti device tree file. Patches add per feature rather than per source
file.
2) CPU type power event handling for hotplug moved to CoreSight core,
with generic registration interface provided for all CPU bound CS devices
to use.
3) CTI signal interconnection details in sysfs now generated dynamically
from connection lists in driver. This fixes an issue with multi-line sysfs
output in the previous version.
4) Full device tree bindings for DB410 and Juno provided (to the extent
that CTI information is available).
5) AMBA driver update for UCI IDs is now upstream so no longer included
in this set.
Mike Leach (15):
coresight: cti: Initial CoreSight CTI Driver
coresight: cti: Add sysfs coresight mgmt reg access.
coresight: cti: Add sysfs access to program function regs
coresight: cti: Add sysfs trigger / channel programming API
dt-bindings: arm: Adds CoreSight CTI hardware definitions.
coresight: cti: Add device tree support for v8 arch CTI
coresight: cti: Add device tree support for custom CTI.
coresight: cti: Enable CTI associated with devices.
coresight: cti: Add connection information to sysfs
coresight: Add generic sysfs link creation functions.
coresight: cti: Add in sysfs links to other coresight devices.
dt-bindings: qcom: Add CTI options for qcom msm8916
dt-bindings: arm: Juno platform - add CTI entries to device tree.
docs: coresight: Update documentation for CoreSight to cover CTI.
docs: sysfs: coresight: Add sysfs documentation for CTI
.../testing/sysfs-bus-coresight-devices-cti | 231 ++++
.../bindings/arm/coresight-ect-cti.txt | 396 ++++++
.../devicetree/bindings/arm/coresight.txt | 7 +
Documentation/trace/coresight-ect.txt | 164 +++
Documentation/trace/coresight.txt | 7 +
MAINTAINERS | 3 +
arch/arm64/boot/dts/arm/juno-base.dtsi | 149 +-
arch/arm64/boot/dts/arm/juno-cs-r1r2.dtsi | 31 +-
arch/arm64/boot/dts/arm/juno-r1.dts | 25 +
arch/arm64/boot/dts/arm/juno-r2.dts | 25 +
arch/arm64/boot/dts/arm/juno.dts | 25 +
arch/arm64/boot/dts/qcom/msm8916.dtsi | 85 +-
drivers/hwtracing/coresight/Kconfig | 12 +
drivers/hwtracing/coresight/Makefile | 4 +
.../coresight/coresight-cti-platform.c | 431 ++++++
.../hwtracing/coresight/coresight-cti-sysfs.c | 1199 +++++++++++++++++
drivers/hwtracing/coresight/coresight-cti.c | 763 +++++++++++
drivers/hwtracing/coresight/coresight-cti.h | 238 ++++
.../hwtracing/coresight/coresight-platform.c | 23 +
drivers/hwtracing/coresight/coresight-priv.h | 10 +
drivers/hwtracing/coresight/coresight.c | 133 +-
include/dt-bindings/arm/coresight-cti-dt.h | 36 +
include/linux/coresight.h | 46 +
23 files changed, 4029 insertions(+), 14 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-cti
create mode 100644 Documentation/devicetree/bindings/arm/coresight-ect-cti.txt
create mode 100644 Documentation/trace/coresight-ect.txt
create mode 100644 drivers/hwtracing/coresight/coresight-cti-platform.c
create mode 100644 drivers/hwtracing/coresight/coresight-cti-sysfs.c
create mode 100644 drivers/hwtracing/coresight/coresight-cti.c
create mode 100644 drivers/hwtracing/coresight/coresight-cti.h
create mode 100644 include/dt-bindings/arm/coresight-cti-dt.h
--
2.17.1
Review of ETMV4 sysfs code resulted in a number of minor issues being
discovered.
Patch set fixes these issues:-
1) Update for ETM v4.4 architecture.
2) Add missing single shot comparator API.
3) Misc fixes and improvements to sysfs API
4) Updated programmers documentation and reference.
Changes since v1 (from reviews by Mathieu and Leo):-
Usability patch split into 2 separate functional patches.
Docs patch split into 3 patches.
Misc style and comment typo fixes.
Mike Leach (11):
coresight: etm4x: Fixes for ETM v4.4 architecture updates.
coresight: etm4x: Fix input validation for sysfs.
coresight: etm4x: Add missing API to set EL match on address filters
coresight: etm4x: Fix issues with start-stop logic.
coresight: etm4x: Improve usability of sysfs - include/exclude addr.
coresight: etm4x: Improve usability of sysfs - CID and VMID masks.
coresight: etm4x: Add view comparator settings API to sysfs.
coresight: etm4x: Add missing single-shot control API to sysfs
coresight: etm4x: docs: Update ABI doc for sysfs features added.
coresight: docs: Create common sub-directory for coresight trace.
coresight: etm4x: docs: Adds detailed document for programming etm4x.
.../testing/sysfs-bus-coresight-devices-etm4x | 183 ++++---
.../{ => coresight}/coresight-cpu-debug.txt | 0
.../coresight/coresight-etm4x-reference.txt | 458 ++++++++++++++++++
.../trace/{ => coresight}/coresight.txt | 0
MAINTAINERS | 3 +-
.../coresight/coresight-etm4x-sysfs.c | 312 +++++++++++-
drivers/hwtracing/coresight/coresight-etm4x.c | 32 +-
drivers/hwtracing/coresight/coresight-etm4x.h | 18 +-
8 files changed, 905 insertions(+), 101 deletions(-)
rename Documentation/trace/{ => coresight}/coresight-cpu-debug.txt (100%)
create mode 100644 Documentation/trace/coresight/coresight-etm4x-reference.txt
rename Documentation/trace/{ => coresight}/coresight.txt (100%)
--
2.17.1
Hi Yabin,
On Tue, 17 Sep 2019 at 15:53, Yabin Cui <yabinc(a)google.com> wrote:
>
> Hi guys,
>
> I am trying to reduce ETM data loss. There seem to be two types of data loss:
> 1. caused by instruction trace buffer overflow, which generates an Overflow packet after recovering.
> 2. caused by ETR buffer overflow, which generates a PERF_RECORD_AUX with PERF_AUX_FLAG_TRUNCATED.
>
> In practice, the second one is unlikely to happen when I set the ETR buffer size to 4M, and can be improved by setting a larger buffer size.
> The first one happens much more frequently: about 21K times when generating 5.3M of ETM data.
> So I want to know if there is any way to reduce that.
>
> I found in the ETM arch manual that the overflow seems to be controlled by TRCSTALLCTLR, a stall control register.
> But TRCIDR3.NOOVERFLOW is not supported on my experiment device, which seems to use a Qualcomm sdm845.
> TRCSTALLCTLR.ISTALL seems to be settable by writing to the mode file in /sys. But I didn't find any way to set
> TRCSTALLCTLR.LEVEL in the Linux kernel. So I wonder if it is ever used.
>
> Do you guys have any suggestions?
In order to get a timely response to your questions I advise you to CC the
coresight mailing list (which I have included). There are a lot of
knowledgeable people there who can also help, especially with
architecture specific configuration.
TRCSTALLCTLR.LEVEL is currently not accessible to users because we
simply never needed to use the feature. If using the ISTALL and LEVEL
parameters helps with your use case, send a patch that exposes the
entire TRCSTALLCTLR register (rather than just the LEVEL field) and
I'll be happy to fold it in.
Thanks,
Mathieu
>
> Best,
> Yabin
>
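As an illustration of the two TRCSTALLCTLR fields discussed in the thread
above, here is a minimal sketch of composing a register value from ISTALL
and LEVEL. The bit positions (LEVEL in bits[3:0], ISTALL in bit[8]) are
assumptions to be checked against the ETMv4 architecture specification for
the implementation in use, and the helper names are hypothetical; this is
not code from the etm4x driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed TRCSTALLCTLR layout; verify against the ETMv4 architecture
 * specification before relying on it.
 */
#define STALLCTLR_LEVEL_MASK	0xFu		/* assumed: LEVEL in bits[3:0] */
#define STALLCTLR_ISTALL	(1u << 8)	/* assumed: ISTALL in bit[8] */

/* Hypothetical helper: build a TRCSTALLCTLR value from ISTALL and LEVEL. */
uint32_t stallctlr_compose(bool istall, unsigned int level)
{
	/* Out-of-range levels are truncated to the field width. */
	uint32_t val = level & STALLCTLR_LEVEL_MASK;

	if (istall)
		val |= STALLCTLR_ISTALL;
	return val;
}

/* Hypothetical helper: extract the LEVEL field from a register value. */
unsigned int stallctlr_level(uint32_t val)
{
	return val & STALLCTLR_LEVEL_MASK;
}
```

Exposing the whole register, as suggested in the reply, would let a user
write such a composed value directly instead of setting each field through
a separate interface.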
This patch series adds support for thread stack and callchain.
Patch 01 refactors the instruction size calculation in preparation for
patch 02.
Patch 02 adds thread stack support; with this patch applied, the option
'-F,+callindent' can be used with the perf script tool. Patch 03 adds a
branch filter so that the perf tool displays only function calls and
returns when the call indentation or callchain related options are
enabled. Patch 04 synthesizes the callchain for instruction samples.
Patch 05 allows instruction samples to be handled synchronously with
the thread stack, which fixes an error in callchain generation.
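To illustrate the idea behind a thread stack, the following is a minimal
sketch, not the code in this series or perf's thread-stack implementation.
It tracks expected return addresses per thread and derives an indentation
depth for each branch. All type and function names are hypothetical.

```c
#include <stddef.h>

#define MAX_DEPTH 64

/* Hypothetical branch classification; perf's own types differ. */
enum branch_kind { BRANCH_CALL, BRANCH_RETURN, BRANCH_OTHER };

struct call_stack {
	unsigned long ret_addr[MAX_DEPTH];	/* expected return addresses */
	size_t depth;				/* number of live frames */
};

/*
 * Update the stack for one branch and return the indentation depth to
 * print for it. A call is printed at the depth it is made from; a
 * matching return is printed at the depth it lands on.
 */
size_t call_stack_update(struct call_stack *cs, enum branch_kind kind,
			 unsigned long ret_addr, unsigned long target)
{
	size_t indent = cs->depth;

	switch (kind) {
	case BRANCH_CALL:
		/* Remember where this call is expected to return to. */
		if (cs->depth < MAX_DEPTH)
			cs->ret_addr[cs->depth++] = ret_addr;
		break;
	case BRANCH_RETURN:
		/* Pop only when the branch lands on the expected address. */
		if (cs->depth > 0 && cs->ret_addr[cs->depth - 1] == target)
			indent = --cs->depth;
		break;
	default:
		break;
	}
	return indent;
}
```

A real implementation also has to cope with trace gaps, exceptions and
returns that do not match the top of the stack; the sketch simply ignores
a mismatched return.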
This patch set has been tested on 96boards Hikey620.
Test for option '-F,+callindent':
Before:
# perf script -F,+callindent
main 2808 1 branches: coresight_test1 ffff8634f5c8 coresight_test1+0x3c (/root/coresight_test/libcstest.so)
main 2808 1 branches: printf@plt aaaaba8d37ec main+0x28 (/root/coresight_test/main)
main 2808 1 branches: printf@plt aaaaba8d36bc printf@plt+0xc (/root/coresight_test/main)
main 2808 1 branches: _init aaaaba8d3650 _init+0x30 (/root/coresight_test/main)
main 2808 1 branches: _dl_fixup ffff86373b4c _dl_runtime_resolve+0x40 (/lib/aarch64-linux-gnu/ld-2.28.so)
main 2808 1 branches: _dl_lookup_symbol_x ffff8636e078 _dl_fixup+0xb8 (/lib/aarch64-linux-gnu/ld-2.28.so)
[...]
After:
# perf script -F,+callindent
main 2808 1 branches: coresight_test1@plt aaaaba8d37d8 main+0x14 (/root/coresight_test/main)
main 2808 1 branches: _dl_fixup ffff86373b4c _dl_runtime_resolve+0x40 (/lib/aarch64-linux-gnu/ld-2.28.s
main 2808 1 branches: _dl_lookup_symbol_x ffff8636e078 _dl_fixup+0xb8 (/lib/aarch64-linux-gnu/ld-2.28.so)
main 2808 1 branches: do_lookup_x ffff8636a49c _dl_lookup_symbol_x+0x104 (/lib/aarch64-linux-gnu/ld-2.28.
main 2808 1 branches: check_match ffff86369bf0 do_lookup_x+0x238 (/lib/aarch64-linux-gnu/ld-2.28.so)
main 2808 1 branches: strcmp ffff86369888 check_match+0x70 (/lib/aarch64-linux-gnu/ld-2.28.so)
main 2808 1 branches: printf@plt aaaaba8d37ec main+0x28 (/root/coresight_test/main)
main 2808 1 branches: _dl_fixup ffff86373b4c _dl_runtime_resolve+0x40 (/lib/aarch64-linux-gnu/ld-2.28.s
main 2808 1 branches: _dl_lookup_symbol_x ffff8636e078 _dl_fixup+0xb8 (/lib/aarch64-linux-gnu/ld-2.28.so)
main 2808 1 branches: do_lookup_x ffff8636a49c _dl_lookup_symbol_x+0x104 (/lib/aarch64-linux-gnu/ld-2.28.
main 2808 1 branches: _dl_name_match_p ffff86369af0 do_lookup_x+0x138 (/lib/aarch64-linux-gnu/ld-2.28.so)
main 2808 1 branches: strcmp ffff8636f7f0 _dl_name_match_p+0x18 (/lib/aarch64-linux-gnu/ld-2.28.so)
[...]
Test for option '--itrace=g':
Before:
# perf script --itrace=g16l64i100
main 1579 100 instructions: ffff0000102137f0 group_sched_in+0xb0 ([kernel.kallsyms])
main 1579 100 instructions: ffff000010213b78 flexible_sched_in+0xf0 ([kernel.kallsyms])
main 1579 100 instructions: ffff0000102135ac event_sched_in.isra.57+0x74 ([kernel.kallsyms])
main 1579 100 instructions: ffff000010219344 perf_swevent_add+0x6c ([kernel.kallsyms])
main 1579 100 instructions: ffff000010214854 perf_event_update_userpage+0x4c ([kernel.kallsyms])
[...]
After:
# perf script --itrace=g16l64i100
main 1579 100 instructions:
ffff000010213b78 flexible_sched_in+0xf0 ([kernel.kallsyms])
ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])
main 1579 100 instructions:
ffff0000102135ac event_sched_in.isra.57+0x74 ([kernel.kallsyms])
ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms])
ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms])
ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])
main 1579 100 instructions:
ffff000010219344 perf_swevent_add+0x6c ([kernel.kallsyms])
ffff0000102135f4 event_sched_in.isra.57+0xbc ([kernel.kallsyms])
ffff0000102137a0 group_sched_in+0x60 ([kernel.kallsyms])
ffff000010213b84 flexible_sched_in+0xfc ([kernel.kallsyms])
ffff00001020c0b4 visit_groups_merge+0x12c ([kernel.kallsyms])
[...]
Changes from v1:
* Added comments for task thread handling (Mathieu).
* Split patch 02 into two patches, one for thread stack support and
the other for callchain support (Mathieu).
* Added a new patch to support branch filter.
Leo Yan (5):
perf cs-etm: Refactor instruction size handling
perf cs-etm: Support thread stack
perf cs-etm: Support branch filter
perf cs-etm: Support callchain for instruction sample
perf cs-etm: Correct callchain for instruction sample
tools/perf/util/cs-etm.c | 141 ++++++++++++++++++++++++++++++++-------
1 file changed, 118 insertions(+), 23 deletions(-)
--
2.17.1
Some hardware will ignore bit TRCPDCR.PU which is used to signal
to hardware that power should not be removed from the trace unit.
Let's mitigate against this by conditionally saving and restoring
the trace unit state when the CPU enters low power states.
This patchset introduces a firmware property named
'arm,coresight-loses-context-with-cpu' - when this is present the
hardware state will be conditionally saved and restored.
A module parameter 'pm_save_enable' is also introduced which can
be configured to override the firmware property. This parameter
also provides a means to save/restore state when external agents
are used.
The hardware state is only ever saved and restored when a coresight
session is present.
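The decision described above can be sketched as follows. This is a
simplified illustration under stated assumptions, not the logic in the
patches: the enum values and the function name are hypothetical, and
"session_active" stands in for whatever check the driver uses to detect
a coresight session.

```c
#include <stdbool.h>

/* Hypothetical names; the PARAM_PM_SAVE_* values in the series may differ. */
enum pm_save_policy {
	PM_SAVE_FIRMWARE,	/* follow the firmware property */
	PM_SAVE_NEVER,		/* user override: never save/restore */
	PM_SAVE_ALWAYS,		/* user override: always save/restore */
};

/*
 * Decide whether trace unit state should be saved before the CPU enters
 * a low power state. The module parameter, when set by the user,
 * overrides the firmware property; in every case nothing is saved
 * unless a session is present.
 */
bool needs_save_restore(enum pm_save_policy param, bool fw_loses_context,
			bool session_active)
{
	bool enabled;

	switch (param) {
	case PM_SAVE_NEVER:
		enabled = false;
		break;
	case PM_SAVE_ALWAYS:
		enabled = true;
		break;
	default:
		/* No user override: defer to the firmware property. */
		enabled = fw_loses_context;
		break;
	}
	return enabled && session_active;
}
```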
Changes since v5:
- Fix indentation, comment style and add implicit fallthrough comment
- Use NOTIFY_* for all return values in pm notifier callback
- Rename PARAM_PM_SAVE_EXTERNAL to PARAM_PM_SAVE_ALWAYS
- Update module parameter description
- Add comment to explain why we keep power on
- Rebased onto coresight/next c165d8947bc4 ("eeprom: Deprecate the legacy eeprom driver")
Changes since v4:
- Rename fwnode property to "arm,coresight-loses-context-with-cpu" as this
doesn't imply a software policy
- Update the device tree binding document to indicate that this property
isn't specific to ETMs - also provide a longer, more generic
description with an example of why it might be used
- Set the module parameter at probe based on the value determined by firmware.
The user can still override the firmware via the kernel command line, this
has the effect of hiding the PARAM_PM_SAVE_FIRMWARE option from the user -
though we still internally use it to allow us to determine if the user has
set the parameter.
- Remove unnecessary call to smp_processor_id
- Move etm4_needs_save_restore helper to coresight.c and rename
- Rebased onto coresight/next a04d8683f577 ("...ity of etm4_os_unlock comment")
- Drop Reviewed-By from Suzuki on "coresight: etm4x: save/restore st..." patch
as content changed too much
- Add module option that keeps clocks/power enabled at probe and saves
state when external or self-hosted is in use.
Changes since v3:
- Only save/restore when self-hosted is being used and detect this
without relying on the coresight registers (which may not be
available)
- Only allocate memory for etmv4_save_state at probe time when
configuration indicates it may be used
- Set pm_save_enable param to 0444 such that it is static after
boot
- Save/restore TRCPDCR
- Add missing comments to struct etm4_drvdata documentation
- Rebased onto coresight/next (8f1f9857)
Changes since v2:
- Move the PM notifier block from drvdata to file static
- Add section names to document references
- Add additional information to commit messages
- Remove trcdvcvr and trcdvcmr from saved state and add a comment to
describe why
- Ensure TRCPDCR_PU is set after restore and add a comment to explain
why we bother toggling TRCPDCR_PU on save/restore
- Reword the pm_save_enable options and add comments
- Miscellaneous style changes
- Move device tree binding documentation to its own patch
Changes since v1:
- Rebased onto coresight/next
- Correctly pass bit number rather than BIT macro to coresight_timeout
- Abort saving state if a timeout occurs
- Fix completely broken pm_notify handling and unregister handler on error
- Use state_needs_restore to ensure state is restored only once
- Add module parameter description to existing boot_enable parameter
and use module_param instead of module_param_named
- Add firmware bindings for coresight-needs-save-restore
- Rename 'disable_pm_save' to 'pm_save_enable' which allows for
disabled, enabled or firmware
- Update comment on etm4_os_lock, it incorrectly indicated that
the code unlocks the trace registers
- Add comments to explain use of OS lock during save/restore
- Fix incorrect error description whilst waiting for PM stable
- Add WARN_ON_ONCE when cpu isn't as expected during save/restore
- Various updates to commit messages
Andrew Murray (3):
coresight: etm4x: save/restore state across CPU low power states
dt-bindings: arm: coresight: Add support for
coresight-loses-context-with-cpu
coresight: etm4x: save/restore state for external agents
.../devicetree/bindings/arm/coresight.txt | 9 +
drivers/hwtracing/coresight/coresight-etm4x.c | 351 +++++++++++++++++-
drivers/hwtracing/coresight/coresight-etm4x.h | 64 ++++
drivers/hwtracing/coresight/coresight.c | 8 +-
include/linux/coresight.h | 13 +
5 files changed, 443 insertions(+), 2 deletions(-)
--
2.21.0
Hello,
I have had some success streaming instruction trace to my Linux PC using RDDI (from ARM Development studio 2019) and a DStreamer ST. However, I am also
having some problems:
1. Failure to stop trace cleanly
I have enabled streaming from an ETM attached to a Cortex-R5 core.
As described in the RDDI documentation, to stop streaming, I call StreamingTrace_Flush() and wait for
RDDI_STREAMING_TRACE_EVENT_TYPE_END_OF_DATA event.
Next I call StreamingTrace_Stop(), which succeeds, and then try StreamingTrace_Detach(), which seems to fail with the following assert.
tpp.c:84: __pthread_tpp_change_priority: Assertion `new_prio == -1 || (new_prio >= fifo_min_prio && new_prio <= fifo_max_prio)' failed.
Aborted (core dumped)
If I try to avoid calling StreamingTrace_{Detach, Stop} then Debug_CloseConn to the core never returns and I get a seg fault.
2. Raw trace does not start with A-SYNC bits
I do see A-SYNC bits later, but my understanding was that the ETM always sends the A-SYNC bits first.
Cheers,
raj
Some hardware will ignore bit TRCPDCR.PU which is used to signal
to hardware that power should not be removed from the trace unit.
Let's mitigate against this by conditionally saving and restoring
the trace unit state when the CPU enters low power states.
This patchset introduces a firmware property named
'arm,coresight-loses-context-with-cpu' - when this is present the
hardware state will be conditionally saved and restored.
A module parameter 'pm_save_enable' is also introduced which can
be configured to override the firmware property. This parameter
also provides a means to save/restore state when external agents
are used.
The hardware state is only ever saved and restored when a coresight
session is present.
The last patch should be considered as an RFC as further consideration is
required relating to where the pm_save_enable parameter lives and to
determine if the external agent support should be a pm_save_enable option
or a new kernel option.
Changes since v4:
- Rename fwnode property to "arm,coresight-loses-context-with-cpu" as this
doesn't imply a software policy
- Update the device tree binding document to indicate that this property
isn't specific to ETMs - also provide a longer, more generic
description with an example of why it might be used
- Set the module parameter at probe based on the value determined by firmware.
The user can still override the firmware via the kernel command line, this
has the effect of hiding the PARAM_PM_SAVE_FIRMWARE option from the user -
though we still internally use it to allow us to determine if the user has
set the parameter.
- Remove unnecessary call to smp_processor_id
- Move etm4_needs_save_restore helper to coresight.c and rename
- Rebased onto coresight/next a04d8683f577 ("...ity of etm4_os_unlock comment")
- Drop Reviewed-By from Suzuki on "coresight: etm4x: save/restore st..." patch
as content changed too much
- Add module option that keeps clocks/power enabled at probe and saves
state when external or self-hosted is in use.
Changes since v3:
- Only save/restore when self-hosted is being used and detect this
without relying on the coresight registers (which may not be
available)
- Only allocate memory for etmv4_save_state at probe time when
configuration indicates it may be used
- Set pm_save_enable param to 0444 such that it is static after
boot
- Save/restore TRCPDCR
- Add missing comments to struct etm4_drvdata documentation
- Rebased onto coresight/next (8f1f9857)
Changes since v2:
- Move the PM notifier block from drvdata to file static
- Add section names to document references
- Add additional information to commit messages
- Remove trcdvcvr and trcdvcmr from saved state and add a comment to
describe why
- Ensure TRCPDCR_PU is set after restore and add a comment to explain
why we bother toggling TRCPDCR_PU on save/restore
- Reword the pm_save_enable options and add comments
- Miscellaneous style changes
- Move device tree binding documentation to its own patch
Changes since v1:
- Rebased onto coresight/next
- Correctly pass bit number rather than BIT macro to coresight_timeout
- Abort saving state if a timeout occurs
- Fix completely broken pm_notify handling and unregister handler on error
- Use state_needs_restore to ensure state is restored only once
- Add module parameter description to existing boot_enable parameter
and use module_param instead of module_param_named
- Add firmware bindings for coresight-needs-save-restore
- Rename 'disable_pm_save' to 'pm_save_enable' which allows for
disabled, enabled or firmware
- Update comment on etm4_os_lock, it incorrectly indicated that
the code unlocks the trace registers
- Add comments to explain use of OS lock during save/restore
- Fix incorrect error description whilst waiting for PM stable
- Add WARN_ON_ONCE when cpu isn't as expected during save/restore
- Various updates to commit messages
Andrew Murray (3):
coresight: etm4x: save/restore state across CPU low power states
dt-bindings: arm: coresight: Add support for
coresight-loses-context-with-cpu
coresight: etm4x: save/restore state for external agents
.../devicetree/bindings/arm/coresight.txt | 9 +
drivers/hwtracing/coresight/coresight-etm4x.c | 339 +++++++++++++++++-
drivers/hwtracing/coresight/coresight-etm4x.h | 64 ++++
drivers/hwtracing/coresight/coresight.c | 8 +-
include/linux/coresight.h | 13 +
5 files changed, 431 insertions(+), 2 deletions(-)
--
2.21.0
Hi,
We have an ARM v8 based SoC that has Marvell's ARM trace implementation,
with the trace formatter disabled,
i.e. FFSR[FtNotPresent] in the ETR register space reports 0b1.
We have a couple of queries related to tracing and decoding with this hardware configuration.
1. Does the openCSD tool support decoding of trace captured from hardware that has
the trace formatter disabled?
From the openCSD README,
"The library will decode formatted trace in three stages:
Frame Deformatting : Removal CoreSight frame formatting from individual trace streams.
Packet Processing : Separate individual trace streams into discrete packets.
Packet Decode : Convert the packets into fully decoded trace describing the program flow on a core."
So, is it possible for openCSD to carry out decoding while bypassing the first stage mentioned above?
2. As part of managing the trace buffer wrap condition, the Linux CoreSight ETR driver
writes a barrier packet at the beginning of the buffer, which is essentially a frame synchronization packet.
This gives the impression that the ETR driver assumes the trace formatter is implemented in hardware.
Do we not need a fix here to accommodate hardware configurations that don't support the trace formatter as well?
Thanks.
Linu Cherian
The arm and arm64 architectures reserve some memory regions prior to the
symbol '_stext'; these regions are later used by device modules and the
BPF JIT. The current code fails to consider these memory regions, so any
address in them is treated as user space mode. perf then cannot find the
corresponding dso with the wrong CPU mode, so we fail to generate samples
for device module and BPF related trace data.
This patch parses the linker scripts to get the memory size prior to the
start address and subtracts this size from 'machine->kernel_start', giving
a fixed-up kernel start address that covers the memory regions for device
modules and BPF. Finally, machine__get_kernel_start() reflects the more
complete kernel memory regions and perf can successfully generate
samples.
The reason for parsing the linker scripts is that the Arm architecture
changes the text offset depending on the platform, with multiple text
offsets defined in $kernel/arch/arm/Makefile. The offset is decided when
building the kernel and the final value is expanded in the linker script,
so we can extract the value used from the linker script. We parse the
arm64 linker script in the same way. If the linker script cannot be
found, the pre-start memory size is assumed to be zero, in which case
this patch changes nothing.
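The effect of the fix-up can be shown with a small worked example. This is
a sketch, not the patch itself: the helper names are hypothetical, the
'_stext' value 0xffff000010080000 is an assumption obtained by evaluating
the arm64 linker script expression quoted in the Makefile change below for
a 48-bit VA, and the comparison "addr >= kernel_start" is assumed to be how
an address is classified as kernel space.

```c
#include <stdbool.h>
#include <stdint.h>

/* 0x08000000 + 0x08000000 + 0x00080000, as in the arm64 Makefile comment */
#define EXAMPLE_PRE_START_SIZE	0x10080000ULL

/* Hypothetical helper: move kernel_start back to cover the reserved regions. */
uint64_t fixup_kernel_start(uint64_t stext, uint64_t pre_start_size)
{
	return stext - pre_start_size;
}

/* Hypothetical helper: an address at or above kernel_start is kernel space. */
bool is_kernel_addr(uint64_t addr, uint64_t kernel_start)
{
	return addr >= kernel_start;
}
```

With these assumptions, the eBPF program address 0xffff00000008a0d0 used in
the test steps below is below the unfixed kernel start 0xffff000010080000,
so it is treated as user space, while it is above the fixed-up start
0xffff000000000000 and is treated as kernel space.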
Below is detailed info for testing this patch:
- Install or build LLVM/Clang;
- Configure perf with ~/.perfconfig:
root@debian:~# cat ~/.perfconfig
# this file is auto-generated.
[llvm]
clang-path = /mnt/build/llvm-build/build/install/bin/clang
kbuild-dir = /mnt/linux-kernel/linux-cs-dev/
clang-opt = "-g"
dump-obj = true
[trace]
show_zeros = yes
show_duration = no
no_inherit = yes
show_timestamp = no
show_arg_names = no
args_alignment = 40
show_prefix = yes
- Run 'perf trace' command with eBPF event:
root@debian:~# perf trace -e string \
-e $kernel/tools/perf/examples/bpf/augmented_raw_syscalls.c
- Read eBPF program memory mapping in kernel:
root@debian:~# echo 1 > /proc/sys/net/core/bpf_jit_kallsyms
root@debian:~# cat /proc/kallsyms | grep -E "bpf_prog_.+_sys_(enter|exit)"
ffff00000008a0d0 t bpf_prog_e470211b846088d5_sys_enter [bpf]
ffff00000008c6a4 t bpf_prog_29c7ae234d79bd5c_sys_exit [bpf]
- Launch any program that accesses the file system frequently so that it
hits the system call trace flow with the eBPF event;
- Capture CoreSight trace data, filtering on the eBPF program:
root@debian:~# perf record -e cs_etm/@tmc_etr0/ \
--filter 'filter 0xffff00000008a0d0/0x800' -a sleep 5s
- Decode the eBPF program symbol 'bpf_prog_f173133dc38ccf87_sys_enter':
root@debian:~# perf script -F,ip,sym
Frame deformatter: Found 4 FSYNCS
0 [unknown]
ffff00000008a1ac bpf_prog_e470211b846088d5_sys_enter
ffff00000008a250 bpf_prog_e470211b846088d5_sys_enter
0 [unknown]
ffff00000008a124 bpf_prog_e470211b846088d5_sys_enter
0 [unknown]
ffff00000008a14c bpf_prog_e470211b846088d5_sys_enter
ffff00000008a13c bpf_prog_e470211b846088d5_sys_enter
ffff00000008a14c bpf_prog_e470211b846088d5_sys_enter
0 [unknown]
ffff00000008a180 bpf_prog_e470211b846088d5_sys_enter
0 [unknown]
ffff00000008a1ac bpf_prog_e470211b846088d5_sys_enter
ffff00000008a190 bpf_prog_e470211b846088d5_sys_enter
ffff00000008a1ac bpf_prog_e470211b846088d5_sys_enter
ffff00000008a250 bpf_prog_e470211b846088d5_sys_enter
0 [unknown]
ffff00000008a124 bpf_prog_e470211b846088d5_sys_enter
0 [unknown]
ffff00000008a14c bpf_prog_e470211b846088d5_sys_enter
0 [unknown]
ffff00000008a180 bpf_prog_e470211b846088d5_sys_enter
[...]
Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Suzuki Poulose <suzuki.poulose(a)arm.com>
Cc: Adrian Hunter <adrian.hunter(a)intel.com>
Cc: coresight(a)lists.linaro.org
Cc: linux-arm-kernel(a)lists.infradead.org
Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
---
tools/perf/Makefile.config | 22 ++++++++++++++++++++++
tools/perf/util/machine.c | 15 ++++++++++++++-
2 files changed, 36 insertions(+), 1 deletion(-)
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index e4988f49ea79..d7ff839d8b20 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -48,9 +48,20 @@ ifeq ($(SRCARCH),x86)
NO_PERF_REGS := 0
endif
+ARM_PRE_START_SIZE := 0
+
ifeq ($(SRCARCH),arm)
NO_PERF_REGS := 0
LIBUNWIND_LIBS = -lunwind -lunwind-arm
+ ifneq ($(wildcard $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds),)
+ # Extract info from lds:
+ # . = ((0xC0000000)) + 0x00208000;
+ # ARM_PRE_START_SIZE := 0x00208000
+ ARM_PRE_START_SIZE := $(shell egrep ' \. \= \({2}0x[0-9a-fA-F]+\){2}' \
+ $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds | \
+ sed -e 's/[(|)|.|=|+|<|;|-]//g' -e 's/ \+/ /g' -e 's/^[ \t]*//' | \
+ awk -F' ' '{printf "0x%x", $$2}' 2>/dev/null)
+ endif
endif
ifeq ($(SRCARCH),arm64)
@@ -58,8 +69,19 @@ ifeq ($(SRCARCH),arm64)
NO_SYSCALL_TABLE := 0
CFLAGS += -I$(OUTPUT)arch/arm64/include/generated
LIBUNWIND_LIBS = -lunwind -lunwind-aarch64
+ ifneq ($(wildcard $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds),)
+ # Extract info from lds:
+ # . = ((((((((0xffffffffffffffff)) - (((1)) << (48)) + 1) + (0)) + (0x08000000))) + (0x08000000))) + 0x00080000;
+ # ARM_PRE_START_SIZE := (0x08000000 + 0x08000000 + 0x00080000) = 0x10080000
+ ARM_PRE_START_SIZE := $(shell egrep ' \. \= \({8}0x[0-9a-fA-F]+\){2}' \
+ $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds | \
+ sed -e 's/[(|)|.|=|+|<|;|-]//g' -e 's/ \+/ /g' -e 's/^[ \t]*//' | \
+ awk -F' ' '{printf "0x%x", $$6+$$7+$$8}' 2>/dev/null)
+ endif
endif
+CFLAGS += -DARM_PRE_START_SIZE=$(ARM_PRE_START_SIZE)
+
ifeq ($(SRCARCH),csky)
NO_PERF_REGS := 0
endif
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index f6ee7fbad3e4..e993f891bb82 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2687,13 +2687,26 @@ int machine__get_kernel_start(struct machine *machine)
machine->kernel_start = 1ULL << 63;
if (map) {
err = map__load(map);
+ if (err)
+ return err;
+
/*
* On x86_64, PTI entry trampolines are less than the
* start of kernel text, but still above 2^63. So leave
* kernel_start = 1ULL << 63 for x86_64.
*/
- if (!err && !machine__is(machine, "x86_64"))
+ if (!machine__is(machine, "x86_64"))
machine->kernel_start = map->start;
+
+ /*
+ * On arm/arm64, the kernel uses some memory regions which are
+ * prior to '_stext' symbol; to reflect the complete kernel
+ * address space, compensate these pre-defined regions for
+ * kernel start address.
+ */
+ if (!strcmp(perf_env__arch(machine->env), "arm") ||
+ !strcmp(perf_env__arch(machine->env), "arm64"))
+ machine->kernel_start -= ARM_PRE_START_SIZE;
}
return err;
}
--
2.17.1