On 16/05/2025 5:07 pm, Leo Yan wrote:
Besides managing tracers (ETM) in the CPU PM and hotplug flows, the CoreSight framework has been found to have the issues below:
Firstly, on some hardware platforms, CoreSight links (e.g., funnels and replicators) reside in a cluster power domain. If the cluster is powered off, the link components also lose their context. In this case, the Arm CoreSight drivers report errors when they detect unpaired self-hosted claim tags.
Secondly, if a path has been activated from a per-CPU tracer (ETM) to links and a sink in sysfs mode, then when the CPU is hot-plugged off, only the associated ETM is disabled. Afterwards, the links and the sink stay enabled and never get a chance to be disabled.
The last issue was reported by Yabin Cui (Google): the TRBE driver fails to save and restore context during CPU low power states. As a result, it may cause hardware lockups on some devices.
To resolve the power management issues, this series extends CPU power management to cover the entire activated path, including links and sinks. It moves CPU PM and hotplug notifiers from the ETMv4 driver to the CoreSight core layer. The core layer has sufficient info to maintain activated paths and can traverse the entire path to manipulate CoreSight modules accordingly.
Patch 01 fixes a bug in the ETMv4 save and restore callbacks.
Patches 02 ~ 06 move the CPU PM code from the ETMv4 driver to the core layer, and extend it to maintain activated paths and control links.
Patches 07 and 08 add context save and restore for the per-CPU sink (TRBE). Note that, to avoid long latency, system-level sinks in an activated path are not touched during CPU suspend and resume.
Patches 09 ~ 11 move the CPU hotplug notifier from the ETMv4 driver to the core layer. The entire path is controlled when the corresponding CPU is hot-plugged on or off.
This series has been verified on Hikey960 for CPUIdle and hotplug, and tested on FVP to verify TRBE with idle states.
Hi Leo,
I ran this stress test on Juno by hotplugging a CPU and enabling and disabling its ETM source concurrently with no sleeps:
# echo 1 > /sys/bus/coresight/devices/tmc_etr0/enable_sink
# while true; do \
    echo 0 > /sys/devices/system/cpu/cpu2/online; \
    echo 1 > /sys/devices/system/cpu/cpu2/online; \
  done &
# while true; do \
    echo 1 > /sys/bus/coresight/devices/etm2/enable_source; \
    echo 0 > /sys/bus/coresight/devices/etm2/enable_source; \
  done &
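For anyone who wants to reproduce this with a bounded run, the two loops above can be wrapped in a small script. This is only a minimal sketch, assuming the same sysfs paths as in the commands above. The function names (`toggle`, `run_stress`), the sysfs root argument, the iteration count, and the final cleanup order are illustrative assumptions, not part of the original test.

```shell
#!/bin/sh
# Sketch of a bounded variant of the stress test above.
# toggle/run_stress and their arguments are hypothetical helpers.

# toggle FILE FIRST SECOND COUNT
# Write FIRST then SECOND to FILE, COUNT times, with no sleeps.
# Errors are silenced because the ETM source may be unavailable
# while its CPU is offline.
toggle() {
    file=$1 first=$2 second=$3 count=$4
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "$first" 2>/dev/null > "$file"
        echo "$second" 2>/dev/null > "$file"
        i=$((i + 1))
    done
}

# run_stress [SYSFS_ROOT] [CPU] [ITERATIONS]
# SYSFS_ROOT defaults to /sys; overriding it is only meant for dry runs
# against a scratch directory.
run_stress() {
    root=${1:-/sys}
    cpu=${2:-2}
    iterations=${3:-1000}

    sink="$root/bus/coresight/devices/tmc_etr0/enable_sink"
    etm_source="$root/bus/coresight/devices/etm$cpu/enable_source"
    online="$root/devices/system/cpu/cpu$cpu/online"

    echo 1 > "$sink"

    # Run hotplug and source enable/disable concurrently.
    toggle "$online" 0 1 "$iterations" &
    hotplug_pid=$!
    toggle "$etm_source" 1 0 "$iterations" &
    source_pid=$!
    wait "$hotplug_pid" "$source_pid"

    # Assumed end state: CPU online, source and sink disabled.
    echo 1 > "$online"
    echo 0 > "$etm_source"
    echo 0 > "$sink"
}
```

Calling `run_stress` with no arguments (as root) would use /sys, cpu2 and 1000 iterations; `run_stress /tmp/fake-sysfs 2 5` exercises the same logic against a scratch directory without touching hardware.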
I get lots of these:
coresight tmc_etr0: timeout while waiting for TMC to be Ready
coresight tmc_etr0: Failed to enable : TMC not ready
And then, even after disabling the source and sink, the Perf tests don't pass anymore:
# perf test -F -vvv "arm coresight trace"
--- start ---
Recording trace (only user mode) with path: CPU0 => tmc_etf0
Looking at perf.data file for dumping branch samples:
CoreSight path testing (CPU0 -> tmc_etf0): FAIL
I suppose it's possible this isn't entirely related to your changes, and I know we've seen some of those timeout issues before. But it's probably worth investigating.
Thanks
James