Hi Leo
On Wed, 12 Nov 2025 at 11:19, Leo Yan leo.yan@arm.com wrote:
Hi Mike,
On Wed, Nov 12, 2025 at 09:16:46AM +0000, Mike Leach wrote:
Hi Leo,
The save/restore sequence is only used if the ETM is active at the time of power down.
However, by removing the specific save/restore structure, and using the existing drvdata->config structure, the actual state can be changed by sysfs from another PE, as the sysfs accesses only write to the drvdata->config structure and not the actual hardware. Thus this change permits the corruption of the saved state, which will result in restore != saved..
While using the same code is a good idea avoiding lots of duplication, it may be necessary to have a copy of the drvdata->config passed in to the relevant functions.
We have considered in the past if it might be worthwhile isolating the sysfs and perf config settings in the etm - it may be worth re-visiting this again with a save restore copy in mind to.
Yes, Levi reminded me he already has some patches [1] to address the issue. Basically, the series returns -EBUSY if trying to write sysfs knobs but the device has been enabled.
This does solve the problem and as such should become part of this set rather than have it as a separate set, or must precede this set so no potential issue is created using this set.
Actually, we have another solution (credits to Levi who has worked on it with some internal discussion). Let me elaberate at here:
Similiar to you suggested "isolating sysfs and perf config setting", but isolate in a different way:
User's config: it is a set parameters stored for setting from user space. It can come from sysfs knobs or from cscfs parameters.
Hardware's config: it is a set parameters for setting hardware registers (we can simply use drvdata->config for this purpose).
As a result, every time when a session is enabled (it can be a sysfs session or perf session), we apply user's config for updating hardware's config (e.g., in etm4_enable_sysfs() / etm4_enable_perf()), afterwards, when invoke etm4_disable_hw() / etm4_enable_hw(), we only use hardware's config for register setting.
Has this set been sent to the coresight mailing list - I do not believe I have seen it?
This can naturally resolve the mismatched issue you mentioned in CPU PM save and restore, and this can benefit us for later consolidating cscfg parameters.
What consolidation of cscfg parameters? Not seen anything on this either?
I prefer we can use a separate patch series to address this issue (I should review Levi's patch series so can make things proceed). At the meantime, I would say this patch is still valid.
As a side topic, actually we have found issues for the config is reset when start a session so the user's config does not take affect at all (see etm4_set_default_config()).
This code is used by the perf session - not the sysfs session. This is deliberate to start from a stable state and add any configuration from the perf command line. The perf session deliberately ignores parameters set by sysfs - in fact the first act in parsing the perf event structure in etm4_parse_event_config(), is to completely zero out the entire config structure. Then the default config is set - which guarantees trace is captured, then any perf command line options are set in the registers (e.g. timestamps, address filtering etc). After this if we have s cscfg configuration active, this too is added in.
This ensures that the configuration run on all the ETMs used during the perf session is identical - which cannot be guaranteed if we used sysfs set parameters as part of the perf session, as each ETM might have different parameters set by sysfs.This gives us predictable, stable and repeatable perf configurations.
For sysfs sessions the config for a given etm is whatever has been set using sysfs + any active cscfg config settings.Here it is adbantageous to use a cscfg as this is applied to all etms being enabled by sysfs - without the need to use the syfs files for each individual etm.
Regards
Mike.
[1] https://lore.kernel.org/linux-arm-kernel/20241221165934.1161856-2-yeoreum.yu...
On Tue, 11 Nov 2025 at 18:59, Leo Yan leo.yan@arm.com wrote:
The etm4_enable_trace_unit() function is almost identical to the restore flow, with a few differences listed below:
TRCQCTLR register
The TRCQCTLR register is saved and restored during CPU idle, but it is never touched in the normal flow. Given the Q element is not enabled (TRCCONFIGR.QE bits), it is acceptable to omit saving and restoring this register during idle.
TRCSSCSRn.STATUS bit
The restore flow does not explicitly clear the TRCSSCSRn.STATUS bit but instead directly loads the saved value. In contrast, the normal flow clears this bit to 0 when re-enabling single-shot control.
Therefore, the restore function passes the restart_ss argument as false to etm4_enable_hw() to avoid re-enabling single-shot mode.
Claim Tag Handling
The claim tag is acquired and released in normal flow, on the other hand, the claim protocol is not applied in CPU idle flow - it simply restores the saved value.
The claim bits serve two purpose:
Exclusive access between the kernel driver and an external debugger. During CPU idle, the kernel driver locks the OS Lock, ensuring that the external debugger cannot access the trace unit. Therefore, it is safe to release the claim tag during idle.
Notification to supervisory software to save/restore context for external debuggers. The kernel driver does not touch the external debugger's claim bit (ETMCLAIM[0]).
Based on this, it is safe to acquire and release claim tag in the idle sequence.
OS Lock Behavior
The OS Lock should be locked during CPU idle states. This differs from the normal flow, which unlock it. This special handling remains in the CPU save path.
This commit reuses the normal enable and disable logic in the CPU idle path. The common disable logic is moved into a new function __etm4_disable_hw(), which omits the etm4_cs_{unlock|lock}() pairs and is convenient to call from the save callback.
Why omit cs lock/unlock? Not unlocking the software lock prevents the claim tag registers from being written, which contradicts 3) above, and prevents other writes during the disable sequence.
Sorry for confusion.
I will update the commit log:
Split the hardware disable sequence into a new function __etm4_disable_hw(), which performs the common disable logic without the CoreSight lock.
__etm4_cpu_save() has its own CoreSight lock handling, it calls __etm4_disable_hw() so without duplicate lock issues.
Thanks, Leo