Re: ETM probe failure since commit 9b6a3f3633a5cc9("coresight: etmv4: Fix CPU power management setup in probe() function")

27 Jul 2020

On 2020-07-24 21:28, Suzuki K Poulose wrote:
...
On 07/24/2020 04:38 PM, Mike Leach wrote:
...
Hi Sai,
On Fri, 24 Jul 2020 at 12:44, Sai Prakash Ranjan
<saiprakash.ranjan@codeaurora.org> wrote:
...
Hi Mike,
On 2020-07-24 16:35, Mike Leach wrote:
...
Hi Sai,
On Fri, 24 Jul 2020 at 08:48, Sai Prakash Ranjan
<saiprakash.ranjan@codeaurora.org> wrote:
...
Hi Mike,
Since commit 9b6a3f3633a5cc9 ("coresight: etmv4: Fix CPU power
management setup in probe() function"), ETM probe fails consistently
like below:
localhost ~ # dmesg | grep -i etm
[    6.460602] coresight-etm4x: probe of 7040000.etm failed with error -16
[    6.524756] coresight etm1: CPU1: ETM v4.2 initialized
[    6.531152] coresight etm2: CPU2: ETM v4.2 initialized
[    6.538495] coresight etm3: CPU3: ETM v4.2 initialized
[    6.545124] coresight etm4: CPU4: ETM v4.2 initialized
[    6.552904] coresight etm5: CPU5: ETM v4.2 initialized
[    6.559714] coresight etm6: CPU6: ETM v4.2 initialized
[    6.569596] coresight etm7: CPU7: ETM v4.2 initialized
localhost ~ #
Most of the time it's ETM0, but I occasionally see ETM1 and other ETMs
failing to probe; some ETM probe failure is always there. I'm using an
SC7180-based platform on a 5.4 kernel which has all the coresight
patches backported.
If I revert that commit, I don't see the issue at all. In case you can
identify something which might be causing this, please let me know. I'm
planning to look into this as well in the meantime.
I'm not seeing any issues - using DB410 + 5.8 kernel.
The patch is a clean-up that fixes a goto which skipped an unlock on
error, rather than any functional change.
Yes, I tested on another platform which is based on SDM845 and there
the issue is not seen.
...
The only differences I can see are:
1) the cpu_pm_register_notifier() call is earlier in the sequence.
   Shouldn't make a difference as cpu_pm and hotplug are different
   systems.
2) the error from cpuhp_setup_state_nocalls_cpuslocked() is no longer
   ignored.
I would suggest that 2) may be the issue on your system - if you are
now seeing an error that was not being processed before?
Yes the error is from cpuhp_setup_state_nocalls_cpuslocked(),
Looking at the code, this (-16 / -EBUSY) seems to come from the
internal cpuhp_store_callbacks() function in cpu.c. This prevents
multiple registrations of callbacks for a given state.
It could be that on your system there is an issue with a race on the
etm4_count variable, allowing two calls to the
cpuhp_setup_state_nocalls_cpuslocked() function, with one hitting the
error.
I would consider protecting this either by mutex or turning it into an
atomic to see if that fixes your problem.
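For illustration only, the suspected race can be modelled in plain
userspace C. This is a sketch under assumptions, not the driver code:
every name below (model_store_callbacks, step_check, step_register,
run_interleaved, run_serialised) is hypothetical, and the single-slot
registry only imitates the "second registration gets -EBUSY" behaviour
described above. The two probers are interleaved by hand so the outcome
is deterministic.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
static int slot_in_use;   /* one callback slot for one hotplug state */
static int probe_count;   /* stands in for the driver's probe counter */

static void model_reset(void)
{
	slot_in_use = 0;
	probe_count = 0;
}

/* Refuses a second registration for the same state. */
static int model_store_callbacks(void)
{
	if (slot_in_use)
		return -EBUSY;
	slot_in_use = 1;
	return 0;
}

/* Step 1 of a probe: decide whether this is the first device. */
static int step_check(void)
{
	return probe_count == 0;
}

/* Step 2 of a probe: register if step 1 said so, then bump the count. */
static int step_register(int saw_zero)
{
	int ret = saw_zero ? model_store_callbacks() : 0;

	if (ret == 0)
		probe_count++;
	return ret;
}

/* Unprotected: both probers run step 1 before either runs step 2. */
int run_interleaved(void)
{
	int a, b, ret_a, ret_b;

	model_reset();
	a = step_check();
	b = step_check();
	ret_a = step_register(a);
	ret_b = step_register(b);
	(void)ret_a;
	return ret_b;	/* the second prober loses */
}

/* Check and register kept together, as a mutex would enforce. */
int run_serialised(void)
{
	int ret_a, ret_b;

	model_reset();
	ret_a = step_register(step_check());
	ret_b = step_register(step_check());
	return ret_a ? ret_a : ret_b;
}
```

In this model run_interleaved() returns -EBUSY for the second prober,
while run_serialised() returns 0 for both, which is the effect a mutex
around the check and the registration would be expected to have.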
That looks quite possible. We rely on etm4x_count to detect whether we
have registered/unregistered the notifiers. We could simply register
the notifiers at driver registration time and leave them registered,
removing them only when we remove the driver. It is fine to execute the
callback in the absence of the ETM on the CPU. It is not a fast path
anyway.
This will also help us to solve the CPU hotplug issues, i.e.,
if a CPU is not brought online during the etm4 driver probe,
we can never enable ETM on that CPU afterwards. You can trigger
this by booting a system with maxcpus=1 and later bringing
the CPUs online manually.
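A rough userspace sketch of that idea, with the caveat that it is only
a model: the callbacks are registered once, independent of any probe,
and the online callback simply does nothing for a CPU that has no ETM
bound yet. All names here (model_etm, model_driver_init, model_probe,
model_cpu_online, MODEL_NR_CPUS) are hypothetical and do not correspond
to functions in the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_NR_CPUS 8   /* assumed CPU count for the model */

/* Hypothetical per-CPU ETM state. */
struct model_etm {
	int cpu;
	int enabled;
};

static struct model_etm *model_etm_by_cpu[MODEL_NR_CPUS];
static int callbacks_registered;

/* Runs once at "driver registration", never from probe. */
int model_driver_init(void)
{
	if (callbacks_registered)
		return -EBUSY;
	callbacks_registered = 1;
	return 0;
}

/* Probe only binds the device to its CPU; no registration here. */
void model_probe(struct model_etm *etm, int cpu)
{
	etm->cpu = cpu;
	etm->enabled = 0;
	model_etm_by_cpu[cpu] = etm;
}

/* Online callback: harmless no-op when the CPU has no ETM bound. */
int model_cpu_online(int cpu)
{
	struct model_etm *etm = model_etm_by_cpu[cpu];

	if (!etm)
		return 0;
	etm->enabled = 1;
	return 0;
}
```

Because registration no longer depends on a probe-time counter, there
is nothing left for concurrent probes to race on in this model, and a
CPU brought online after boot still reaches the callback.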
Yes, I never noticed, but ETM hotplug seems broken.
As for this race, I can now reliably trigger it on other platforms
with the latest kernel if I do async probe, i.e., with
PROBE_PREFER_ASYNCHRONOUS set as probe_type; we also need
"arm,coresight-loses-context-with-cpu".
Thanks,
Sai
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of Code Aurora Forum, hosted by The Linux Foundation
