Hi Mike,
On 2020-07-24 21:08, Mike Leach wrote:
Hi Sai,
On Fri, 24 Jul 2020 at 12:44, Sai Prakash Ranjan saiprakash.ranjan@codeaurora.org wrote:
Hi Mike,
On 2020-07-24 16:35, Mike Leach wrote:
Hi Sai,
On Fri, 24 Jul 2020 at 08:48, Sai Prakash Ranjan saiprakash.ranjan@codeaurora.org wrote:
Hi Mike,
Since commit 9b6a3f3633a5cc9 ("coresight: etmv4: Fix CPU power management setup in probe() function"), ETM probe fails consistently, as shown below:
localhost ~ # dmesg | grep -i etm
[ 6.460602] coresight-etm4x: probe of 7040000.etm failed with error -16
[ 6.524756] coresight etm1: CPU1: ETM v4.2 initialized
[ 6.531152] coresight etm2: CPU2: ETM v4.2 initialized
[ 6.538495] coresight etm3: CPU3: ETM v4.2 initialized
[ 6.545124] coresight etm4: CPU4: ETM v4.2 initialized
[ 6.552904] coresight etm5: CPU5: ETM v4.2 initialized
[ 6.559714] coresight etm6: CPU6: ETM v4.2 initialized
[ 6.569596] coresight etm7: CPU7: ETM v4.2 initialized
localhost ~ #
Most of the time it's ETM0 that fails, but I occasionally see ETM1 and other ETMs failing to probe instead; some ETM probe failure is always there. I'm using an SC7180-based platform on a 5.4 kernel which has all the coresight patches backported.
If I revert that commit, I don't see the issue at all. In case you can identify something which might be causing this, please let me know. I'm planning to look into this as well in the meantime.
I'm not seeing any issues - using DB410 + 5.8 kernel.
The patch is a clean-up and fixes an issue where the goto skipped an unlock on error; it is not meant as a functional change.
Yes, I tested on another platform which is based on SDM845 and there the issue is not seen.
The only differences I can see are:-
1) the cpu_pm_register_notifier() call is earlier in the sequence. Shouldn't make a difference as cpu_pm and hotplug are different systems.
2) the error from cpuhp_setup_state_nocalls_cpuslocked() is no longer ignored.
I would suggest that 2) may be the issue on your system - if you are now seeing an error that was not being processed before?
Yes, the error is from cpuhp_setup_state_nocalls_cpuslocked().
Looking at the code, this error (-16 / -EBUSY) seems to come from the internal cpuhp_store_callbacks() function in cpu.c, which prevents multiple registrations of callbacks for a given state. It could be that on your system there is a race on the etm4_count variable, allowing two calls to cpuhp_setup_state_nocalls_cpuslocked(), with one hitting the error. I would consider protecting this either with a mutex or by turning it into an atomic to see if that fixes your problem.
Yes, converting etm4_count to an atomic variable works and I can't trigger the issue anymore.
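For reference, the idea can be sketched roughly as below. This is a userspace C11 illustration only, not the driver change itself: probe_count stands in for etm4_count, and register_hotplug_state() and probe_one() are hypothetical names that mimic the "second registration returns -EBUSY" behaviour described above. Kernel code would use the kernel's own atomic_t helpers rather than <stdatomic.h>.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the hotplug core: a state may only be
 * registered once; a second attempt is rejected with -EBUSY. */
static bool state_registered;

static int register_hotplug_state(void)
{
    if (state_registered)
        return -EBUSY;
    state_registered = true;
    return 0;
}

/* Stand-in for etm4_count. As a plain int, "if (!count++)" is a
 * non-atomic read-modify-write, so two concurrent probes could both
 * observe 0 and both try to register. */
static atomic_int probe_count;

/* Hypothetical per-device probe step. */
static int probe_one(void)
{
    /* atomic_fetch_add() returns the value held before the increment,
     * so exactly one caller sees 0 and performs the registration. */
    if (atomic_fetch_add(&probe_count, 1) == 0)
        return register_hotplug_state();

    return 0;
}
```

With the atomic counter, any number of probe_one() calls result in a single registration, so none of them should see -EBUSY from the registration step.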
Thanks, Sai