On 12/11/20 8:31 PM, Mathieu Poirier wrote:
On Fri, Nov 27, 2020 at 10:32:28AM +0000, Suzuki K Poulose wrote:
On 11/10/20 12:45 PM, Anshuman Khandual wrote:
perf handle structure needs to be shared with the TRBE IRQ handler for capturing trace data and restarting the handle. There is a probability of an undefined reference based crash when etm event is being stopped while a TRBE IRQ also getting processed. This happens due the release of perf handle via perf_aux_output_end(). This stops the sinks via the link before releasing the handle, which will ensure that a simultaneous TRBE IRQ could not happen.
Or in other words :
We now have :
update_buffer()
perf_aux_output_end(handle)
... disable_path()
This is problematic due to various reasons :
- The semantics of update_buffer() is not clear. i.e, whether it should leave the "sink" "stopped" or "disabled" or "active"
I'm a little confused by the above as the modes that apply here are CS_MODE_DISABLED and CS_MODE_PERF, so I'll go with those. Let me know if you meant something else.
Sorry, I think it is a bit confusing.
stopped => Sink is in stopped HW state, but the software mode is not changed (i.e, could be PERF or SYSF)
disabled => Sink is in stopped hw state, the software mode is DISABLED
active => Sink is active and flushing trace, with respective mode (PERF vs SYSFS).
So far ->update_buffer() doesn't touch drvdata->mode and as such it is still set to CS_MODE_PERF when the update has completed.
- This breaks the recommended trace collection sequence of "flush" and "stop" from source to the sink for trace collection. i.e, we stop the source now. But don't flush the components from source to sink, rather we stop and flush from the sink. And we flush and stop the path after we have collected the trace data at sink, which is pointless.
The above assesment is correct. Fixing it though has far reaching ramifications that go far beyond the scope of this patch.
For a sink with IRQ handler, if we don't stop the sink with update_buffer(), we could have a situation :
update_buffer()
perf_aux_outpuf_end(handle) # handle is invalid now
-----------------> IRQ -> irq_handler() perf_aux_output_end(handle) # Wrong !
disable_path()
That's the picture of the issue I had in my head when looking at the code - I'm glad we came to the same conclusion.
The sysfs mode is fine, as we defer the trace collection to disable_path().
The proposed patch is still racy, as we could still hit the problem.
So, to avoid all of these situations, I think we should defer the the update_buffer() to sink_ops->disable(), when we have flushed and stopped the all the components upstream and avoid any races with the IRQ handler.
i.e,
source_ops->stop(csdev);
disable_path(handle); // similar to the enable_path
sink_ops->disable(csdev, handle) { /* flush & stop */
/* collect trace */ perf_aux_output_end(handle, size); }
That is one solution. The advantage here is that it takes care of the flusing problem you described above. On the flip side it is moving a lot of code around, something that is better to do in another set.
Another solution is to disable the TRBE IRQ in ->udpate_buffer(). The ETR does the same kind of thing with tmc_flush_and_stop(). I don't know how feasible that is but it would be a simple solution for this set. Properly flushing the pipeline could be done later. I'm fine with either approach.
Agreed. I think this is reasonable forthis set. i.e, leave the hardware disabled. We could do the proper solution above as a separate series, to keep the changes incremental.
Kind regards Suzuki