Hi Mathieu,
On 16/04/2019 20:37, Mathieu Poirier wrote:
Hi Robert,
On Thu, 11 Apr 2019 at 12:52, Robert Walker <robert.walker@arm.com> wrote:
Hi Mathieu,
On 25/03/2019 21:56, Mathieu Poirier wrote:
This is the second revision of a patchset that adds support for CPU-wide trace scenarios and as such, it is now possible to issue the following commands:
# perf record -e cs_etm/@20070000.etr/ -C 2,3 $COMMAND
# perf record -e cs_etm/@20070000.etr/ -a $COMMAND
The solution is designed to work for both 1:1 and N:1 source/sink topologies, though the former hasn't been tested for lack of access to HW.
Most of the changes revolve around allowing more than one event to use a sink when operated from perf. More specifically, the first event to use a sink switches it on, while the last one is tasked with aggregating traces and switching off the device.
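The first-on/last-off behaviour described above is essentially reference counting. Here is a minimal sketch of the idea, assuming a hypothetical `struct sink_model` with helpers `sink_get()` and `sink_put()`; these names are illustrative only and are not taken from the patchset or the CoreSight drivers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a shared sink; not the driver's real structure. */
struct sink_model {
	int refcnt;      /* number of perf events currently using the sink */
	bool hw_enabled; /* stands in for the sink hardware being switched on */
	int flushes;     /* times trace was aggregated before switching off */
};

/* An event starts using the sink: only the first user switches it on. */
static void sink_get(struct sink_model *s)
{
	if (s->refcnt++ == 0)
		s->hw_enabled = true;
}

/*
 * An event stops using the sink. Returns true if the caller was the last
 * user and therefore aggregated the trace and switched the sink off.
 */
static bool sink_put(struct sink_model *s)
{
	assert(s->refcnt > 0);
	if (--s->refcnt > 0)
		return false;

	s->flushes++;          /* placeholder for aggregating the trace */
	s->hw_enabled = false; /* last user switches the device off */
	return true;
}
```

This model ignores concurrency; real code with events running on several CPUs would need locking or atomic operations around the count.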
This is the kernel part of the solution, with the user space portion to be released in a later set. All patches (user and kernel) have been rebased on v5.1-rc2 and are hosted here[1]. Everything has been tested on Juno and 410c dragonboard platforms.
Regards, Mathieu
[1]. https://git.linaro.org/people/mathieu.poirier/coresight.git (5.1-rc2-cpu-wide-v2)
I've tested this patch set and the associated perf patches on the HiKey 960 - trace collection and decode appear to work OK. However, in order to get the timestamps in the trace stream, I needed to enable the CoreSight Timestamp generator before starting trace. Without this, all the timestamp packets had a value of 0. This will likely affect other platforms. For testing purposes, I enabled it by poking the control register directly via /dev/mem, but for full support you will need a driver for this component (it's fairly simple - just a single register to write to enable / disable) and entries in the device tree / ACPI tables - it's similar to the helper devices like CTI & CATU which aren't on the trace data path, but are associated with a device that is.
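The "single register to write" part can be sketched as a read-modify-write of one enable bit. The bit position below (bit 0 of the generator's control register) is an assumption based on Arm's CoreSight timestamp generator documentation and should be checked against the platform's TRM. The helper name `tsgen_set_enabled()` is hypothetical, and how the register gets mapped is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed: enable bit in the timestamp generator control register. */
#define TSGEN_CNTCR_EN (UINT32_C(1) << 0)

/*
 * Set or clear the enable bit while preserving the other bits.
 * 'cntcr' must point at the already-mapped control register.
 */
static void tsgen_set_enabled(volatile uint32_t *cntcr, bool on)
{
	uint32_t val = *cntcr;

	if (on)
		val |= TSGEN_CNTCR_EN;
	else
		val &= ~TSGEN_CNTCR_EN;

	*cntcr = val;
}
```

For a quick user-space experiment the pointer would come from an mmap() of /dev/mem at the generator's platform-specific base address. A proper driver would take the address from the device tree or ACPI tables and use the kernel's MMIO accessors instead of a raw pointer.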
Thank you for taking the time to test this. Can I add your "Tested-by:" to the set?
Yes, please do. I've also tested v3 of these patches.
Platforms where the timestamp generator needs to be explicitly enabled are slowly emerging - I heard of the issue on the CS mailing list about a month ago. Since I don't have HW to test the feature, it will not be part of this set.
Also, the use of a counter to generate the timestamps periodically conflicts with the ETM strobing patch we've been using for AutoFDO. This strobing requires 2 counters and as most ETM implementations only have 2 counters, there is only one available if one is used for timestamps. We'll have to do some investigation to work out a way around this.
I noticed that clocks were in short supply and as such added an explicit test to make sure there were enough of them before proceeding. Like topology issues, there isn't much we can currently do other than preventing a trace session from moving forward if there aren't enough counters.
My current thinking on this is that when using the strobing mode for AutoFDO, we only get short bursts of trace from each core and are only interested in following the program flow during that burst, so precise alignment between the instruction streams of each core is less important (and unlikely - we wouldn't expect the strobes of multiple cores to coincide). We get a timestamp as a result of the trace burst starting, which is sufficient for coarse alignment of the bursts. So I've reworked my strobing patch to use both counters for the strobing and not enable the timestamp counter when strobing is enabled.
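The counter budget discussed above can be written down as a small allocation check. This is only a sketch of the policy as described in this thread (two counters for strobing, one for periodic timestamps, strobing suppressing the timestamp counter). `struct counter_plan` and `plan_counters()` are hypothetical names, not code from either patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define STROBE_COUNTERS 2 /* from the thread: strobing needs two counters */
#define TS_COUNTERS     1 /* from the thread: periodic timestamps use one */

/* Hypothetical result of splitting the ETM counters between features. */
struct counter_plan {
	int strobe;    /* counters given to strobing */
	int timestamp; /* counters given to periodic timestamps */
};

/*
 * Decide how to use 'nr_counters' ETM counters. Returns 0 and fills
 * 'plan' on success, or -ENOSPC if the session cannot go ahead.
 */
static int plan_counters(int nr_counters, bool strobing, bool timestamps,
			 struct counter_plan *plan)
{
	plan->strobe = strobing ? STROBE_COUNTERS : 0;
	/* With strobing on, the timestamp counter is not enabled. */
	plan->timestamp = (timestamps && !strobing) ? TS_COUNTERS : 0;

	if (plan->strobe + plan->timestamp > nr_counters)
		return -ENOSPC;

	return 0;
}
```

On an ETM with two counters, requesting both strobing and timestamps then succeeds, with both counters going to strobing. A request that needs more counters than the ETM has is refused before the trace session starts.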
Regards
Rob