Hi Mathieu,
On Tue, 2017-04-25 at 12:03 -0600, Mathieu Poirier wrote:
Good day to all,
A patch sent by Suzuki a few weeks ago [1] unearthed a problem with how we deal with the "enable_sink" flag in the CS core. So far we have been concentrating on system-wide trace scenarios [2] but per-CPU [3] scenarios are also valid. In system-wide mode a single event is generated by the perf user space and communicated to the kernel. In per-CPU mode an event is generated for each CPU present in the system or specified on the command line, and that is where our handling of the "enable_sink" flag fails (get back to me if you want more details on that).
My solution is to add the sink definition to the perf_event_attr structure [4] that gets sent down to the kernel from user space. That way there is no confusion about what sink belongs to what event. To do that I will need to have a chat with the guys in the #perf IRC channel, something I expect to be fairly tedious.
But before moving ahead we need to agree on the syntax we want to have in the future. That way what I do now with the perf folks doesn't have to be undone in a few months.
For the following I will be using figure 2-9 on page 2-33 in this document [5].
So far we have been using this syntax:
# perf record -e cs_etm/@20070000.etr/ --per-thread $COMMAND
This will instruct perf to select the ETR as a sink. Up to now not specifying a sink is treated as an error condition since perf doesn't know what sink to select.
The main point of this email is that I am suggesting we revisit that.
What I am proposing is that _if_ a sink is omitted on the perf command line, the perf infrastructure will pick the _first_ sink it finds when doing a walk through of the CS topology. This is very advantageous when thinking about the syntax required to support upcoming systems where we have a one-to-one mapping between source and sink.
This seems like a good solution to me.
In such a system specifying sinks for each CPU on the perf command line simply doesn't scale. Even on a small system I don't see users specifying a sink for each CPU. Since the sink for each CPU will be the first one found during the walk through, it is implicit that this sink should be used and doesn't need to be specified explicitly.
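As a rough illustration of the "first sink found during the walk" idea, here is a minimal sketch in Python. It is not the CoreSight driver code: the topology, the device names (etm0, funnel0, etf0, ...) and the function first_sink() are all hypothetical, and a depth-first walk is only one assumed reading of "walk through".

```python
# Hypothetical, simplified topology: node -> list of downstream nodes.
# These names are illustrative only, not taken from a real device tree.
TOPOLOGY = {
    "etm0": ["funnel0"],
    "funnel0": ["etf0"],
    "etf0": ["replicator0"],
    "replicator0": ["tpiu0", "etr0"],
    "tpiu0": [],
    "etr0": [],
}
SINKS = {"etf0", "tpiu0", "etr0"}


def first_sink(topology, sinks, source):
    """Depth-first walk from source; return the first sink reached, or None."""
    stack = [source]
    seen = set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in sinks:
            return node
        # Push in reverse so that outputs are explored in their listed order.
        stack.extend(reversed(topology.get(node, [])))
    return None


print(first_sink(TOPOLOGY, SINKS, "etm0"))  # etf0
```

With an ETF on the path, the ETF is picked without the user naming it. If the ETF were not treated as a sink, the result would depend on the order in which the replicator outputs are listed.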
It would also allow for the support of topologies like Juno-R1 [5] where we have a couple of ETFs in the middle. Those are perfectly valid sinks but the current scheme doesn't allow us to use them. If we pick the first sink we find along the way we can automatically support something like this.
Not sure I understand what you mean here - we can use the ETF on Juno by specifying it on the command line - I was doing the same yesterday. Have I missed the point here?
I have reflected quite extensively on this and I think it can work. The only time it can fail is if at some point we get more than one sink associated with each tracer. But how likely is this?
Take care with replicators - in the Juno topology the TPIU and ETR are effectively the same number of nodes away from the ETM. Without an intervening ETF, which sink is "first" depends on the order in which the branches of the replicator are explored. This could be handled as an error if this case is detected - easily solved by actually specifying the desired sink.
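To make the replicator case concrete, here is a minimal sketch in Python of a level-by-level walk that reports an error when two sinks sit at the same distance from the source. Everything in it is an assumption for illustration: the two topologies, the device names, nearest_sink() and AmbiguousSinkError are hypothetical and do not come from the CoreSight code.

```python
class AmbiguousSinkError(Exception):
    """Raised when several sinks are equally close to the source."""


# Hypothetical topologies: node -> list of downstream nodes.
NO_ETF = {
    "etm0": ["funnel0"],
    "funnel0": ["replicator0"],
    "replicator0": ["tpiu0", "etr0"],
}
WITH_ETF = {
    "etm0": ["funnel0"],
    "funnel0": ["etf0"],
    "etf0": ["replicator0"],
    "replicator0": ["tpiu0", "etr0"],
}
SINKS = {"etf0", "tpiu0", "etr0"}


def nearest_sink(topology, sinks, source):
    """Breadth-first walk, one level at a time; fail on equidistant sinks."""
    frontier = [source]
    seen = {source}
    while frontier:
        next_frontier = []
        for node in frontier:
            for out in topology.get(node, []):
                if out not in seen:
                    seen.add(out)
                    next_frontier.append(out)
        found = [n for n in next_frontier if n in sinks]
        if len(found) > 1:
            raise AmbiguousSinkError(
                "equidistant sinks %s, please specify one" % ", ".join(found)
            )
        if found:
            return found[0]
        frontier = next_frontier
    return None


print(nearest_sink(WITH_ETF, SINKS, "etm0"))  # etf0
```

Calling nearest_sink(NO_ETF, SINKS, "etm0") raises AmbiguousSinkError, since tpiu0 and etr0 are both one hop past the replicator.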
Further - programmable replicators can route different trace IDs to different sinks. This would be a specialised case - e.g. ETM trace to ETR and main memory, STM trace off chip to an external debugger. This probably doesn't affect the perf command line specification, but we may need to program the links with appropriate trace IDs in future.
A second implementation detail may be to handle the unreachable sink. If the user specifies a sink that is not in the path then a suitable error needs to be output. Not sure what happens now if the STM sink is specified for Juno r1/r2?
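The unreachable-sink check could be sketched along these lines in Python. The topology below is made up for illustration (it is not the Juno r1/r2 layout), and is_reachable() and check_sink() are hypothetical helpers, not existing perf or kernel functions.

```python
# Hypothetical topology with two disjoint trace paths:
# node -> list of downstream nodes.
TOPOLOGY = {
    "etm0": ["funnel0"],
    "funnel0": ["etr0"],
    "stm0": ["funnel1"],
    "funnel1": ["etf1"],
}


def is_reachable(topology, source, target):
    """Return True if target can be reached from source in the topology."""
    stack = [source]
    seen = set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(topology.get(node, []))
    return False


def check_sink(topology, source, sink):
    """Return sink if it is on a path from source, else raise ValueError."""
    if not is_reachable(topology, source, sink):
        raise ValueError("sink %s is not reachable from %s" % (sink, source))
    return sink


print(check_sink(TOPOLOGY, "etm0", "etr0"))  # etr0
```

Here check_sink(TOPOLOGY, "etm0", "etf1") raises ValueError, which is the point where a suitable error message would be reported to the user.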
Regards
Mike
What we decide now will not be undone easily, if at all. Please read my email a couple of times and give it some consideration. Comments and ideas are welcome.
Best regards, Mathieu
[1]. https://patchwork.kernel.org/patch/9657141/
[2]. perf record -e cs_etm/@20070000.etr/u --per-thread $COMMAND
[3]. perf record -e cs_etm/@20070000.etr/u -C 0,2-3 $COMMAND
[4]. http://lxr.free-electrons.com/source/include/uapi/linux/perf_event.h#L283
[5]. http://infocenter.arm.com/help/topic/com.arm.doc.ddi0515d.b/DDI0515D_b_juno_arm_development_platform_soc_trm.pdf