Good day to all,
A patch sent by Suzuki a few weeks ago [1] unearthed a problem with how we deal with the "enable_sink" flag in the CS core. So far we have been concentrating on system-wide trace scenarios [2] but per-CPU [3] scenarios are also valid. In system-wide mode a single event is generated by the perf user space and communicated to the kernel. In per-CPU mode an event is generated for each CPU present in the system or specified on the command line, and that is where our handling of the "enable_sink" flag fails (get back to me if you want more details on that).
My solution is to add the sink definition to the perf_event_attr structure [4] that gets sent down to the kernel from user space. That way there is no confusion about which sink belongs to which event. To do that I will need to have a chat with the guys in the #perf IRC channel, something I expect to be fairly tedious.
But before moving ahead we need to agree on the syntax we want to have in the future. That way what I do now with the perf folks doesn't have to be undone in a few months.
For the following I will be using figure 2-9 on page 2-33 in this document [5].
So far we have been using this syntax:
# perf record -e cs_etm/@20070000.etr/ --per-thread $COMMAND
This instructs perf to select the ETR as a sink. Until now, not specifying a sink has been treated as an error, since perf doesn't know which sink to select.
The main point of this email is to suggest that we revisit that.
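To make the syntax above concrete, here is a minimal Python sketch that splits the event strings used in this email into PMU name, sink and modifiers. It is an illustration only and not the parser used by the perf tool; the function name parse_event and the regular expression are my own assumptions.

```python
import re

# Matches the event strings shown in this email, e.g. "cs_etm/@20070000.etr/u".
# Illustration only: this is NOT how the perf tool parses events.
EVENT_RE = re.compile(r"^(?P<pmu>[^/]+)/(?P<terms>[^/]*)/(?P<modifiers>[A-Za-z]*)$")


def parse_event(spec):
    """Split an event string into (pmu, sink, modifiers).

    The sink is the term prefixed with '@'; it is None when omitted.
    """
    match = EVENT_RE.match(spec)
    if match is None:
        raise ValueError(f"unrecognised event string: {spec!r}")
    sink = None
    # Terms are comma separated; skip the empty string produced by "//".
    for term in filter(None, match.group("terms").split(",")):
        if term.startswith("@"):
            sink = term[1:]
    return match.group("pmu"), sink, match.group("modifiers")
```

With this sketch, "cs_etm/@20070000.etr/" yields the sink "20070000.etr", while "cs_etm//" yields no sink at all, which is the case currently rejected as an error.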
What I am proposing is that _if_ a sink is omitted on the perf command line, the perf infrastructure will pick the _first_ sink it finds when doing a walk through of the CS topology. This is very advantageous when thinking about the syntax required to support upcoming systems where we have a one-to-one mapping between source and sink.
In such a system specifying sinks for each CPU on the perf command line simply doesn't scale. Even on a small system I don't see users specifying a sink for each CPU. Since the sink for each CPU will be the first one found during the walk through, it is implicit that this sink should be used and doesn't need to be specified explicitly.
It would also allow us to support topologies like Juno-R1 [5] where we have a couple of ETFs in the middle. Those are perfectly valid sinks but the current scheme doesn't allow us to use them. If we pick the first sink we find along the way we can automatically support something like this.
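The proposal can be sketched in a few lines of Python. This is a user-space model under stated assumptions, not kernel code: the topologies and device names below are made up for illustration (they are not copied from the Juno documentation), and the breadth-first order is my assumption, since the order of the walk is not pinned down above.

```python
from collections import deque

# Hypothetical topology model: device -> (can_act_as_sink, downstream devices).
# Device names are illustrative and do not come from a real device tree.
ETF_IN_THE_MIDDLE = {
    "etm0": (False, ["funnel0"]),
    "etm1": (False, ["funnel0"]),
    "funnel0": (False, ["etf0"]),
    "etf0": (True, ["replicator0"]),
    "replicator0": (False, ["tpiu0", "etr0"]),
    "tpiu0": (True, []),
    "etr0": (True, []),
}

# One sink per source, as on the upcoming systems mentioned above.
ONE_TO_ONE = {
    "etm0": (False, ["etr0"]),
    "etm1": (False, ["etr1"]),
    "etr0": (True, []),
    "etr1": (True, []),
}


def first_sink(topology, source):
    """Return the first sink reached from `source`, or None if there is none.

    Breadth-first order is an assumption of this sketch.
    """
    seen = {source}
    queue = deque([source])
    while queue:
        name = queue.popleft()
        is_sink, downstream = topology[name]
        if is_sink:
            return name
        for nxt in downstream:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None


def select_sink(topology, source, requested=None):
    """Honour a sink given on the command line, else use the first one found."""
    return requested if requested is not None else first_sink(topology, source)
```

In the first topology every tracer resolves to "etf0" when no sink is given, and an explicit request for "etr0" is still honoured. In the one-to-one topology each tracer resolves to its own ETR without anything being specified on the command line.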
I have reflected quite extensively on this and I think it can work. The only time it can fail is if at some point we get more than one sink associated with each tracer. But how likely is this?
What we decide now will not be undone easily, if at all. Please read my email a couple of times and give it some consideration. Comments and ideas are welcome.
Best regards,
Mathieu
[1]. https://patchwork.kernel.org/patch/9657141/
[2]. perf record -e cs_etm/@20070000.etr/u --per-thread $COMMAND
[3]. perf record -e cs_etm/@20070000.etr/u -C 0,2-3 $COMMAND
[4]. http://lxr.free-electrons.com/source/include/uapi/linux/perf_event.h#L283
[5]. http://infocenter.arm.com/help/topic/com.arm.doc.ddi0515d.b/DDI0515D_b_juno_...