Hi Mike,
On 16/08/18 22:15, Mike Bazov wrote:
Greetings,
When tracing via sysfs and keeping the default configuration,
everything that is happening on a processor is logged. That is called
"CPU-wide". Any process that gets scheduled out of the processor won't
be traced.

On perf one can execute:

    # perf record -e cs_etm/@20070000.etr/ --per-thread my_application   (example 1)
    # perf record -e cs_etm/@20070000.etr/ -C 2,3,4 my_application       (example 2)

For example 1, perf will switch on the tracer associated with the CPU
where my_application has been installed for execution. If the process
gets scheduled on a different CPU, perf will do the right thing and
follow it around. That is called "per-thread".

In example 2, everything that is happening on CPUs 2, 3 and 4 will be
traced for as long as my_application is executing, regardless of where
my_application is executing. That is also a CPU-wide trace scenario.
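
One way to picture the difference between the two perf invocations above
is through the (pid, cpu) pair that perf_event_open(2) takes: pid >= 0
with cpu == -1 follows one thread on any CPU, while pid == -1 with
cpu >= 0 observes whatever runs on that CPU. The sketch below is only a
user-space toy model under that assumption. The toy_* names are
hypothetical and are not part of perf or the coresight drivers, and the
mapping of --per-thread and -C onto these pairs is my reading of perf's
behaviour rather than something stated in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model type: one (pid, cpu) pair as given to perf_event_open(2). */
struct toy_target {
	int pid;	/* thread to follow, or -1 for "any task" */
	int cpu;	/* CPU to observe, or -1 for "any CPU" */
};

/* Per-thread (assumed model of example 1): one event that follows the thread. */
static size_t toy_per_thread_targets(int tid, struct toy_target *out, size_t cap)
{
	assert(out != NULL);
	if (cap < 1)
		return 0;
	out[0].pid = tid;
	out[0].cpu = -1;
	return 1;
}

/* CPU-wide (assumed model of example 2): one event per listed CPU, no task filter. */
static size_t toy_cpu_wide_targets(const int *cpus, size_t ncpus,
				   struct toy_target *out, size_t cap)
{
	size_t i;

	assert(cpus != NULL && out != NULL);
	for (i = 0; i < ncpus && i < cap; i++) {
		out[i].pid = -1;
		out[i].cpu = cpus[i];
	}
	return i;
}

/* Would an event with target t observe task tid while it runs on cpu? */
static int toy_target_sees(const struct toy_target *t, int tid, int cpu)
{
	int pid_ok = (t->pid == -1) || (t->pid == tid);
	int cpu_ok = (t->cpu == -1) || (t->cpu == cpu);

	return pid_ok && cpu_ok;
}
```

In this model a per-thread target sees its thread on every CPU and
nothing else, whereas a CPU-wide target for CPU 2 sees any task on CPU 2
and loses my_application as soon as it migrates to a CPU outside the
list.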
I was more under the impression that CPU-wide records everything that
the CPU executes (this is achieved by using sysfs like you described),
regardless of the term "thread". I don't really understand why CPU-wide
is the right term for example 2. Both of the examples record
"per-thread", except for the CPU mask. Example 1 doesn't mask any CPUs,
whereas example 2 masks all CPUs except 2, 3 and 4. It still doesn't
record "CPU-wide", only "per-thread", but on the non-masked CPUs (if it
weren't "per-thread", it wouldn't care about a thread being scheduled
and disable/enable accordingly).

I'm a little confused. It really seems like sysfs == CPU-wide and
perf == per-thread. Perhaps changing the modes to "CS_MODE_PER_THREAD"
and "CS_MODE_CPU_WIDE", and making the sysfs and perf implementations
use these modes, is something that would solve the puzzle for me.
As such you will find places like that where things aren't exactly how
they should be - heck, I find them in my original code all the time.
You should test it but once again I think you are correct - coresight_enable_source() should be called from etm_event_start().
After a second look, I actually think calling coresight_enable_source()
will be problematic. Using the sysfs implementation (coresight.c) from
perf is problematic, since it maintains a reference count per device.
If there's a sysfs session running on a tracer, and perf uses the
__same__ path to the sink and the same source, using
coresight_enable_path() will result in simply increasing the reference
count without returning any errors. So calling
source_ops(source)->enable() directly will actually result in an error,
which is the expected behavior(?)
I had a patch to switch the perf API to use coresight_enable_source(), but dropped it to keep the perf changes smaller. I can resurrect it and post it.
Btw, have you explored using the "perf_event" API from the kernel?
E.g., you could create an event using perf_event_create_kernel_counter(). Maybe it is easier to switch to the perf event API (and enhance it there) rather than adding another mode to the mix. I will do some digging.
Kind regards,
Suzuki