On Thu, 16 Aug 2018 at 02:10, Mike Bazov mike@perception-point.io wrote:
Depending on the trace scenario there are a couple more things to keep in mind:
- CPU-wide or per-thread mode: In CPU-wide mode sources can use the
same sink simultaneously. In per-thread mode only a single source can use the sink.
- The session: In CPU-wide mode we don't want sources from different trace sessions to use the same sink.
Are "CPU-wide" and "per-thread" just different names for the sysfs and perf modes, respectively?
They are not.
When tracing via sysFS with the default configuration, everything that happens on a processor is logged. That is called "CPU-wide". Any process that gets scheduled out of that processor won't be traced. With perf one can execute:
# perf record -e cs_etm/@20070000.etr/ --per-thread my_application (example 1)
# perf record -e cs_etm/@20070000.etr/ -C 2,3,4 my_application (example 2)
For example 1, perf will switch on the tracer associated with the CPU where my_application has been installed for execution. If the process gets scheduled on a different CPU, perf will do the right thing and follow it around. That is called "per-thread".
In example 2 everything that happens on CPUs 2, 3 and 4 will be traced for as long as my_application is executing, regardless of where my_application is executing. That is also a CPU-wide trace scenario.
When you say "we don't want sources from different trace sessions to use the same sink", do you mean we don't want to mix perf and sysfs modes simultaneously on the same sink?
First, we never want perf and sysFS to use the same sink.
Second, say we have the following two commands being executed one after the other:
# perf record -e cs_etm/@20070000.etr/ -C 0,1 my_application
# perf record -e cs_etm/@20070000.etr/ -C 2,3 another_application
We can't allow the second session to use the same sink as the first one.
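That exclusivity rule can be sketched as a small ownership check. To be clear, the struct and helper names below (sink_claim, sink_release, owner_session) are invented for illustration and are not the actual CoreSight code:

```c
/* Sketch only: per-session sink ownership with a user refcount.
 * Names and layout are illustrative, not the real CoreSight code. */
#include <assert.h>
#include <stddef.h>

struct sink {
    void *owner_session;  /* session currently holding the sink, or NULL */
    int refcount;         /* number of sources in that session using it  */
};

/* Returns 0 on success, -1 if another session already owns the sink. */
static int sink_claim(struct sink *s, void *session)
{
    if (s->owner_session && s->owner_session != session)
        return -1;               /* busy: owned by a different session */
    s->owner_session = session;
    s->refcount++;
    return 0;
}

static void sink_release(struct sink *s)
{
    if (--s->refcount == 0)
        s->owner_session = NULL; /* last source gone; sink is free again */
}
```

With something like this, the second perf session in the example above would fail its claim on the ETR and be rejected, instead of silently mixing its trace data with the first session's.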
Look at what we've done in coresight-etm-perf.c. There you'll find code that uses the internal CS implementation to serve the perf interface. I suggest you do the same thing but to serve internal kernel clients (which I suspect will ultimately be some sort of module). The hope here is to expose as little as possible of the internal CS core code.
I think I'll introduce a module called "coresight-api.c" that implements the API. I'll send the final proof-of-concept to the mailing list for review; I'm working on it.
As for the above proposition I can't comment much without having a better understanding of what you're trying to do. I suggest to put a proof of concept together and send it to the list for review - that will help us understand what you want to do.
Will do.
While doing so I suggest you keep an eye on the coresight next tree[1]. There are a lot of things currently happening in the subsystem and the goal posts are moving regularly.
Thanks.
Some more technical questions regarding the code (things I encountered during development); I'd appreciate some guidance:
- I noticed coresight_enable()/coresight_disable() are actually exported for other modules to use, and declared in "coresight.h". Is there any reason to export these functions if only sysfs uses them? Like I said in the first post, I used these APIs, but I had a really hard time finding the coresight device I need from the existing code without exporting the coresight device bus symbol, so they aren't really practical. Why not put them in "coresight-priv.h"?
That is a relic from the original implementation where individual drivers could be compiled as modules. That functionality was removed because it quickly became difficult to account for all the failure conditions when modules are removed arbitrarily. The symbols don't need to be exported and at first glance the functions can probably be moved to coresight-priv.h.
- I noticed the reference counting for enabling/disabling coresight sources is actually only respected in "coresight.c", and not in the perf implementation. When enabling a source in perf the reference count isn't increased (source_ops(source)->enable() is called directly). Any reason for doing so?
The first interface the CS framework worked with was sysFS. After that the perf API became available. Confronted with the vastness and complexity of setting up a complete end-to-end solution (from kernel driver to user space decoding) my goal quickly became to "just make it work". The logic was that once we have something out, other people (like you) would join in to help.
As such you will find places like that where things aren't exactly how they should be - heck, I find them in my original code all the time. You should test it but once again I think you are correct - coresight_enable_source() should be called from etm_event_start().
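The pattern being suggested here (route perf through the refcounted wrapper instead of calling source_ops(source)->enable() directly) looks roughly like this; the structs and names are simplified stand-ins for illustration, not the real driver code:

```c
/* Sketch only: hardware is enabled on the 0 -> 1 refcount edge and
 * disabled on the 1 -> 0 edge; intermediate users just bump the count. */
#include <assert.h>

struct source {
    int refcount;
    int hw_enabled;   /* stands in for etm4_enable_hw() having run */
};

static int source_enable_hw(struct source *src)
{
    src->hw_enabled = 1;  /* real code would program the ETM here */
    return 0;
}

/* What a refcounted coresight_enable_source() does conceptually. */
static int my_enable_source(struct source *src)
{
    if (src->refcount++ == 0)
        return source_enable_hw(src);
    return 0;             /* already running for another user */
}

static void my_disable_source(struct source *src)
{
    if (--src->refcount == 0)
        src->hw_enabled = 0;  /* last user gone; switch hardware off */
}
```

Calling the hardware enable directly, as the perf path currently does, skips the count and lets two users stomp on each other's enable/disable sequencing.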
An unrelated question: are there any restrictions on mail clients for this mailing list? I'm using gmail, and I'm not sure it's acceptable here (I know some mailing lists disallow it).
No restriction related to gmail - that's what Linaro is using. Using plain text mode is highly appreciated though.
Thanks! Mike.
On Wed, Aug 15, 2018 at 7:10 PM, Mathieu Poirier mathieu.poirier@linaro.org wrote:
On Tue, 14 Aug 2018 at 08:10, Mike Bazov mike@perception-point.io wrote:
Hello,
Interesting idea. Could you explain how you deal with the situation where
paths for different sources need different configuration of the downstream
CoreSight? The situation I’m thinking of is (ignoring funnels):
ETM#1
\
ETF -> ETR
/
ETM#2
A path with source ETM#1 to sink ETF can’t be active at the same time as a
path from source ETM#2 to sink ETR, because the former will need the ETF
to be in buffer mode while the latter will need the ETF to be in FIFO mode.
I’d expect you could build these two incompatible paths, but not
simultaneously enable them? So coresight_enable_path would check
that any other paths using the same ETF were using it in the same mode,
and if it was idle, it would switch it into the right mode.
I hadn't thought about that scenario, but you are right. The solution you've provided also seems great.
When enabling a path, you can only enable it if it isn't already enabled, or if it is enabled with the same configuration.
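A minimal sketch of that check for the shared ETF; the enum values and field names here are invented for illustration, not taken from the actual drivers:

```c
/* Sketch only: a shared link device (e.g. an ETF) can be enabled in
 * one mode at a time; a second path may join only in the same mode. */
#include <assert.h>

enum etf_mode { MODE_IDLE, MODE_FIFO, MODE_BUFFER };

struct etf {
    enum etf_mode mode;
    int users;
};

/* Called from path enable: fail if the ETF is busy in another mode. */
static int etf_enable(struct etf *e, enum etf_mode want)
{
    if (e->mode != MODE_IDLE && e->mode != want)
        return -1;     /* incompatible with an already-active path */
    e->mode = want;    /* idle: switch it into the requested mode  */
    e->users++;
    return 0;
}

static void etf_disable(struct etf *e)
{
    if (--e->users == 0)
        e->mode = MODE_IDLE;
}
```

In Al's example, a path using the ETF as a sink would request buffer mode while a path draining through it to the ETR would request FIFO mode, so the second enable fails cleanly.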
Depending on the trace scenario there are a couple more things to keep in mind:
- CPU-wide or per-thread mode: In CPU-wide mode sources can use the
same sink simultaneously. In per-thread mode only a single source can use the sink.
- The session: In CPU-wide mode we don't want sources from different
trace sessions to use the same sink.
Also how is the trace source id handled? As we have only about 120 possible
trace source ids and we have chips with 128 cores funnelling into one sink,
we can’t have a fixed allocation of trace sources to trace source ids (i.e. we
can’t fix it in the device tree or anything like that). So we need to be able to
dynamically allocate trace source ids. Could that be done in
coresight_enable_path? So all enabled paths would have distinct
trace source ids.
Seems like a great idea for the API.
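A sketch of what per-path allocation could look like. The exact reserved values below are an assumption of this sketch (trace source IDs are 7 bits with some values reserved, leaving the roughly 120 usable IDs Al mentions); the kernel could equally well use an IDA for this:

```c
/* Sketch only: hand out a free trace source ID when a path is
 * enabled, and return it to the pool when the path is disabled. */
#include <assert.h>

#define TRACE_ID_MIN 0x01  /* assumed: 0x00 is reserved       */
#define TRACE_ID_MAX 0x6f  /* assumed: 0x70-0x7f are reserved */

static unsigned char id_used[TRACE_ID_MAX + 1];

/* Returns a free trace ID, or -1 when every usable ID is taken. */
static int trace_id_get(void)
{
    for (int id = TRACE_ID_MIN; id <= TRACE_ID_MAX; id++) {
        if (!id_used[id]) {
            id_used[id] = 1;
            return id;
        }
    }
    return -1;
}

static void trace_id_put(int id)
{
    id_used[id] = 0;
}
```

Allocating in coresight_enable_path() and freeing in coresight_disable_path() guarantees all simultaneously enabled paths carry distinct IDs, without fixing anything in the device tree.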
One thing I can't really understand is why these problems haven't come up in sysfs/perf mode? They don't seem specific to the API I proposed.
Simply because CS is complex, quirky, in the process of maturing and the team has very limited resources. We fix things based on the use case we currently work on.
Thanks,
Mike.
On Tue, Aug 14, 2018 at 4:31 PM, Al Grant Al.Grant@arm.com wrote:
Hi Mike,
Interesting idea. Could you explain how you deal with the situation where
paths for different sources need different configuration of the downstream
CoreSight? The situation I’m thinking of is (ignoring funnels):
ETM#1
\
ETF -> ETR
/
ETM#2
A path with source ETM#1 to sink ETF can’t be active at the same time as a
path from source ETM#2 to sink ETR, because the former will need the ETF
to be in buffer mode while the latter will need the ETF to be in FIFO mode.
I’d expect you could build these two incompatible paths, but not
simultaneously enable them? So coresight_enable_path would check
that any other paths using the same ETF were using it in the same mode,
and if it was idle, it would switch it into the right mode.
Also how is the trace source id handled? As we have only about 120 possible
trace source ids and we have chips with 128 cores funnelling into one sink,
we can’t have a fixed allocation of trace sources to trace source ids (i.e. we
can’t fix it in the device tree or anything like that). So we need to be able to
dynamically allocate trace source ids. Could that be done in
coresight_enable_path? So all enabled paths would have distinct
trace source ids.
Al
From: CoreSight coresight-bounces@lists.linaro.org On Behalf Of Mike Bazov
Sent: 14 August 2018 14:01
To: Mathieu Poirier mathieu.poirier@linaro.org
Cc: coresight@lists.linaro.org
Subject: Re: Enabling Coresight in atomic context.
Hello,
Patches are always welcomed and I don't think there is an "easy" way
to get out of this one. What you want to do will probably end up
being fairly complex. I would start by closely understanding how
operation of the CS infrastructure is done from the perf interface;
you should be fine just sticking to the kernel part. There
reservation of a "path" and memory for the sink is done in preparatory
steps where it is permitted to sleep (non-atomic). After that
components can be enabled from an atomic context, i.e when the process
of interest is installed on a processor. Currently things are woven
with the perf_aux_output_[begin|end]() interface but that could easily
be decoupled.
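The split described here (sleepable preparation, atomic enable) can be sketched as follows. This is a userspace illustration with invented names; in the kernel, the prepare step would do the GFP_KERNEL allocations and path reservation, and the start step would only program registers:

```c
/* Sketch only: everything that can sleep happens in trace_prepare();
 * trace_start() touches no allocator and is safe in atomic context. */
#include <assert.h>
#include <stdlib.h>

struct trace_ctx {
    void *path;      /* reserved path, built while sleeping is allowed */
    void *sink_buf;  /* sink memory, also allocated up front           */
    int running;
};

/* May sleep: allocate everything the atomic step will need. */
static int trace_prepare(struct trace_ctx *ctx, size_t buf_sz)
{
    ctx->sink_buf = malloc(buf_sz);  /* kernel: kmalloc(GFP_KERNEL) */
    if (!ctx->sink_buf)
        return -1;
    ctx->path = ctx;                 /* placeholder for a built path */
    ctx->running = 0;
    return 0;
}

/* Must not sleep: only flips state that was set up in advance. */
static void trace_start(struct trace_ctx *ctx)
{
    ctx->running = 1;  /* e.g. program ETM registers; no allocation */
}
```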
On the aspect of trace collection, did you envision using the entries
in devFS? If that is the case a mechanism to correlate tracer
configuration and trace data will need to be developed, just like what
we did for perf.
Taking a step back, tracers can also be found on X86 and MIPS (if I'm
not mistaken) architectures. As such the new kernel API would have
to be usable by those as well, which complicates the task even
further.
So all that being said I think it is feasible, but be prepared to
invest a significant amount of time and effort.
The "generic" tracing kernel API is a different thing. In its Coresight implementation it will use the kernel API I need.
After taking a few days to understand how the infrastructure works, to make the API as flexible as it can be, I thought about this:
Just like there's a perf implementation and a sysfs implementation, an "api" implementation (coresight-api) will be introduced, which will also be a new mode (CS_MODE_API).
I propose these APIs (some of them exist, but need to be exported and changed a little):
coresight_build_path(struct coresight_device *source, struct coresight_device *sink):
Create a coresight path from the provided source and sink.
coresight_enable_path(struct coresight_path *path):
Enable a Coresight path, except for the source. This will also glue the source to this specific path. You cannot assign a different path to the source until the path is destroyed.
coresight_disable_path(struct coresight_path *path):
Disable the path to the sink, including the sink (if there is more than one path to the same sink, the sink is not disabled until a refcount reaches 0).
coresight_destroy_path(struct coresight_path *path):
Free the path and release the source from it. The source device can then be assigned to a different path.
coresight_enable_source(struct coresight_device *source):
Enables the source. This will actually make the source device feed trace data into the sink (i.e. etm4_enable_hw()), or increase a refcount if the source is already tracing. Uses the path assigned in coresight_enable_path().
coresight_disable_source(struct coresight_device *source):
Disables the source. This will stop the source from producing trace data (or, if the refcount is still > 0, just decrease the refcount). Uses the path assigned in coresight_enable_path().
coresight_read_sink(struct coresight_device *sink, void *buf, size_t size):
Read trace data from the sink (advancing the read pointer).
coresight_setup_sink_buffer(struct coresight_device *sink, void *pages, int nr_pages):
Allocate a sink buffer (similar to the perf functionality).
The sysfs and api modes will use different buffers to avoid collision.
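Put together, the intended call sequence might look like this. This is C-like pseudocode only: none of these functions exist yet, and pages, nr_pages, buf and size are placeholders:

```
/* Pseudocode: a possible end-to-end use of the proposed API. */
struct coresight_device *dev, *src = NULL, *snk = NULL;
struct coresight_path *path;

foreach_coresight_device(dev) {
        /* pick a suitable source and sink, e.g. by type */
}

path = coresight_build_path(src, snk);              /* may sleep */
coresight_setup_sink_buffer(snk, pages, nr_pages);  /* may sleep */
coresight_enable_path(path);    /* sink and links on; source glued */
coresight_enable_source(src);   /* tracing starts                  */

/* ... workload of interest runs ... */

coresight_disable_source(src);
coresight_read_sink(snk, buf, size);
coresight_disable_path(path);
coresight_destroy_path(path);
```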
I realize most of the API is actually making the internal coresight implementation "public", but I really think this is necessary. Building a path to a specific sink is something a user would want to do, as well as disabling and enabling the path whenever they wish (this is something I actually need).
In order to use this API, the user needs a method of getting the actual (struct coresight_device *). There will be a list of coresight devices
exported in the "coresight.h" header, which can be iterated using a macro "foreach_coresight_device()". The user will be able to extract a specific
sink and source for his needs.
I think this API is powerful, and will give the user full Coresight functionality. From diving into the code, this seems very possible,
and will not require major infrastructure changes.
I would appreciate your thoughts, tips, and hints.
Thanks, Mike.