On Thu, 16 Aug 2018 at 02:10, Mike Bazov mike@perception-point.io wrote:
Depending on the trace scenario there are a couple more things to keep in mind:
- CPU-wide or per-thread mode: In CPU-wide mode sources can use the
same sink simultaneously. In per-thread mode only a single source can use the sink.
- The session: In CPU-wide mode we don't want sources from different trace sessions to use the same sink.
Are "CPU-wide" and "per-thread" just different names for the sysfs and perf modes, respectively?
They are not.
When tracing via sysFS with the default configuration, everything that happens on a processor is logged. That is called "CPU-wide". Any process that gets scheduled out of that processor won't be traced. With perf one can execute:
# perf record -e cs_etm/@20070000.etr/ --per-thread my_application (example 1)
# perf record -e cs_etm/@20070000.etr/ -C 2,3,4 my_application (example 2)
For example 1, perf will switch on the tracer associated with the CPU where my_application has been installed for execution. If the process gets scheduled on a different CPU, perf will do the right thing and follow it around. That is called "per-thread".
In example 2 everything that happens on CPUs 2, 3 and 4 will be traced for as long as my_application is executing, regardless of where my_application is executing. That is also a CPU-wide trace scenario.
When you say "we don't want sources from different trace sessions to use the same sink", do you mean we don't want to mix perf and sysfs modes simultaneously on the same sink?
First, we never want perf and sysFS to use the same sink.
Second, say we have the following two commands being executed one after the other:
# perf record -e cs_etm/@20070000.etr/ -C 0,1 my_application
# perf record -e cs_etm/@20070000.etr/ -C 2,3 another_application
We can't allow the second session to use the same sink as the first one.
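That exclusivity rule can be sketched as a small ownership check. To be clear, the struct and helper names below (sink_claim, sink_release, owner_session) are invented for illustration and are not the actual CoreSight code:

```c
/* Sketch only: per-session sink ownership with a user refcount.
 * Names and layout are illustrative, not the real CoreSight code. */
#include <assert.h>
#include <stddef.h>

struct sink {
    void *owner_session;  /* session currently holding the sink, or NULL */
    int refcount;         /* number of sources in that session using it  */
};

/* Returns 0 on success, -1 if another session already owns the sink. */
static int sink_claim(struct sink *s, void *session)
{
    if (s->owner_session && s->owner_session != session)
        return -1;               /* busy: owned by a different session */
    s->owner_session = session;
    s->refcount++;
    return 0;
}

static void sink_release(struct sink *s)
{
    if (--s->refcount == 0)
        s->owner_session = NULL; /* last source gone; sink is free again */
}
```

With something like this, the second perf session in the example above would fail its claim on the ETR and be rejected, instead of silently mixing its trace data with the first session's.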
Look at what we've done in coresight-etm-perf.c. There you'll find code that uses the internal CS implementation to serve the perf interface. I suggest you do the same thing but to serve internal kernel clients (which I suspect will ultimately be some sort of module). The hope here is to expose as little as possible of the internal CS core code.
I think I'll introduce a module called "coresight-api.c" that implements the API. I'll send the final proof-of-concept to the mailing list for review; I'm working on it.
As for the above proposition I can't comment much without having a better understanding of what you're trying to do. I suggest to put a proof of concept together and send it to the list for review - that will help us understand what you want to do.
Will do.
While doing so I suggest you keep an eye on the coresight next tree[1]. There are a lot of things currently happening in the subsystem and the goal posts are moving regularly.
Thanks.
Some more technical questions regarding the code (things I encountered during development); I'd appreciate some guidance:
- I noticed coresight_enable()/coresight_disable() are actually exported for other modules to use, and declared in "coresight.h". Is there any reason to export these functions if only sysfs uses them? Like I said in the first post, I used these APIs, but I had a really hard time finding the coresight device I need from the existing code without exporting the coresight device bus symbol, so they aren't really practical. Why not put them in "coresight-priv.h"?
That is a relic from the original implementation where individual drivers could be compiled as modules. That functionality was removed because it quickly became difficult to account for all the failure conditions when modules are removed arbitrarily. The symbols don't need to be exported and at first glance the functions can probably be moved to coresight-priv.h.
- I noticed the reference counting for enabling/disabling coresight sources is actually only respected in "coresight.c", and not in the perf implementation. When enabling a source in perf the reference count isn't increased (source_ops(source)->enable() is called directly). Any reason for doing so?
The first interface the CS framework worked with was sysFS. After that the perf API became available. Confronted with the vastness and complexity of setting up a complete end-to-end solution (from kernel driver to user space decoding) my goal quickly became to "just make it work". The logic was that once we have something out, other people (like you) would join in to help.
As such you will find places like that where things aren't exactly how they should be - heck, I find them in my original code all the time. You should test it but once again I think you are correct - coresight_enable_source() should be called from etm_event_start().
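The pattern being suggested here (route perf through the refcounted wrapper instead of calling source_ops(source)->enable() directly) looks roughly like this; the structs and names are simplified stand-ins for illustration, not the real driver code:

```c
/* Sketch only: hardware is enabled on the 0 -> 1 refcount edge and
 * disabled on the 1 -> 0 edge; intermediate users just bump the count. */
#include <assert.h>

struct source {
    int refcount;
    int hw_enabled;   /* stands in for etm4_enable_hw() having run */
};

static int source_enable_hw(struct source *src)
{
    src->hw_enabled = 1;  /* real code would program the ETM here */
    return 0;
}

/* What a refcounted coresight_enable_source() does conceptually. */
static int my_enable_source(struct source *src)
{
    if (src->refcount++ == 0)
        return source_enable_hw(src);
    return 0;             /* already running for another user */
}

static void my_disable_source(struct source *src)
{
    if (--src->refcount == 0)
        src->hw_enabled = 0;  /* last user gone; switch hardware off */
}
```

Calling the hardware enable directly, as the perf path currently does, skips the count and lets two users stomp on each other's enable/disable sequencing.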
An unrelated question: are there any restrictions on mail clients for this mailing list? I'm using gmail, and I'm not sure it's acceptable here (I know some mailing lists disallow it).
No restriction related to gmail - that's what Linaro is using. Using plain text mode is highly appreciated though.
Thanks! Mike.
On Wed, Aug 15, 2018 at 7:10 PM, Mathieu Poirier mathieu.poirier@linaro.org wrote:
On Tue, 14 Aug 2018 at 08:10, Mike Bazov mike@perception-point.io wrote:
Hello,
Interesting idea. Could you explain how you deal with the situation where
paths for different sources need different configuration of the downstream
CoreSight? The situation I’m thinking of is (ignoring funnels):
ETM#1
\
ETF -> ETR
/
ETM#2
A path with source ETM#1 to sink ETF can’t be active at the same time as a
path from source ETM#2 to sink ETR, because the former will need the ETF
to be in buffer mode while the latter will need the ETF to be in FIFO mode.
I’d expect you could build these two incompatible paths, but not
simultaneously enable them? So coresight_enable_path would check
that any other paths using the same ETF were using it in the same mode,
and if it was idle, it would switch it into the right mode.
I hadn't thought about that scenario, but you are right. The solution you've provided also seems great.
When enabling a path, you can only enable it if it isn't already enabled, or if it is enabled with the same configuration.
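A minimal sketch of that check for the shared ETF; the enum values and field names here are invented for illustration, not taken from the actual drivers:

```c
/* Sketch only: a shared link device (e.g. an ETF) can be enabled in
 * one mode at a time; a second path may join only in the same mode. */
#include <assert.h>

enum etf_mode { MODE_IDLE, MODE_FIFO, MODE_BUFFER };

struct etf {
    enum etf_mode mode;
    int users;
};

/* Called from path enable: fail if the ETF is busy in another mode. */
static int etf_enable(struct etf *e, enum etf_mode want)
{
    if (e->mode != MODE_IDLE && e->mode != want)
        return -1;     /* incompatible with an already-active path */
    e->mode = want;    /* idle: switch it into the requested mode  */
    e->users++;
    return 0;
}

static void etf_disable(struct etf *e)
{
    if (--e->users == 0)
        e->mode = MODE_IDLE;
}
```

In Al's example, a path using the ETF as a sink would request buffer mode while a path draining through it to the ETR would request FIFO mode, so the second enable fails cleanly.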
Depending on the trace scenario there are a couple more things to keep in mind:
- CPU-wide or per-thread mode: In CPU-wide mode sources can use the
same sink simultaneously. In per-thread mode only a single source can use the sink.
- The session: In CPU-wide mode we don't want sources from different
trace sessions to use the same sink.
Also how is the trace source id handled? As we have only about 120 possible
trace source ids and we have chips with 128 cores funnelling into one sink,
we can’t have a fixed allocation of trace sources to trace source ids (i.e. we
can’t fix it in the device tree or anything like that). So we need to be able to
dynamically allocate trace source ids. Could that be done in
coresight_enable_path? So all enabled paths would have distinct
trace source ids.
Seems like a great idea for the API.
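A sketch of what per-path allocation could look like. The exact reserved values below are an assumption of this sketch (trace source IDs are 7 bits with some values reserved, leaving the roughly 120 usable IDs Al mentions); the kernel could equally well use an IDA for this:

```c
/* Sketch only: hand out a free trace source ID when a path is
 * enabled, and return it to the pool when the path is disabled. */
#include <assert.h>

#define TRACE_ID_MIN 0x01  /* assumed: 0x00 is reserved       */
#define TRACE_ID_MAX 0x6f  /* assumed: 0x70-0x7f are reserved */

static unsigned char id_used[TRACE_ID_MAX + 1];

/* Returns a free trace ID, or -1 when every usable ID is taken. */
static int trace_id_get(void)
{
    for (int id = TRACE_ID_MIN; id <= TRACE_ID_MAX; id++) {
        if (!id_used[id]) {
            id_used[id] = 1;
            return id;
        }
    }
    return -1;
}

static void trace_id_put(int id)
{
    id_used[id] = 0;
}
```

Allocating in coresight_enable_path() and freeing in coresight_disable_path() guarantees all simultaneously enabled paths carry distinct IDs, without fixing anything in the device tree.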
One thing I can't really understand is why these problems haven't come up in sysfs/perf mode? They don't seem specific to the API I proposed.
Simply because CS is complex, quirky, in the process of maturing and the team has very limited resources. We fix things based on the use case we currently work on.
Thanks,
Mike.
On Tue, Aug 14, 2018 at 4:31 PM, Al Grant Al.Grant@arm.com wrote:
Hi Mike,
Interesting idea. Could you explain how you deal with the situation where
paths for different sources need different configuration of the downstream
CoreSight? The situation I’m thinking of is (ignoring funnels):
ETM#1
\
ETF -> ETR
/
ETM#2
A path with source ETM#1 to sink ETF can’t be active at the same time as a
path from source ETM#2 to sink ETR, because the former will need the ETF
to be in buffer mode while the latter will need the ETF to be in FIFO mode.
I’d expect you could build these two incompatible paths, but not
simultaneously enable them? So coresight_enable_path would check
that any other paths using the same ETF were using it in the same mode,
and if it was idle, it would switch it into the right mode.
Also how is the trace source id handled? As we have only about 120 possible
trace source ids and we have chips with 128 cores funnelling into one sink,
we can’t have a fixed allocation of trace sources to trace source ids (i.e. we
can’t fix it in the device tree or anything like that). So we need to be able to
dynamically allocate trace source ids. Could that be done in
coresight_enable_path? So all enabled paths would have distinct
trace source ids.
Al
From: CoreSight coresight-bounces@lists.linaro.org On Behalf Of Mike Bazov
Sent: 14 August 2018 14:01
To: Mathieu Poirier mathieu.poirier@linaro.org
Cc: coresight@lists.linaro.org
Subject: Re: Enabling Coresight in atomic context.
Hello,
Patches are always welcomed and I don't think there is an "easy" way
to get out of this one. What you want to do will probably end up
being fairly complex. I would start by closely understanding how
operation of the CS infrastructure is done from the perf interface;
you should be fine just sticking to the kernel part. There
reservation of a "path" and memory for the sink is done in preparatory
steps where it is permitted to sleep (non-atomic). After that
components can be enabled from an atomic context, i.e when the process
of interest is installed on a processor. Currently things are woven
with the perf_aux_output_[begin|end]() interface but that could easily
be decoupled.
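The split described here (sleepable preparation, atomic enable) can be sketched as follows. This is a userspace illustration with invented names; in the kernel, the prepare step would do the GFP_KERNEL allocations and path reservation, and the start step would only program registers:

```c
/* Sketch only: everything that can sleep happens in trace_prepare();
 * trace_start() touches no allocator and is safe in atomic context. */
#include <assert.h>
#include <stdlib.h>

struct trace_ctx {
    void *path;      /* reserved path, built while sleeping is allowed */
    void *sink_buf;  /* sink memory, also allocated up front           */
    int running;
};

/* May sleep: allocate everything the atomic step will need. */
static int trace_prepare(struct trace_ctx *ctx, size_t buf_sz)
{
    ctx->sink_buf = malloc(buf_sz);  /* kernel: kmalloc(GFP_KERNEL) */
    if (!ctx->sink_buf)
        return -1;
    ctx->path = ctx;                 /* placeholder for a built path */
    ctx->running = 0;
    return 0;
}

/* Must not sleep: only flips state that was set up in advance. */
static void trace_start(struct trace_ctx *ctx)
{
    ctx->running = 1;  /* e.g. program ETM registers; no allocation */
}
```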
On the aspect of trace collection, did you envision using the entries
in devFS? If that is the case a mechanism to correlate tracer
configuration and trace data will need to be developed, just like what
we did for perf.
Taking a step back, tracers can also be found on X86 and MIPS (if I'm
not mistaken) architectures. As such the new kernel API would have
to be usable by those as well, which complicates the task even
further.
So all that being said I think it is feasible, but be prepared to
invest a significant amount of time and effort.
The "generic" tracing kernel API is a different thing. In its Coresight implementation it will use the kernel API I need.
After taking a few days to understand how the infrastructure works, to make the API as flexible as it can be, I thought about this:
Just like there's a perf implementation and a sysfs implementation, an "api" implementation (coresight-api) will be introduced, which will also be a new mode (CS_MODE_API).
I propose these APIs (some of them exist, but need to be exported and changed a little):
coresight_build_path(struct coresight_device *source, struct coresight_device *sink):
Create a coresight path from the provided source and sink.
coresight_enable_path(struct coresight_path *path):
Enable a Coresight path, except for the source. This will also glue the source to this specific path. You cannot assign a different path to the source until the path is destroyed.
coresight_disable_path(struct coresight_path *path):
Disable the path to the sink, including the sink (if there is more than one path to the same sink, the sink is not disabled until a refcount reaches 0).
coresight_destroy_path(struct coresight_path *path):
Free the path and release the source from it. The source device can then be assigned to a different path.
coresight_enable_source(struct coresight_device *source):
Enables the source. This will actually make the source device feed trace data into the sink (i.e. etm4_enable_hw()), or increase a refcount if the source is already tracing. Uses the path assigned in coresight_enable_path().
coresight_disable_source(struct coresight_device *source):
Disables the source. This will stop the source from producing trace data (or, if the refcount is still > 0, just decrease the refcount). Uses the path assigned in coresight_enable_path().
coresight_read_sink(struct coresight_device *sink, void *buf, size_t size):
Read trace data from the sink (advancing the read pointer).
coresight_setup_sink_buffer(struct coresight_device *sink, void *pages, int nr_pages):
Allocate a sink buffer (similar to the perf functionality).
The sysfs and api modes will use different buffers to avoid collision.
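Put together, the intended call sequence might look like this. This is C-like pseudocode only: none of these functions exist yet, and pages, nr_pages, buf and size are placeholders:

```
/* Pseudocode: a possible end-to-end use of the proposed API. */
struct coresight_device *dev, *src = NULL, *snk = NULL;
struct coresight_path *path;

foreach_coresight_device(dev) {
        /* pick a suitable source and sink, e.g. by type */
}

path = coresight_build_path(src, snk);              /* may sleep */
coresight_setup_sink_buffer(snk, pages, nr_pages);  /* may sleep */
coresight_enable_path(path);    /* sink and links on; source glued */
coresight_enable_source(src);   /* tracing starts                  */

/* ... workload of interest runs ... */

coresight_disable_source(src);
coresight_read_sink(snk, buf, size);
coresight_disable_path(path);
coresight_destroy_path(path);
```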
I realize most of the API is actually making the internal coresight implementation "public", but I really think this is necessary. Building a path to a specific sink is something a user would want to do, as well as disabling and enabling the path whenever they wish (this is something I actually need).
In order to use this API, the user needs a method of getting the actual (struct coresight_device *). There will be a list of coresight devices
exported in the "coresight.h" header, which can be iterated using a macro "foreach_coresight_device()". The user will be able to extract a specific
sink and source for his needs.
I think this API is powerful, and will give the user full Coresight functionality. From diving into the code, this seems very possible,
and will not require major infrastructure changes.
I would appreciate your thoughts, tips, and hints.
Thanks, Mike.