Hi,
While I'm meddling with the makefiles to build .so libs, it occurs to me that this might be a good time to reconsider the library names.
Rctdl_c_api and ref_trace_decoder reflect the origins of the code as an ARM project to produce an open source reference trace decode library, before that effort was contributed and folded into OpenCSD.
My thoughts were either
libarm_opencsd.so / libarm_opencsd_c_api.so
or
libarm_cstraced.so / libarm_cstraced_c_api.so
Both of which specify the architecture and function a little better than the old names.
Opinions?
Regards
Mike
(RCTDL appears across much of the source code too, but changing that is a massive job, so I'm not considering that at present).
----------------------------------------------------------------
Mike Leach, Principal Engineer
Arm Blackburn Design Centre, Belthorn House, Walker Rd, Guide, Blackburn, BB1 2QE
+44 (0)1254 893911 (Direct), +44 (0)1254 893900 (Main), +44 (0)1254 893901 (Fax)
mailto:mike.leach@arm.com
----------------------------------------------------------------
Hi Mike, Mathieu
I've been reading the STM driver and its device tree configurations along
with the specifications recently.
Yesterday, Mathieu asked me a question about the true definition of
STM masters. This morning, I read the code and the spec again. There are
still a few details I need to ask Mike about.
Mike, please correct me if anything I'm writing here is wrong.
From what I have understood, STM masters/channels are a
contiguous physical space in the STM, a bit like registers in this respect.
On my Spreadtrum board, for example, we configure a 0x180000-byte
space for stimulus ports (i.e. channels). The TRM documents that
CS-STM has 129 masters, 128 for software, each supporting 65536
channels. My questions are:
1) How much physical space should each channel take?
2) Do the channels dump the trace packets in real time?
3) Is it correct to set a total of 0x180000 bytes of space for 128 masters
and 128*65536 channels?
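
The arithmetic behind question 3 can be sketched as follows. This is only an illustration: the 256 bytes per channel used here is an assumption that needs to be checked against the TRM, and the helper names are made up for this example.

```python
# Rough sizing of the STM stimulus port area.
# ASSUMPTION: each channel (stimulus port) occupies 256 bytes of address
# space. This figure is not taken from the text above; verify it in the TRM.
BYTES_PER_CHANNEL = 256
CHANNELS_PER_MASTER = 65536   # from the TRM, as quoted above
SOFTWARE_MASTERS = 128        # from the TRM, as quoted above


def channels_in_region(region_bytes, bytes_per_channel=BYTES_PER_CHANNEL):
    """Number of whole channels that fit in a stimulus region."""
    return region_bytes // bytes_per_channel


def region_for(masters, channels_per_master,
               bytes_per_channel=BYTES_PER_CHANNEL):
    """Bytes needed to map every channel of every master."""
    return masters * channels_per_master * bytes_per_channel


configured = 0x180000
print("channels covered by 0x180000:", channels_in_region(configured))
print("bytes for one full master:", hex(region_for(1, CHANNELS_PER_MASTER)))
print("bytes for 128 full masters:",
      hex(region_for(SOFTWARE_MASTERS, CHANNELS_PER_MASTER)))
```

If the 256-byte assumption holds, 0x180000 bytes would cover only a subset of the channels of a single master, not 128*65536 channels.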
Thank you,
Chunyan
Friends,
I'm reading section 3.3.4 (RAM Read Pointer Register) of the
"CoreSight Trace Memory Controller Technical Reference Manual",
revision r0p1 and I'm puzzled.
The second paragraph of the "Purpose" section reads as follows:
"The value written to this register must be a byte-address aligned to
the width of the trace memory databus and to a frame boundary. For
example, for 64-bit wide trace memory and 128-bit wide trace memory,
the four LSBs must be 0s. For 256-bit wide trace memory, the five LSBs
must be 0s..."
So for 64 bit wide memory RRP can be set to values like 0, 8, 16,
24... for 128 bit 0, 16, 32, 48... and for 256 bit 0, 32, 64...
What is perplexing is the statement about the LSBs. Things work for
256 bit and 128 bit with 5 and 4 LSBs respectively but for 64 bit, it
should be 3 and not 4 as mentioned.
Am I missing something here? Can someone double check me?
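
One possible reading of the quoted paragraph is that the "frame boundary" requirement, and not the databus width, sets the minimum alignment for 64-bit memory. A minimal sketch of that reading, assuming a frame is 16 bytes (an assumption, not something stated in the quoted text); `rrp_zero_lsbs` is a hypothetical helper:

```python
# ASSUMPTION: one formatter frame is 16 bytes (128 bits).
FRAME_BYTES = 16


def rrp_zero_lsbs(mem_width_bits, frame_bytes=FRAME_BYTES):
    """Number of LSBs that must be zero when the address is aligned to
    both the memory width and a frame boundary (the larger one wins)."""
    width_bytes = mem_width_bits // 8
    alignment = max(width_bytes, frame_bytes)
    # Both values are powers of two, so bit_length() - 1 is log2.
    return alignment.bit_length() - 1


for width in (64, 128, 256):
    print(width, "bit memory:", rrp_zero_lsbs(width), "LSBs must be 0")
```

Under that assumption the 64-bit case gives 4 LSBs, matching the TRM text; with the width alone (`frame_bytes=1`) it gives 3.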
Thanks,
Mathieu
Good day all,
New bundles can be found here [1]. It is the same 'uname' command but
generated with the only two options currently supported: timestamp and
cycle accurate. We can add more at will but let's start with that.
On V3 the options are pushed to the hardware and as such the trace
format should change accordingly. On ETMv4 the perf framework does
convey the options but they are not configured in the tracers. Even
if the options are set the traces won't reflect it - this is obviously
only temporary and will be fixed after I release the 3rd version of
the patchset.
I have verified the validity of the headers (both for V3 and V4) along
with the information they convey. The layouts haven't changed from
what I previously had [2] - the only difference is that config
registers are artificially generated from options provided on the cmd
line.
On the topic of configuration registers, at this time TRACEIDR on V3
and TRCCONFIGR on V4 __have the same layout__, i.e. bit 12 for cycle
accurate and bit 28 for timestamp. I simply did not see the need to
select a different bit field layout for V4 since Tor's code will
ultimately parse that before feeding it to the decode library. A
different layout for V4 would mean more code for Tor - get back to me
if you think enacting the V4 TRCCONFIGR layout is important.
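
The shared layout described above can be sketched like this. Only the two bit positions come from this email; the constant and function names are hypothetical and are not taken from the perf or decoder code.

```python
# Bit positions as described above, shared by V3 and V4 for now.
OPT_CYCACC_BIT = 12   # cycle accurate
OPT_TS_BIT = 28       # timestamp


def build_config(cycacc=False, timestamp=False):
    """Build the artificial config register value from cmd line options."""
    config = 0
    if cycacc:
        config |= 1 << OPT_CYCACC_BIT
    if timestamp:
        config |= 1 << OPT_TS_BIT
    return config


def parse_config(config):
    """Recover the options from the config value before decoding."""
    return {
        "cycacc": bool(config & (1 << OPT_CYCACC_BIT)),
        "timestamp": bool(config & (1 << OPT_TS_BIT)),
    }


value = build_config(cycacc=True, timestamp=True)
print(hex(value), parse_config(value))
```

With one layout for both versions the parsing side needs a single code path; a V4-specific layout would need a second set of bit positions.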
Keep in mind that none of this is set in stone; if you think we can
do better please speak up. Mike and Al, this doesn't support ETMv4.1.
Let's discuss this further at the meeting on Wednesday. For now (and
for a little while) I need to concentrate on getting the 3rd iteration
of the patchset out.
Regards,
Mathieu
[1]. http://people.linaro.org/~mathieu.poirier/openCSD/
[2]. https://git.linaro.org/people/mathieu.poirier/coresight.git/blob/refs/heads…
---------- Forwarded message ----------
From: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Date: 3 November 2015 at 21:27
Subject: Collection of metadata in a multi-session environment
To: private-opencsd <private-opencsd(a)linaro.org>
As reported in my other email (Trace bundles for ETMv3/v4), moving to a
multi-session scheme is forcing me to rethink how metadata is
collected as one can no longer assume that what gets read from sysFS
actually pertains to the session being reported.
When looking at the required registers (once again, please refer to
Mike's document, section 4.3.1 to 4.3.3) a lot of them are RO:
ETMv3/PTM: ETMIDR
ETMv4: TRCIDR[0, ..., 13], TRCAUTHSTATUS
Some are RW but can easily be made RO when using the CS framework from Perf:
ETMv3/PTM: ETMTRACEID
ETMv4: TRCTRACEIDR
In fact I remember having this conversation with Mike in San Francisco
and nobody would cry if we were to let the framework decide the
traceIDs configured on the tracers.
Anything that is RO can still be retrieved using the current sysFS
driven method since their values don't change. That leaves us with
the other RW registers:
ETMv3/PTM: ETMCR, ETMCCR
ETMv4: TRCCONFIGR
The question is, what information is needed from these registers in
order to do trace decoding? Is there anything in there that is coming
from the HW that can't be deduced otherwise? From what I understand
we are talking about user configurable options that could be
communicated to the trace decoding library using other means than
conveying the whole registers. Using a bitmap is the first option
that comes to mind, i.e., if cycle accurate tracing is configured, bit
X is set.
With the above in mind I need you guys to help me identify the
mandatory information in those tracer registers that is needed for
trace decoding. Once again we can discuss this further in our meeting
tomorrow.
Thanks,
Mathieu
---------- Forwarded message ----------
From: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Date: 3 November 2015 at 11:48
Subject: Trace bundles for ETMv3/v4
To: private-opencsd <private-opencsd(a)linaro.org>
After a programming marathon I am pleased to bring you trace bundles
for both ETMv3[1] and v4[2]. The code is based on the latest
development branch[3], something that seemed like a good idea to avoid
maintaining two different trees. It certainly served the purpose but
other problems inherent to that approach have also shown up - more on
that later.
Since the ETMv4 has more metadata (as per Mike's enclosed document[4])
I had to modify the AUX_TRACE header format. The new format can be
found here [5], with the assignment of the slots described here [6].
The end result is that ETMv3 headers will have zero'ed out slots,
something I can't avoid since the auxiliary area foundation assumes a
fixed amount of data for all tracers. It's not my ideal solution but we
can always try to change the world when things are actually working.
The perf.data files found in both bundles have been generated using this command:
$ perf record -vvv -e cs_etm// --per-thread uname
As such the amount of trace data should be relatively small. In the
bundles you will also find the .debug directory, the perf.data file
and a medata.txt file. The latter is simply a snapshot of the
relevant registers as listed in Mike's document[4]. The snapshot was
taken *after* the trace run and is only there to *somewhat* make sense of
how things were configured, which brings us to the problems I
hinted at earlier.
Contrary to V1 and V2 the newest code base is configuring tracers as
the event is scheduled rather than at initialisation time, making the
metadata in the perf.data completely irrelevant. I will likely have a
solution for that so hang tight, it's coming.
Please have a look and we'll touch base in the meeting tomorrow.
Regards,
Mathieu
[1]. uname.etmv3.tgz
[2]. uname.etmv4.tgz
[3]. https://git.linaro.org/people/mathieu.poirier/coresight.git/shortlog/refs/h…
[4]. RefTrcDecode-API and Components_0v3.pdf
[5]. https://git.linaro.org/people/mathieu.poirier/coresight.git/blob/refs/heads…
[6]. https://git.linaro.org/people/mathieu.poirier/coresight.git/blob/refs/heads…
Tor, Mike and all,
Here is something I'd like your opinion on...
Before programming the ETMv3/PTM, ETMCR:10 needs to be set to one and
when enabling the tracer, the bit needs to be cleared. Each time the
status of ETMSR:1 needs to be probed before moving on, something that
is quite costly. Is there an official time limit for this operation
to be carried out?
The same question applies for ETB's FFCR:6 and FFSR:1.
At this time the driver waits for 100 usec before complaining. In
your experience, is this too short, or may it need more time?
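
The probe-with-timeout pattern in question can be sketched as below. This is a user-space illustration only, not the driver code: `wait_for_bit`, `read_reg` and `FakeStatus` are hypothetical names, and the register is simulated.

```python
import time


def wait_for_bit(read_reg, bit, value, timeout_us=100, delay_us=1):
    """Poll until bit `bit` of read_reg() equals `value`.

    Returns True on success and False once timeout_us has elapsed.
    The register is always read at least once.
    """
    deadline = time.monotonic() + timeout_us / 1e6
    while True:
        if ((read_reg() >> bit) & 1) == value:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(delay_us / 1e6)


class FakeStatus:
    """Simulated status register: bit 1 becomes set on the Nth read."""

    def __init__(self, reads_until_set):
        self.reads = 0
        self.reads_until_set = reads_until_set

    def read(self):
        self.reads += 1
        return 0x2 if self.reads >= self.reads_until_set else 0x0


status = FakeStatus(reads_until_set=3)
# Generous timeout here so the simulated example does not depend on
# the sleep granularity of the host.
print(wait_for_bit(status.read, bit=1, value=1, timeout_us=100000))
```

The timeout is a parameter on purpose: whether 100 usec is enough is exactly the open question, so the value should be easy to change once there is an answer.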
Thanks,
Mathieu
Al and Mike,
With the work on coresight/Perf integration proceeding as planned the
time to start looking at how configuration parameters can be conveyed
to the tracers using the perf cmd tool is fast approaching, and that's
where I need to pick your brain.
In your opinion and based on your experience with coresight, what are
the 5 most wanted configuration options we need to start with?
The question could also be thought of as the 5 most common things
people configure when using coresight. Finding a way to give access
to all the configuration options of a tracer via cmd line won't be easy
but I believe it can be done. If we find a way to address the most
commonly used options as a starting point the rest should come
easily.
Please think about it and get back to me. My plan is to get the
discussion going with the perf maintainer about the best way to
proceed sometime this week or the next, depending on schedule.
Thanks,
Mathieu
Gentlemen,
As promised in the meeting the work on IntelPT done by Alex Shishkin
can be found on github [1]. All the user space integration work can
be found under tools/perf/. Adrian Hunter (formerly at TI) did most
of the user space work. There is good documentation [2] that shows
how IntelPT is used and how the decoding library is called. On the
flip side the difference between full trace and snapshot mode isn't all
that clear - I will touch base on that in my status update in SF. Get
back to me if you really can't wait that long and I'll be happy to
clarify.
I also bring your attention to two web pages. The first one [3] is an
awesome wiki on perf where most of the basics are highlighted. The
second one [4] is Brendan Gregg's in depth look at how he uses perf to
debug real life problems.
Regards,
Mathieu
[1]. https://github.com/virtuoso/linux-perf/tree/intel_pt
[2]. https://github.com/virtuoso/linux-perf/blob/intel_pt/tools/perf/Documentati…
[3]. https://perf.wiki.kernel.org/index.php/Tutorial
[4]. http://www.brendangregg.com/perf.html
Gentlemen,
Here is the log for this week's team meeting.
Tor, is the timing of the meeting bad for you? You did not attend the
meeting, and we all wanted to hear a few words of feedback from you on
the code Mike published two weeks ago. Have you had a chance to look
at that?
--
Best Regards,
Serge Broslavsky <serge.broslavsky(a)linaro.org>
Core Development Project Manager, Linaro
M: +37129426328 IRC: ototo Skype: serge.broslavsky
http://linaro.org | Open source software for ARM SoCs