On Mon, 1 Oct 2018 at 06:52, Mike Leach mike.leach@linaro.org wrote:
Hi, On Mon, 1 Oct 2018 at 12:33, leo.yan@linaro.org wrote:
On Mon, Oct 01, 2018 at 09:19:33AM +0100, Robert Walker wrote:
[...]
+#ifdef HAVE_CSTRACE_INSTR_INFO
+static int cs_etm_decoder__instr_count(const ocsd_generic_trace_elem *elem) +{
return elem->num_instr_range;
+}
+static int cs_etm_decoder__last_instr_size(const ocsd_generic_trace_elem *elem) +{
return elem->last_instr_sz;
+}
+#else
+static int cs_etm_decoder__instr_count(const ocsd_generic_trace_elem *elem) +{
/* Assume a 4-byte instruction size - will be wrong for T32 */
WARN_ONCE(elem->isa == ocsd_isa_thumb2,
"Instruction counts not available for T32. Please upgrade to OpenCSD >= 0.9.0\n");
For this case, the program should report failure and directly exit?
Here why not check for A32 instruction case? Before version 0.8.x has supported A32 instructions? I understand A32 and A64 both have 4 bytes instruction width, but I don't know if the old OpenCSD lib supports to decode both of them?
The old version of OpenCSD did support A32 and T32 - samples from A32 trace will work with the current version of perf as the assumption of 4 byte instruction sizes for A64 also applies to A32.
Thanks for explaination.
return (elem->en_addr - elem->st_addr) / 4;
+}
+static +int cs_etm_decoder__last_instr_size(const ocsd_generic_trace_elem *elem __maybe_unused) +{
return 4;
+}
+#endif
alternate not needed if you decide not to build against insufficient version.
Just curious, if the API ocsd_get_version() has been provided by all OpenCSD versions?
Seems to me, this patch is to keep backwards compability with v0.8.x; I think we can still support perf build with old version OpenCSD, at the runtime the perf cs_etm decoder can get the OpenCSD version number by using API ocsd_get_version() and check if the OpenCSD version is sufficient for decoding current trace data, if not then reports error and directly exit from program.
ocsd_get_version() is provided by all versions of OpenCSD. The features added in 0.9.x to provide the instruction counts add a new member to the struct that describes each range of executed instructions - so the new code doesn't compile if the old library is installed. Calling ocsd_get_version() could be used to check for the unlikely case that perf has been compiled against 0.9.x, but is now being run against an older version - is this worth doing?
Ah, I missed the building failure when use the new kernel with old library.
From the developer view, I bais to Mike's suggestion to directly report failure when build new kernel with old verison lib; the code can keep as simple as possible and we can easily build latest version OpenCSD lib by ourselves.
Be honest, I am not confident for how this works with distros; this is not for a developer but for end users. E.g. Debian / Ubuntu have officially released for the OpenCSD v0.8.x, should the new kernel support these old version OpenCSD libs? I am not sure if this is the case, if OpenCSD libs have not been released in Debian/Ubuntu, then this is not concern at all.
The first Debian distro to use OpenCSD used 0.8.4. They have picked up later versions as they have been released. Don't know if Ubuntu are picking this up yet.
Any user that is skilled enough to get / use / build an updated version of perf that is not in their current kernel revision, will have the necessary skills to find the latest version of the OpenCSD library should they need it.
I totally agree with the opinion above and have been since the very first release of this patchset. But I asked around at Linaro and was told that most utilities will support building against an older library (yielding a reduced feature set of course). People will typically use autoconf/automake to provide this kind of functionality but I think it is way overkill, hence the idea of introducing a #define in the openCSD library to conditionally compile things.
I am willing to try to do the right thing. As such I suggest we accommodate minor library revision but mandate to move to the latest for major releases. This is something we can talk about in Manchester if we haven't come up with an agreement by that time.
Mathieu
Mike
[...]
+static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
u64 addr) {
Nitpick: add a new line for '{'.
u8 instrBytes[2];
/*
* The packet records the execution range with an exclusive end address
*
* A64 instructions are constant size, so the last executed
* instruction is A64_INSTR_SIZE before the end address
* Will need to do instruction level decode for T32 instructions as
* they can be variable size (not yet supported).
cs_etm__mem_access(etmq, addr, ARRAY_SIZE(instrBytes), instrBytes);
/* T32 instruction size is indicated by bits[15:11] of the first
* 16-bit word of the instruction: 0b11101, 0b11110 and 0b11111
* denote a 32-bit instruction. */
return packet->end_addr - A64_INSTR_SIZE;
return ((instrBytes[1] & 0xF8) >= 0xE8) ? 4 : 2;
Could use macro to replace 0xF8/0xE8 for more readable?
I don't think adding a macro will make this any more readable - there's a detailed comment about the encoding above.
Yeah, after read the comments I understand the logic :)
Another question, look into the flow as showed blow, T32 must dependent on DSO to calculate instruction address; just curious if we can relay on decoder to decide the T32 instruction address and remove dependency on DSO?
cs_etm__instr_addr() `-> cs_etm__t32_instr_size() `-> cs_etm__mem_access() `-> dso__data_read_offset()
[...]
-static inline u64 cs_etm__instr_addr(const struct cs_etm_packet *packet, +static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq,
{const struct cs_etm_packet *packet, u64 offset)
/*
* Only A64 instructions are currently supported, so can get
* instruction address by muliplying.
* Will need to do instruction level decode for T32 instructions as
* they can be variable size (not yet supported).
*/
return packet->start_addr + offset * A64_INSTR_SIZE;
if (packet->isa == CS_ETM_ISA_T32) {
u64 addr = packet->start_addr;
while (offset > 0) {
addr += cs_etm__t32_instr_size(etmq, addr);
offset--;
Should be: offset -= cs_etm__t32_instr_size(etmq, addr)?
No - offset is the number of the instruction we want to find the address of
- e.g. the decoder will tell us that the block has 100 instructions and we
want to generate a sample on the 57th, so we step 57 instructions through the block.
Understand now, sorry for noise.
[...]
Thanks, Leo Yan
-- Mike Leach Principal Engineer, ARM Ltd. Manchester Design Centre. UK _______________________________________________ CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight