On Mon, Oct 01, 2018 at 09:19:33AM +0100, Robert Walker wrote:
[...]
+#ifdef HAVE_CSTRACE_INSTR_INFO
+static int cs_etm_decoder__instr_count(const ocsd_generic_trace_elem *elem) +{
return elem->num_instr_range;
+}
+static int cs_etm_decoder__last_instr_size(const ocsd_generic_trace_elem *elem) +{
return elem->last_instr_sz;
+}
+#else
+static int cs_etm_decoder__instr_count(const ocsd_generic_trace_elem *elem) +{
/* Assume a 4-byte instruction size - will be wrong for T32 */
WARN_ONCE(elem->isa == ocsd_isa_thumb2,
"Instruction counts not available for T32. Please upgrade to OpenCSD >= 0.9.0\n");
For this case, the program should report failure and directly exit?
Here why not check for A32 instruction case? Before version 0.8.x has supported A32 instructions? I understand A32 and A64 both have 4 bytes instruction width, but I don't know if the old OpenCSD lib supports to decode both of them?
The old version of OpenCSD did support A32 and T32 - samples from A32 trace will work with the current version of perf as the assumption of 4 byte instruction sizes for A64 also applies to A32.
Thanks for explaination.
return (elem->en_addr - elem->st_addr) / 4;
+}
+static +int cs_etm_decoder__last_instr_size(const ocsd_generic_trace_elem *elem __maybe_unused) +{
return 4;
+}
+#endif
alternate not needed if you decide not to build against insufficient version.
Just curious, if the API ocsd_get_version() has been provided by all OpenCSD versions?
Seems to me, this patch is to keep backwards compability with v0.8.x; I think we can still support perf build with old version OpenCSD, at the runtime the perf cs_etm decoder can get the OpenCSD version number by using API ocsd_get_version() and check if the OpenCSD version is sufficient for decoding current trace data, if not then reports error and directly exit from program.
ocsd_get_version() is provided by all versions of OpenCSD. The features added in 0.9.x to provide the instruction counts add a new member to the struct that describes each range of executed instructions - so the new code doesn't compile if the old library is installed. Calling ocsd_get_version() could be used to check for the unlikely case that perf has been compiled against 0.9.x, but is now being run against an older version - is this worth doing?
Ah, I missed the building failure when use the new kernel with old library.
From the developer view, I bais to Mike's suggestion to directly report
failure when build new kernel with old verison lib; the code can keep as simple as possible and we can easily build latest version OpenCSD lib by ourselves.
Be honest, I am not confident for how this works with distros; this is not for a developer but for end users. E.g. Debian / Ubuntu have officially released for the OpenCSD v0.8.x, should the new kernel support these old version OpenCSD libs? I am not sure if this is the case, if OpenCSD libs have not been released in Debian/Ubuntu, then this is not concern at all.
[...]
+static inline int cs_etm__t32_instr_size(struct cs_etm_queue *etmq,
u64 addr) {
Nitpick: add a new line for '{'.
u8 instrBytes[2];
/*
* The packet records the execution range with an exclusive end address
*
* A64 instructions are constant size, so the last executed
* instruction is A64_INSTR_SIZE before the end address
* Will need to do instruction level decode for T32 instructions as
* they can be variable size (not yet supported).
cs_etm__mem_access(etmq, addr, ARRAY_SIZE(instrBytes), instrBytes);
/* T32 instruction size is indicated by bits[15:11] of the first
* 16-bit word of the instruction: 0b11101, 0b11110 and 0b11111
* denote a 32-bit instruction. */
return packet->end_addr - A64_INSTR_SIZE;
return ((instrBytes[1] & 0xF8) >= 0xE8) ? 4 : 2;
Could use macro to replace 0xF8/0xE8 for more readable?
I don't think adding a macro will make this any more readable - there's a detailed comment about the encoding above.
Yeah, after read the comments I understand the logic :)
Another question, look into the flow as showed blow, T32 must dependent on DSO to calculate instruction address; just curious if we can relay on decoder to decide the T32 instruction address and remove dependency on DSO?
cs_etm__instr_addr() `-> cs_etm__t32_instr_size() `-> cs_etm__mem_access() `-> dso__data_read_offset()
[...]
-static inline u64 cs_etm__instr_addr(const struct cs_etm_packet *packet, +static inline u64 cs_etm__instr_addr(struct cs_etm_queue *etmq,
{const struct cs_etm_packet *packet, u64 offset)
/*
* Only A64 instructions are currently supported, so can get
* instruction address by muliplying.
* Will need to do instruction level decode for T32 instructions as
* they can be variable size (not yet supported).
*/
return packet->start_addr + offset * A64_INSTR_SIZE;
if (packet->isa == CS_ETM_ISA_T32) {
u64 addr = packet->start_addr;
while (offset > 0) {
addr += cs_etm__t32_instr_size(etmq, addr);
offset--;
Should be: offset -= cs_etm__t32_instr_size(etmq, addr)?
No - offset is the number of the instruction we want to find the address of
- e.g. the decoder will tell us that the block has 100 instructions and we
want to generate a sample on the 57th, so we step 57 instructions through the block.
Understand now, sorry for noise.
[...]
Thanks, Leo Yan