Hi all,
Considering the big progress achieved in the CoreSight drivers, perf, and OpenCSD, the prerequisites for developing branch tracing in GDB for ARM processors, based on ETM, are now available. I am therefore publishing this request for comments and look forward to your feedback on this proposal.
Non intrusive execution recording for GDB using ARM CoreSight
Status of this Memo
This memo provides information for Linaro coresight and toolchain communities. Distribution of this memo is unlimited.
Abstract
This memo describes a method for recording program execution in GDB in a non-intrusive way. The method is based on CoreSight hardware tracing, available on ARM Cortex devices.
Table of Contents
1 Introduction
2 State of the art
3 Use cases
  3.1 Self hosted debug monitor
  3.2 Remote debug monitor
  3.3 External debugger
4 Implementation needs
  4.1 Self hosted debug monitor
  4.2 Remote Debug monitor
  4.3 External debugger
5 Remote protocol execution sequence
6 Remote protocol extensions
7 Solutions and alternatives
  7.1 Scope definition
  7.2 CoreSight infrastructure exposure to the user
  7.3 Parameters needed for parsing traces
1. Introduction
CoreSight technology offers a toolset for tracing the execution of a program on a CPU, and for routing the traces to an external trace port analyzer or storing them in a dedicated internal memory. Those traces do not affect system performance and can be used as a record of program execution.
GDB offers reverse debugging by recording program execution and storing it. GDB offers either a full record or a program flow (branch) record. Records can be replayed later for forward or backward debugging.
This request for comments is about realizing the GDB record and replay functionality using CoreSight technology. It presents typical use cases and discusses different alternatives for realizing this feature.
2. State of the art
GDB currently supports two execution recording variants:
- full record: registers as well as memory are recorded for each instruction. In this case GDB collects the registers and the involved memory areas after each instruction. This variant currently has no support for hardware acceleration.
- branch record: only the program flow is recorded. In this case GDB collects a list of linearly executed instruction sequences called blocks. Each branch terminates the previous block and starts a new one. Branch recording is currently implemented either without hardware acceleration, or using the Intel Branch Trace Store ("bts") and Intel Processor Trace ("pt") hardware on supported CPUs.
3. Use cases
Programs running on ARM processors can be debugged in many configurations. Three of them are selected in this RFC as a basis for discussion:
3.1. Self hosted debug monitor
These are systems where the debugger program runs on the same CPU as the debugged program and monitors it. The user interacts with the debugging session on the target host itself. Linux gdb is an example of such a system. The following setup is considered:
- Target: a process running on an ARM Cortex-A
- Debugger: GNU gdb via the ptrace API (arm-linux-gnueabihf-gdb)
+-----------------------------------+
|              Target               |
|                    +------------+ |
| +------+           | Coresight  | |
| |      |           | components:| |
| | GDB  |<--------->|            | |
| |      |     ^     | DWT, ETM,  | |
| +------+     |     | ITM, TPIU  | |
|    ^         |     | TMC, ETB   | |
|    |         |     +------------+ |
+----|---------|--------------------+
     |         |
     |         |
 arm-linux-    |
 gnueabihf-    |
 gdb           |
          debug: ptrace
          trace: perf/CoreSight drivers
3.2. Remote debug monitor
These are usually systems where the debugger program runs on the same CPU as the debugged program and monitors it. The user interacts with the debugging session remotely from a PC. Linux gdb is an example of such a system. The following setup is considered:
- Target: a process running on an ARM Cortex-A
- Gdb server: GNU gdbserver (arm-linux-gnueabihf-gdbserver)
- Gdb client: GNU gdb (arm-linux-gnueabihf-gdb)
- UI: eclipse with the needed plugins; the MI interface is used.
+--------------------------+     +---------------------------------------+
|           Host           |     |                Target                 |
|                          |     |                        +------------+ |
| +-----+     +------+     |     |     +------+           | Coresight  | |
| |     |     | GDB  |     |     |     | GDB  |           | components:| |
| | UI  |<--->|      |<--->|<--->|<--->|      |<--------->|            | |
| |     |  ^  |Client|  ^  |  ^  |     |Server|     ^     | DWT, ETM,  | |
| +-----+  |  +------+  |  |  |  |     +------+     |     | ITM, TPIU  | |
|    ^     |     ^      |  |  |  |        ^         |     | TMC, ETB   | |
|    |     |     |      |  |  |  |        |         |     +------------+ |
+----|-----|-----|------|--+  |  +--------|---------|--------------------+
     |     |     |      |     |           |         |
     |     |     |      |     |           |         |
  Eclipse  | arm-linux- |     |      arm-linux-     |
           | gnueabihf- |   TCP/IP   gnueabihf-     |
           | gdb        |    UART    gdbserver      |
        GDB MI     GDB remote                 debug: ptrace
                   protocol                   trace: perf/CoreSight drivers
3.3. External debugger
These are systems where an external debugger is used. It accesses the target using JTAG or SWD. The target is usually a bare metal embedded system or a system with an RTOS. As an example, the following setup is considered:
- Target: firmware running on an ARM Cortex-M.
- Debugger: external debug and trace device.
- Gdb server: OpenOcd.
- Gdb client: arm-none-eabi-gdb.
- UI: eclipse with the needed plugins; the MI interface is used.
+--------------------------------------+     +-------+     +-------------+
|                 Host                 |     | dbggr |     |   Target    |
|                                      |     |       |     |             |
| +-----+     +------+     +------+    |     |       |     |  Coresight  |
| |     |     | GDB  |     | GDB  |    |     | Debug |     | components: |
| | UI  |<--->|      |<--->|      |<-->|<--->|   +   |<--->|             |
| |     |  ^  |Client|  ^  |Server|    |  ^  | Trace |  ^  |  DWT, ETM,  |
| +-----+  |  +------+  |  +------+    |  |  |       |  |  |  ITM, TPIU  |
|    ^     |     ^      |     ^        |  |  |       |  |  |             |
|    |     |     |      |     |        |  |  |       |  |  |             |
+----|-----|-----|------|-----|--------+  |  +-------+  |  +-------------+
     |     |     |      |     |           |             |
     |     |     |      |     |           |             |
  Eclipse  | arm-none- |  OpenOcd        |             |
           | eabi-gdb  |   PyOcd         |             |
           |            |                 |             |
        GDB MI     GDB remote         Ethernet    debug: JTAG/SWD
                   protocol              USB      trace: Serial/Parallel
4. Implementation needs
4.1 Self hosted debug monitor
gdb: arm-linux-gnueabihf-gdb
The interface defined in btrace.h for capturing and processing traces has to be implemented for ARM CoreSight.
Needed actions:
- in btrace-common.h: add the structures needed for capturing and handling ETM traces
- in linux-btrace.h:
  - add btrace_tinfo_etm
  - amend btrace_target_info
- in linux-btrace.c: change the following functions to support ETM traces
  - linux_enable_btrace
  - linux_disable_btrace
  - linux_read_btrace
  - linux_btrace_conf
- in arm-linux-nat.c: add an API to
  - configure btrace
  - enable btrace
  - disable btrace
  - read btrace
- in btrace.c:
  - btrace_add_pc
  - btrace_fetch has to be implemented for CoreSight. This means using the OpenCSD library to parse ETM traces and then reconstructing the executed instructions accordingly (btrace_compute_ftrace_1)
- in record-btrace.c:
  - add a command for showing the record btrace etm options
  - add a command for starting tracing with CoreSight, and its handler (cmd_record_btrace_etm_start)
  - adapt cmd_show_record_btrace_cpu
  ...
perf:
Needed actions:
- make sure that perf can start/stop tracing a process with its threads, collect ETM traces and deliver them to the user
4.2 Remote Debug monitor
The changes described in 4.1 are needed. In addition, the following changes are needed to support the remote protocol.
gdb server: arm-linux-gnueabihf-gdbserver
Needed actions:
- in linux-low:
  - linux_low_read_btrace: add support for formatting ETM traces
  - linux_low_btrace_conf: add support for formatting the ETM configuration
gdb client: arm-linux-gnueabihf-gdb
Needed actions:
- in remote.c:
  - adapt enable_btrace
  - adapt disable_btrace
- in btrace.c:
  - parse_xml_btrace: update btrace.dtd [2] and the related data structures btrace_xxx
  - parse_xml_btrace_conf: update btrace-conf.dtd [3] and the related data structures btrace_conf_xxx
- extend the remote protocol handling to support CoreSight ETM traces
UI: eclipse
Needed actions:
- make sure that the plugin for recording execution and replaying it copes well with arm-linux
The remote protocol needs to be extended by:
-1- Adding Qbtrace:CoreSight (or etm) to start collecting ETM traces
-2- Amending the 'Branch Trace Format' XML specification to cover the transfer of ETM traces
-3- Amending the 'Branch Trace Configuration Format' XML specification to cover the parameters needed for ETM
4.3 External debugger
The changes described in 4.2 are needed. In addition, the following changes are needed to support tracing on a remote target through an external debugger (bare metal embedded system).
gdb server: OpenOcd
Needed actions:
- rework the ETM driver to bring it up to date
- add a driver for configuring the trace interconnect IPs
- rework the driver for the TPIU
- integrate support for a trace port analyzer
- extend the remote protocol implementation to support recording
The CoreSight infrastructure of the SoC is to be set up in OpenOcd through configuration files. Parameters that are not relevant for gdb are also specified in configuration files (trace sink, trace protocol, port size, trace synch frequency, cycle accurate tracing etc ...).
gdb client: arm-none-eabi-gdb
Needed actions:
- extend the remote protocol to support CoreSight ETM traces
- integrate an ETM trace parsing library
- interface the parser to record_btrace_target
In addition to 4.2, the remote protocol needs to be extended by:
- Adding Qbtrace-conf:CoreSight:core=value to support multicore SoCs
- Adding Qbtrace-conf:CoreSight:id=value to support demultiplexing multiple trace sources
- Adding Qbtrace-conf:CoreSight:filter:context=value to support filtering traces belonging to a given process/thread
- Adding Qbtrace-conf:CoreSight:filter:start-address=value and Qbtrace-conf:CoreSight:filter:end-address=value to support filtering traces for given functions/blocks/libs
- Adding Qbtrace-conf:CoreSight:trigger:on-address=value and Qbtrace-conf:CoreSight:trigger:off-address=value to support starting or stopping tracing when a certain function/block/lib is executed
Alternatively, some of the configuration related to filtering and triggering can be delegated to the GDB server.
UI: eclipse
- test and verify that existing plugins cope well with the gdb extensions
5. Remote protocol execution sequence
gdb and gdbserver communicate using the gdb remote protocol. On a semantic level, a tracing session runs through the following sequence:
(1) gdb client queries gdb server support for branch trace
(2) gdb server answers with
    - qXfer:btrace:read
    - qXfer:btrace-conf:read
    - Qbtrace:off
    - Qbtrace:CoreSight
    - Qbtrace-conf:CoreSight:xxx where xxx is the parameter name
(3) gdb client sends the command to start emitting and collecting traces (Qbtrace:CoreSight)
(4) gdb server executes the command
(5) gdb client sends the command to stop emitting and collecting traces (Qbtrace:off)
(6) gdb server executes the command
(7) gdb client sends the command to get the collected traces from the trace sink (qXfer:btrace:read:annex:offset,length)
(8) gdb server executes the command and sends back the collected traces
(9) gdb client parses the traces and reconstructs the target states
6. Remote protocol extensions
The remote protocol needs to be extended with the following primitive to support CoreSight tracing:
- start tracing and trace capture using CoreSight (Qbtrace:CoreSight)
The remote protocol can additionally be extended with the following primitives to take advantage of ETM functionalities:
- select the core to trace on a multicore system: gdb client sends a command to select the core to trace (Qbtrace-conf:CoreSight:core=value)
- set the trace id for the traces: gdb client sends a command to set the trace id (Qbtrace-conf:CoreSight:id=value)
- select the context to trace: gdb client sends a command to select the context to trace (Qbtrace-conf:CoreSight:filter:context=value)
- select the address range to trace: gdb client sends commands to select the address range to trace (Qbtrace-conf:CoreSight:filter:start-address=value) (Qbtrace-conf:CoreSight:filter:end-address=value)
- set triggers for starting and stopping tracing: gdb client sends commands to select the addresses that trigger tracing on and off (Qbtrace-conf:CoreSight:trigger:on-address=value) (Qbtrace-conf:CoreSight:trigger:off-address=value)
7. Alternatives and discussions
7.1. Scope definition
The CoreSight ETM IP comes in many versions and many implementations. Depending on its capabilities, it can trace instructions only, or instructions together with the involved data and data addresses. All ETM variants support instruction tracing and can therefore be used for branch tracing.
7.2. CoreSight infrastructure exposure to the user
This section is about assigning the responsibility for configuring the CoreSight infrastructure to generate and route traces. Two alternatives are possible:
- The CoreSight infrastructure is exposed to the gdb client (and UI): in this alternative the user or the UI is responsible for configuring the CoreSight IPs in the SoC, by accessing their registers directly or via the CoreSight driver. The remote protocol is used to configure the trace sink (ETB or TPA) to start/stop collecting traces.
- The CoreSight infrastructure is not exposed outside of gdbserver: in this case high level commands can be provided by the gdbserver remote protocol to set up and configure the CoreSight IPs in the SoC.
My recommendation is to extend the remote protocol to provide high level commands to set up and configure the CoreSight IPs in the SoC, or to use a different channel to pass configuration parameters to the gdb server.
7.3. Parameters needed for parsing traces
Some configuration parameters, such as the ETM version and the trace id (content of the registers ETMCR, ETMIDR, ETMCCER, ETMTRACEIDR), are needed for extracting and parsing ETM traces. Those parameters need to be exchanged between the gdb server and the client. The following alternatives are possible:
- extend the remote protocol to get those parameters with explicit queries
- add them to the content of the response to qXfer:btrace-conf:read
- add them to the content of the response to qXfer:btrace:read
Best Regards
Zied Guermazi
From: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Suzuki noticed that the GENMASK() macro would be more useful in a generic
header, and after looking I noticed we already have it in our copy of
include/linux/bits.h in tools/include, so just use it. Test built on
x86-64 and ubuntu 19.04 with:
perfbuilder@46646c9e848e:/$ aarch64-linux-gnu-gcc --version |& head -1
aarch64-linux-gnu-gcc (Ubuntu/Linaro 8.3.0-6ubuntu1) 8.3.0
perfbuilder@46646c9e848e:/$
Suggested-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
Link: https://lkml.kernel.org/r/68c1c548-33cd-31e8-100d-7ffad008c7b2@arm.com
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Leo Yan <leo.yan(a)linaro.org>
Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: coresight(a)lists.linaro.org
Cc: linux-arm-kernel(a)lists.infradead.org
Link: https://lkml.kernel.org/n/tip-69pd3mqvxdlh2shddsc7yhyv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
---
tools/perf/util/cs-etm.h | 11 +----------
1 file changed, 1 insertion(+), 10 deletions(-)
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h
index 33b57e748c3d..bc848fd095f4 100644
--- a/tools/perf/util/cs-etm.h
+++ b/tools/perf/util/cs-etm.h
@@ -9,6 +9,7 @@
#include "util/event.h"
#include "util/session.h"
+#include <linux/bits.h>
/* Versionning header in case things need tro change in the future. That way
* decoding of old snapshot is still possible.
@@ -161,16 +162,6 @@ struct cs_etm_packet_queue {
#define CS_ETM_INVAL_ADDR 0xdeadbeefdeadbeefUL
-/*
- * Create a contiguous bitmask starting at bit position @l and ending at
- * position @h. For example
- * GENMASK_ULL(39, 21) gives us the 64bit vector 0x000000ffffe00000.
- *
- * Carbon copy of implementation found in $KERNEL/include/linux/bitops.h
- */
-#define GENMASK(h, l) \
- (((~0UL) - (1UL << (l)) + 1) & (~0UL >> (BITS_PER_LONG - 1 - (h))))
-
#define BMVAL(val, lsb, msb) ((val & GENMASK(msb, lsb)) >> lsb)
#define CS_ETM_HEADER_SIZE (CS_HEADER_VERSION_0_MAX * sizeof(u64))
--
2.20.1
From: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Add support for CPU-wide trace scenarios by correlating range packets
with timestamp packets. That way range packets received on different
ETMQ/traceID channels can be processed and synthesized in chronological
order.
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Tested-by: Leo Yan <leo.yan(a)linaro.org>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Suzuki Poulose <suzuki.poulose(a)arm.com>
Cc: coresight(a)lists.linaro.org
Cc: linux-arm-kernel(a)lists.infradead.org
Link: http://lkml.kernel.org/r/20190524173508.29044-18-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
---
tools/perf/util/cs-etm.c | 254 +++++++++++++++++++++++++++++++++++++--
1 file changed, 246 insertions(+), 8 deletions(-)
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 91496a3a2209..0c7776b51045 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -90,12 +90,26 @@ struct cs_etm_queue {
};
static int cs_etm__update_queues(struct cs_etm_auxtrace *etm);
+static int cs_etm__process_queues(struct cs_etm_auxtrace *etm);
static int cs_etm__process_timeless_queues(struct cs_etm_auxtrace *etm,
pid_t tid);
+static int cs_etm__get_data_block(struct cs_etm_queue *etmq);
+static int cs_etm__decode_data_block(struct cs_etm_queue *etmq);
/* PTMs ETMIDR [11:8] set to b0011 */
#define ETMIDR_PTM_VERSION 0x00000300
+/*
+ * A struct auxtrace_heap_item only has a queue_nr and a timestamp to
+ * work with. One option is to modify the auxtrace_heap_XYZ() API or simply
+ * encode the etm queue number as the upper 16 bit and the channel as
+ * the lower 16 bit.
+ */
+#define TO_CS_QUEUE_NR(queue_nr, trace_id_chan) \
+ (queue_nr << 16 | trace_chan_id)
+#define TO_QUEUE_NR(cs_queue_nr) (cs_queue_nr >> 16)
+#define TO_TRACE_CHAN_ID(cs_queue_nr) (cs_queue_nr & 0x0000ffff)
+
static u32 cs_etm__get_v7_protocol_version(u32 etmidr)
{
etmidr &= ETMIDR_PTM_VERSION;
@@ -147,6 +161,29 @@ void cs_etm__etmq_set_traceid_queue_timestamp(struct cs_etm_queue *etmq,
etmq->pending_timestamp = trace_chan_id;
}
+static u64 cs_etm__etmq_get_timestamp(struct cs_etm_queue *etmq,
+ u8 *trace_chan_id)
+{
+ struct cs_etm_packet_queue *packet_queue;
+
+ if (!etmq->pending_timestamp)
+ return 0;
+
+ if (trace_chan_id)
+ *trace_chan_id = etmq->pending_timestamp;
+
+ packet_queue = cs_etm__etmq_get_packet_queue(etmq,
+ etmq->pending_timestamp);
+ if (!packet_queue)
+ return 0;
+
+ /* Acknowledge pending status */
+ etmq->pending_timestamp = 0;
+
+ /* See function cs_etm_decoder__do_{hard|soft}_timestamp() */
+ return packet_queue->timestamp;
+}
+
static void cs_etm__clear_packet_queue(struct cs_etm_packet_queue *queue)
{
int i;
@@ -171,6 +208,20 @@ static void cs_etm__clear_packet_queue(struct cs_etm_packet_queue *queue)
}
}
+static void cs_etm__clear_all_packet_queues(struct cs_etm_queue *etmq)
+{
+ int idx;
+ struct int_node *inode;
+ struct cs_etm_traceid_queue *tidq;
+ struct intlist *traceid_queues_list = etmq->traceid_queues_list;
+
+ intlist__for_each_entry(inode, traceid_queues_list) {
+ idx = (int)(intptr_t)inode->priv;
+ tidq = etmq->traceid_queues[idx];
+ cs_etm__clear_packet_queue(&tidq->packet_queue);
+ }
+}
+
static int cs_etm__init_traceid_queue(struct cs_etm_queue *etmq,
struct cs_etm_traceid_queue *tidq,
u8 trace_chan_id)
@@ -458,15 +509,15 @@ static int cs_etm__flush_events(struct perf_session *session,
if (!tool->ordered_events)
return -EINVAL;
- if (!etm->timeless_decoding)
- return -EINVAL;
-
ret = cs_etm__update_queues(etm);
if (ret < 0)
return ret;
- return cs_etm__process_timeless_queues(etm, -1);
+ if (etm->timeless_decoding)
+ return cs_etm__process_timeless_queues(etm, -1);
+
+ return cs_etm__process_queues(etm);
}
static void cs_etm__free_traceid_queues(struct cs_etm_queue *etmq)
@@ -685,6 +736,9 @@ static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm,
unsigned int queue_nr)
{
int ret = 0;
+ unsigned int cs_queue_nr;
+ u8 trace_chan_id;
+ u64 timestamp;
struct cs_etm_queue *etmq = queue->priv;
if (list_empty(&queue->head) || etmq)
@@ -702,6 +756,67 @@ static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm,
etmq->queue_nr = queue_nr;
etmq->offset = 0;
+ if (etm->timeless_decoding)
+ goto out;
+
+ /*
+ * We are under a CPU-wide trace scenario. As such we need to know
+ * when the code that generated the traces started to execute so that
+ * it can be correlated with execution on other CPUs. So we get a
+ * handle on the beginning of traces and decode until we find a
+ * timestamp. The timestamp is then added to the auxtrace min heap
+ * in order to know what nibble (of all the etmqs) to decode first.
+ */
+ while (1) {
+ /*
+ * Fetch an aux_buffer from this etmq. Bail if no more
+ * blocks or an error has been encountered.
+ */
+ ret = cs_etm__get_data_block(etmq);
+ if (ret <= 0)
+ goto out;
+
+ /*
+ * Run decoder on the trace block. The decoder will stop when
+ * encountering a timestamp, a full packet queue or the end of
+ * trace for that block.
+ */
+ ret = cs_etm__decode_data_block(etmq);
+ if (ret)
+ goto out;
+
+ /*
+ * Function cs_etm_decoder__do_{hard|soft}_timestamp() does all
+ * the timestamp calculation for us.
+ */
+ timestamp = cs_etm__etmq_get_timestamp(etmq, &trace_chan_id);
+
+ /* We found a timestamp, no need to continue. */
+ if (timestamp)
+ break;
+
+ /*
+ * We didn't find a timestamp so empty all the traceid packet
+ * queues before looking for another timestamp packet, either
+ * in the current data block or a new one. Packets that were
+ * just decoded are useless since no timestamp has been
+ * associated with them. As such simply discard them.
+ */
+ cs_etm__clear_all_packet_queues(etmq);
+ }
+
+ /*
+ * We have a timestamp. Add it to the min heap to reflect when
+ * instructions conveyed by the range packets of this traceID queue
+ * started to execute. Once the same has been done for all the traceID
+ * queues of each etmq, rendering and decoding can start in
+ * chronological order.
+ *
+ * Note that packets decoded above are still in the traceID's packet
+ * queue and will be processed in cs_etm__process_queues().
+ */
+ cs_queue_nr = TO_CS_QUEUE_NR(queue_nr, trace_chan_id);
+ ret = auxtrace_heap__add(&etm->heap, cs_queue_nr, timestamp);
out:
return ret;
}
@@ -1846,6 +1961,28 @@ static int cs_etm__process_traceid_queue(struct cs_etm_queue *etmq,
return ret;
}
+static void cs_etm__clear_all_traceid_queues(struct cs_etm_queue *etmq)
+{
+ int idx;
+ struct int_node *inode;
+ struct cs_etm_traceid_queue *tidq;
+ struct intlist *traceid_queues_list = etmq->traceid_queues_list;
+
+ intlist__for_each_entry(inode, traceid_queues_list) {
+ idx = (int)(intptr_t)inode->priv;
+ tidq = etmq->traceid_queues[idx];
+
+ /* Ignore return value */
+ cs_etm__process_traceid_queue(etmq, tidq);
+
+ /*
+ * Generate an instruction sample with the remaining
+ * branchstack entries.
+ */
+ cs_etm__flush(etmq, tidq);
+ }
+}
+
static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
{
int err = 0;
@@ -1913,6 +2050,105 @@ static int cs_etm__process_timeless_queues(struct cs_etm_auxtrace *etm,
return 0;
}
+static int cs_etm__process_queues(struct cs_etm_auxtrace *etm)
+{
+ int ret = 0;
+ unsigned int cs_queue_nr, queue_nr;
+ u8 trace_chan_id;
+ u64 timestamp;
+ struct auxtrace_queue *queue;
+ struct cs_etm_queue *etmq;
+ struct cs_etm_traceid_queue *tidq;
+
+ while (1) {
+ if (!etm->heap.heap_cnt)
+ goto out;
+
+ /* Take the entry at the top of the min heap */
+ cs_queue_nr = etm->heap.heap_array[0].queue_nr;
+ queue_nr = TO_QUEUE_NR(cs_queue_nr);
+ trace_chan_id = TO_TRACE_CHAN_ID(cs_queue_nr);
+ queue = &etm->queues.queue_array[queue_nr];
+ etmq = queue->priv;
+
+ /*
+ * Remove the top entry from the heap since we are about
+ * to process it.
+ */
+ auxtrace_heap__pop(&etm->heap);
+
+ tidq = cs_etm__etmq_get_traceid_queue(etmq, trace_chan_id);
+ if (!tidq) {
+ /*
+ * No traceID queue has been allocated for this traceID,
+ * which means something somewhere went very wrong. No
+ * other choice than simply exit.
+ */
+ ret = -EINVAL;
+ goto out;
+ }
+
+ /*
+ * Packets associated with this timestamp are already in
+ * the etmq's traceID queue, so process them.
+ */
+ ret = cs_etm__process_traceid_queue(etmq, tidq);
+ if (ret < 0)
+ goto out;
+
+ /*
+ * Packets for this timestamp have been processed, time to
+ * move on to the next timestamp, fetching a new auxtrace_buffer
+ * if need be.
+ */
+refetch:
+ ret = cs_etm__get_data_block(etmq);
+ if (ret < 0)
+ goto out;
+
+ /*
+ * No more auxtrace_buffers to process in this etmq, simply
+ * move on to another entry in the auxtrace_heap.
+ */
+ if (!ret)
+ continue;
+
+ ret = cs_etm__decode_data_block(etmq);
+ if (ret)
+ goto out;
+
+ timestamp = cs_etm__etmq_get_timestamp(etmq, &trace_chan_id);
+
+ if (!timestamp) {
+ /*
+ * Function cs_etm__decode_data_block() returns when
+ * there are no more traces to decode in the current
+ * auxtrace_buffer OR when a timestamp has been
+ * encountered on any of the traceID queues. Since we
+ * did not get a timestamp, there are no more traces to
+ * process in this auxtrace_buffer. As such empty and
+ * flush all traceID queues.
+ */
+ cs_etm__clear_all_traceid_queues(etmq);
+
+ /* Fetch another auxtrace_buffer for this etmq */
+ goto refetch;
+ }
+
+ /*
+ * Add to the min heap the timestamp for packets that have
+ * just been decoded. They will be processed and synthesized
+ * during the next call to cs_etm__process_traceid_queue() for
+ * this queue/traceID.
+ */
+ cs_queue_nr = TO_CS_QUEUE_NR(queue_nr, trace_chan_id);
+ ret = auxtrace_heap__add(&etm->heap, cs_queue_nr, timestamp);
+ }
+
+out:
+ return ret;
+}
+
static int cs_etm__process_itrace_start(struct cs_etm_auxtrace *etm,
union perf_event *event)
{
@@ -1991,9 +2227,6 @@ static int cs_etm__process_event(struct perf_session *session,
return -EINVAL;
}
- if (!etm->timeless_decoding)
- return -EINVAL;
-
if (sample->time && (sample->time != (u64) -1))
timestamp = sample->time;
else
@@ -2005,7 +2238,8 @@ static int cs_etm__process_event(struct perf_session *session,
return err;
}
- if (event->header.type == PERF_RECORD_EXIT)
+ if (etm->timeless_decoding &&
+ event->header.type == PERF_RECORD_EXIT)
return cs_etm__process_timeless_queues(etm,
event->fork.tid);
@@ -2014,6 +2248,10 @@ static int cs_etm__process_event(struct perf_session *session,
else if (event->header.type == PERF_RECORD_SWITCH_CPU_WIDE)
return cs_etm__process_switch_cpu_wide(etm, event);
+ if (!etm->timeless_decoding &&
+ event->header.type == PERF_RECORD_AUX)
+ return cs_etm__process_queues(etm);
+
return 0;
}
--
2.20.1
From: Mathieu Poirier <mathieu.poirier(a)linaro.org>
This patch deals with timestamp packets received from the decoding
library in order to give the front end packet processing loop a handle
on the time at which instructions conveyed by range packets were executed.
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Tested-by: Leo Yan <leo.yan(a)linaro.org>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Suzuki Poulose <suzuki.poulose(a)arm.com>
Cc: coresight(a)lists.linaro.org
Cc: linux-arm-kernel(a)lists.infradead.org
Link: http://lkml.kernel.org/r/20190524173508.29044-17-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
---
.../perf/util/cs-etm-decoder/cs-etm-decoder.c | 111 +++++++++++++++++-
tools/perf/util/cs-etm.c | 19 +++
tools/perf/util/cs-etm.h | 17 +++
3 files changed, 143 insertions(+), 4 deletions(-)
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
index ce85e52f989c..bb45e23018ee 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
@@ -269,6 +269,75 @@ cs_etm_decoder__create_etm_packet_printer(struct cs_etm_trace_params *t_params,
trace_config);
}
+static ocsd_datapath_resp_t
+cs_etm_decoder__do_soft_timestamp(struct cs_etm_queue *etmq,
+ struct cs_etm_packet_queue *packet_queue,
+ const uint8_t trace_chan_id)
+{
+ /* No timestamp packet has been received, nothing to do */
+ if (!packet_queue->timestamp)
+ return OCSD_RESP_CONT;
+
+ packet_queue->timestamp = packet_queue->next_timestamp;
+
+ /* Estimate the timestamp for the next range packet */
+ packet_queue->next_timestamp += packet_queue->instr_count;
+ packet_queue->instr_count = 0;
+
+ /* Tell the front end which traceid_queue needs attention */
+ cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
+
+ return OCSD_RESP_WAIT;
+}
+
+static ocsd_datapath_resp_t
+cs_etm_decoder__do_hard_timestamp(struct cs_etm_queue *etmq,
+ const ocsd_generic_trace_elem *elem,
+ const uint8_t trace_chan_id)
+{
+ struct cs_etm_packet_queue *packet_queue;
+
+ /* First get the packet queue for this traceID */
+ packet_queue = cs_etm__etmq_get_packet_queue(etmq, trace_chan_id);
+ if (!packet_queue)
+ return OCSD_RESP_FATAL_SYS_ERR;
+
+ /*
+ * We've seen a timestamp packet before - simply record the new value.
+ * Function do_soft_timestamp() will report the value to the front end,
+ * hence asking the decoder to keep decoding rather than stopping.
+ */
+ if (packet_queue->timestamp) {
+ packet_queue->next_timestamp = elem->timestamp;
+ return OCSD_RESP_CONT;
+ }
+
+ /*
+ * This is the first timestamp we've seen since the beginning of traces
+ * or a discontinuity. Since timestamps packets are generated *after*
+ * range packets have been generated, we need to estimate the time at
+ * which instructions started by subtracting the number of instructions
+ * executed from the timestamp.
+ */
+ packet_queue->timestamp = elem->timestamp - packet_queue->instr_count;
+ packet_queue->next_timestamp = elem->timestamp;
+ packet_queue->instr_count = 0;
+
+ /* Tell the front end which traceid_queue needs attention */
+ cs_etm__etmq_set_traceid_queue_timestamp(etmq, trace_chan_id);
+
+ /* Halt processing until we are being told to proceed */
+ return OCSD_RESP_WAIT;
+}
+
+static void
+cs_etm_decoder__reset_timestamp(struct cs_etm_packet_queue *packet_queue)
+{
+ packet_queue->timestamp = 0;
+ packet_queue->next_timestamp = 0;
+ packet_queue->instr_count = 0;
+}
+
static ocsd_datapath_resp_t
cs_etm_decoder__buffer_packet(struct cs_etm_packet_queue *packet_queue,
const u8 trace_chan_id,
@@ -310,7 +379,8 @@ cs_etm_decoder__buffer_packet(struct cs_etm_packet_queue *packet_queue,
}
static ocsd_datapath_resp_t
-cs_etm_decoder__buffer_range(struct cs_etm_packet_queue *packet_queue,
+cs_etm_decoder__buffer_range(struct cs_etm_queue *etmq,
+ struct cs_etm_packet_queue *packet_queue,
const ocsd_generic_trace_elem *elem,
const uint8_t trace_chan_id)
{
@@ -365,6 +435,23 @@ cs_etm_decoder__buffer_range(struct cs_etm_packet_queue *packet_queue,
packet->last_instr_size = elem->last_instr_sz;
+ /* per-thread scenario, no need to generate a timestamp */
+ if (cs_etm__etmq_is_timeless(etmq))
+ goto out;
+
+ /*
+ * The packet queue is full and we haven't seen a timestamp (had we
+ * seen one the packet queue wouldn't be full). Let the front end
+ * deal with it.
+ */
+ if (ret == OCSD_RESP_WAIT)
+ goto out;
+
+ packet_queue->instr_count += elem->num_instr_range;
+ /* Tell the front end we have a new timestamp to process */
+ ret = cs_etm_decoder__do_soft_timestamp(etmq, packet_queue,
+ trace_chan_id);
+out:
return ret;
}
@@ -372,6 +459,11 @@ static ocsd_datapath_resp_t
cs_etm_decoder__buffer_discontinuity(struct cs_etm_packet_queue *queue,
const uint8_t trace_chan_id)
{
+ /*
+ * Something happened and who knows when we'll get new traces so
+ * reset time statistics.
+ */
+ cs_etm_decoder__reset_timestamp(queue);
return cs_etm_decoder__buffer_packet(queue, trace_chan_id,
CS_ETM_DISCONTINUITY);
}
@@ -404,6 +496,7 @@ cs_etm_decoder__buffer_exception_ret(struct cs_etm_packet_queue *queue,
static ocsd_datapath_resp_t
cs_etm_decoder__set_tid(struct cs_etm_queue *etmq,
+ struct cs_etm_packet_queue *packet_queue,
const ocsd_generic_trace_elem *elem,
const uint8_t trace_chan_id)
{
@@ -417,6 +510,12 @@ cs_etm_decoder__set_tid(struct cs_etm_queue *etmq,
if (cs_etm__etmq_set_tid(etmq, tid, trace_chan_id))
return OCSD_RESP_FATAL_SYS_ERR;
+ /*
+ * A timestamp is generated after a PE_CONTEXT element so make sure
+ * to rely on that upcoming one.
+ */
+ cs_etm_decoder__reset_timestamp(packet_queue);
+
return OCSD_RESP_CONT;
}
@@ -446,7 +545,7 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
trace_chan_id);
break;
case OCSD_GEN_TRC_ELEM_INSTR_RANGE:
- resp = cs_etm_decoder__buffer_range(packet_queue, elem,
+ resp = cs_etm_decoder__buffer_range(etmq, packet_queue, elem,
trace_chan_id);
break;
case OCSD_GEN_TRC_ELEM_EXCEPTION:
@@ -457,11 +556,15 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
resp = cs_etm_decoder__buffer_exception_ret(packet_queue,
trace_chan_id);
break;
+ case OCSD_GEN_TRC_ELEM_TIMESTAMP:
+ resp = cs_etm_decoder__do_hard_timestamp(etmq, elem,
+ trace_chan_id);
+ break;
case OCSD_GEN_TRC_ELEM_PE_CONTEXT:
- resp = cs_etm_decoder__set_tid(etmq, elem, trace_chan_id);
+ resp = cs_etm_decoder__set_tid(etmq, packet_queue,
+ elem, trace_chan_id);
break;
case OCSD_GEN_TRC_ELEM_ADDR_NACC:
- case OCSD_GEN_TRC_ELEM_TIMESTAMP:
case OCSD_GEN_TRC_ELEM_CYCLE_COUNT:
case OCSD_GEN_TRC_ELEM_ADDR_UNKNOWN:
case OCSD_GEN_TRC_ELEM_EVENT:
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 17adf554b679..91496a3a2209 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -80,6 +80,7 @@ struct cs_etm_queue {
struct cs_etm_decoder *decoder;
struct auxtrace_buffer *buffer;
unsigned int queue_nr;
+ u8 pending_timestamp;
u64 offset;
const unsigned char *buf;
size_t buf_len, buf_used;
@@ -133,6 +134,19 @@ int cs_etm__get_cpu(u8 trace_chan_id, int *cpu)
return 0;
}
+void cs_etm__etmq_set_traceid_queue_timestamp(struct cs_etm_queue *etmq,
+ u8 trace_chan_id)
+{
+ /*
+ * When a timestamp packet is encountered the backend code
+ * is stopped so that the front end has time to process packets
+ * that were accumulated in the traceID queue. Since there can
+ * be more than one channel per cs_etm_queue, we need to specify
+ * what traceID queue needs servicing.
+ */
+ etmq->pending_timestamp = trace_chan_id;
+}
+
static void cs_etm__clear_packet_queue(struct cs_etm_packet_queue *queue)
{
int i;
@@ -942,6 +956,11 @@ int cs_etm__etmq_set_tid(struct cs_etm_queue *etmq,
return 0;
}
+bool cs_etm__etmq_is_timeless(struct cs_etm_queue *etmq)
+{
+ return !!etmq->etm->timeless_decoding;
+}
+
static int cs_etm__synth_instruction_sample(struct cs_etm_queue *etmq,
struct cs_etm_traceid_queue *tidq,
u64 addr, u64 period)
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h
index b2a7628620bf..33b57e748c3d 100644
--- a/tools/perf/util/cs-etm.h
+++ b/tools/perf/util/cs-etm.h
@@ -150,6 +150,9 @@ struct cs_etm_packet_queue {
u32 packet_count;
u32 head;
u32 tail;
+ u32 instr_count;
+ u64 timestamp;
+ u64 next_timestamp;
struct cs_etm_packet packet_buffer[CS_ETM_PACKET_MAX_BUFFER];
};
@@ -183,6 +186,9 @@ int cs_etm__process_auxtrace_info(union perf_event *event,
int cs_etm__get_cpu(u8 trace_chan_id, int *cpu);
int cs_etm__etmq_set_tid(struct cs_etm_queue *etmq,
pid_t tid, u8 trace_chan_id);
+bool cs_etm__etmq_is_timeless(struct cs_etm_queue *etmq);
+void cs_etm__etmq_set_traceid_queue_timestamp(struct cs_etm_queue *etmq,
+ u8 trace_chan_id);
struct cs_etm_packet_queue
*cs_etm__etmq_get_packet_queue(struct cs_etm_queue *etmq, u8 trace_chan_id);
#else
@@ -207,6 +213,17 @@ static inline int cs_etm__etmq_set_tid(
return -1;
}
+static inline bool cs_etm__etmq_is_timeless(
+ struct cs_etm_queue *etmq __maybe_unused)
+{
+ /* What else to return? */
+ return true;
+}
+
+static inline void cs_etm__etmq_set_traceid_queue_timestamp(
+ struct cs_etm_queue *etmq __maybe_unused,
+ u8 trace_chan_id __maybe_unused) {}
+
static inline struct cs_etm_packet_queue *cs_etm__etmq_get_packet_queue(
struct cs_etm_queue *etmq __maybe_unused,
u8 trace_chan_id __maybe_unused)
--
2.20.1
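
For reviewers, the timestamp estimate done in cs_etm_decoder__do_hard_timestamp() can be modelled outside of perf. The sketch below is only an illustration and is not part of the patch: struct ts_state, enum ts_resp and the ts_*() helpers are hypothetical names that stand in for the three fields added to struct cs_etm_packet_queue and for the OCSD_RESP_CONT / OCSD_RESP_WAIT return values. It assumes one instruction per timestamp tick, as the patch does, and it does not model the soft timestamp path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields added to struct cs_etm_packet_queue. */
struct ts_state {
	uint32_t instr_count;    /* instructions seen since the last estimate */
	uint64_t timestamp;      /* estimated start time, 0 means "none yet" */
	uint64_t next_timestamp; /* last hardware timestamp received */
};

/* Hypothetical stand-in for OCSD_RESP_CONT and OCSD_RESP_WAIT. */
enum ts_resp { TS_CONT, TS_WAIT };

/* Accumulate the size of an instruction range, as buffer_range() does. */
static void ts_add_range(struct ts_state *s, uint32_t num_instr)
{
	assert(s != NULL);
	s->instr_count += num_instr;
}

/* Mirror of the hard timestamp logic for a single traceID queue. */
static enum ts_resp ts_hard_timestamp(struct ts_state *s, uint64_t hw_ts)
{
	assert(s != NULL);

	/* A timestamp was already seen: record the new value and carry on. */
	if (s->timestamp) {
		s->next_timestamp = hw_ts;
		return TS_CONT;
	}

	/*
	 * First timestamp since the start of the trace or a discontinuity:
	 * the ranges were executed before the timestamp was emitted, so go
	 * back in time by the number of instructions executed.
	 */
	s->timestamp = hw_ts - s->instr_count;
	s->next_timestamp = hw_ts;
	s->instr_count = 0;

	/* The caller is expected to stop and let the front end catch up. */
	return TS_WAIT;
}

/* Mirror of cs_etm_decoder__reset_timestamp(). */
static void ts_reset(struct ts_state *s)
{
	assert(s != NULL);
	s->timestamp = 0;
	s->next_timestamp = 0;
	s->instr_count = 0;
}
```

With two ranges of 10 and 5 instructions followed by a hardware timestamp of 1000, the model estimates a start time of 985 and asks the caller to wait. A second hardware timestamp only updates next_timestamp and decoding continues. A call to ts_reset() puts the state back to "no timestamp seen", which is what the patch does on a discontinuity and on a PE_CONTEXT element.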