This patch set explores using Coresight tracing data for postmortem
debugging. When a kernel panic happens, the Coresight panic kdump support
saves on-chip tracing data and tracer metadata into DRAM; it later relies
on kdump and the crash/perf tools to recover the tracing data for
"offline" analysis.
Compared with v4 and earlier series, this patch series has heavily
refactored the implementation after investigating how Intel PT supports
kdump. Intel PT calls a single function to stop tracing in an emergency
when a kernel panic occurs; that function reuses the perf operations to
dump trace data into the ring buffer, and the crash tool later extracts
the trace data from the perf ring buffer.
This patch series follows the Intel PT example and stops the ETM trace
in the same way for perf mode. So far the work focuses primarily on
supporting Coresight kdump with perf mode; we can add support for sysfs
mode later if there is a clearer requirement.
Compared with the previous series, this series also simplifies the
handling of tracer metadata. The old series introduced an extra data
structure and two doubly linked lists to maintain CoreSight kdump
components; one list was used to track tracer metadata and another was
used to track dump buffers, and these two lists were later used to
retrieve the metadata and trace data buffers from the vmcore file. This
series instead relies directly on CoreSight driver global variables to
retrieve the related info, e.g. for perf mode the per-CPU pointer
'ctx_handle' gives the perf ring buffer info and the per-CPU 'csdev_src'
gives the tracer device structure for metadata.
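As a rough sketch of the perf-mode stop path (hypothetical helper name and
call sequence; only the 'ctx_handle' and 'csdev_src' per-CPU variables come
from the description above, and how they are declared is an assumption),
the panicking CPU could do something like:

static void coresight_stop_trace_on_panic(void)
{
	struct coresight_device *csdev = this_cpu_read(csdev_src);
	struct perf_output_handle *handle = this_cpu_ptr(&ctx_handle);

	/* Nothing to do if perf isn't tracing this CPU. */
	if (!csdev || !handle->rb)
		return;

	/*
	 * Disable the tracer so no new data is generated; the sink
	 * contents and the perf AUX ring buffer referenced by the handle
	 * stay in DRAM for the crash extension to parse from the vmcore.
	 */
	source_ops(csdev)->disable(csdev, NULL);
}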
The crash extension program has been enhanced to parse these kernel data
structures and use them to extract the metadata and dump the trace data
[1]; it is also updated to build against the OpenCSD decoder, which
simplifies the decoding process compared with the previous approach of
relying on perf to help decode the trace data.
This patch series has been verified on the 96Boards DB410c with the steps
below; 'long_loop' is a simple program that only executes a large number of
loop iterations, so it generates a large number of branch instructions.
Enable trace on the target board:
$ perf record -e cs_etm/@825000.etf/ --per-thread ./long_loop &
$ sleep 3
$ echo c > /proc/sysrq-trigger
Use the crash tool for postmortem analysis:
$ crash vmcore vmlinux
crash> extend arm_cs_dump.so
crash> arm_cs_dump -o out
[1] https://git.linaro.org/people/leo.yan/crash.git/log/?h=arm_cs_dump_etm_perf
Changes from v4:
* Support for CoreSight ETM with perf mode;
* Added an API to stop trace on crash;
* Simplified the implementation by removing kdump-dedicated data structures
  and functions;
Changes from v3:
* Following Mathieu's suggestion, reworked the panic kdump framework and
  used a kdump array to maintain source and sink device handlers;
* According to Mathieu's suggestion, optimized the panic notifier to
  first dump the panic CPU's tracing data and then dump the other CPUs'
  tracing data;
* Refined doc to reflect these implementation changes;
* Changed ETMv4 driver to add source device handler at probe phase;
* Refactored crash extension program to reflect kernel changes.
Changes from v2:
* Added the two patches for documentation.
* Following Mathieu's suggestion, reworked the panic kdump framework and
  removed the useless flag "PRE_PANIC".
* According to comments, changed to add and delete kdump nodes in the
  sink enable/disable functions;
* According to Mathieu's suggestion, handled kdump node
  addition/deletion/updating separately for the sysfs interface and the
  perf method.
Changes from v1:
* Added support to dump ETMv4 metadata.
* Wrote the 'crash' extension csdump.so and rely on it to generate a
  'perf'-format-compatible file.
* Refactored panic dump driver to support pre & post panic dump.
Changes from RFC:
* Followed Mathieu's suggestion to use a general framework to support the
  dump functionality.
* Changed to use perf to analyse trace data.
Leo Yan (6):
doc: Add Coresight documentation directory
doc: Add documentation for Coresight panic kdump
coresight: etm4x: Save ID values in config structure
coresight: tmc: Update latest value for page index and offset
coresight: etm-perf: Add interface to stop etm trace
arm64: smp: Stop CoreSight trace for kdump
.../trace/{ => coresight}/coresight-cpu-debug.txt | 0
.../trace/coresight/coresight-panic-kdump.txt | 99 ++++++++++++++++++++++
Documentation/trace/{ => coresight}/coresight.txt | 0
MAINTAINERS | 5 +-
arch/arm64/kernel/smp.c | 5 ++
drivers/hwtracing/coresight/Kconfig | 10 +++
drivers/hwtracing/coresight/coresight-etm-perf.c | 10 +++
drivers/hwtracing/coresight/coresight-etm4x.c | 7 ++
drivers/hwtracing/coresight/coresight-etm4x.h | 8 ++
drivers/hwtracing/coresight/coresight-tmc-etf.c | 8 ++
include/linux/coresight.h | 6 ++
11 files changed, 156 insertions(+), 2 deletions(-)
rename Documentation/trace/{ => coresight}/coresight-cpu-debug.txt (100%)
create mode 100644 Documentation/trace/coresight/coresight-panic-kdump.txt
rename Documentation/trace/{ => coresight}/coresight.txt (100%)
--
2.7.4
The CoreSight architecture defines CLAIM tags for a device to negotiate
control of the component (external agent vs. self-hosted). Each device
has a pair of registers (CLAIMSET & CLAIMCLR) for managing the CLAIM
tags. However, the protocol for the CLAIM tags is IMPLEMENTATION DEFINED.
PSCI has recommendations for using the CLAIM tags to negotiate control
for external agent vs. self-hosted use, as defined in
ARM DEN 0022D, Section "6.8.1 Debug and Trace save and restore".
This series implements the protocol recommended by PSCI.
There were two options for the implementation:
1) Have the claim/disclaim operations performed from the coresight
   generic driver - Unfortunately this doesn't work for ETM devices,
   as they need cross-CPU calls to access the CLAIM registers. It also
   complicates error recovery and reference counting.
2) Have the claim/disclaim operations performed from the device
   specific drivers. The disadvantage is that the calls are sprinkled
   in each driver, but this makes the operation much simpler.
This series implements method (2). The first part of the series
prepares different drivers to handle errors from the lower layer
and clean up the state. The second part of the series updates the
existing drivers to claim/disclaim the devices as necessary.
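For reference, a minimal sketch of the recommended handshake is shown
below, assuming the conventional CoreSight management register offsets
(CLAIMSET at 0xFA0, CLAIMCLR at 0xFA4, with reads of CLAIMCLR returning
the current tags) and the tag bits suggested by PSCI (bit 0 for an
external agent, bit 1 for self-hosted software); the helper names are
illustrative only, not the exact API added by this series:

#define CLAIMSET		0xfa0
#define CLAIMCLR		0xfa4
#define CLAIM_EXTERNAL		BIT(0)
#define CLAIM_SELF_HOSTED	BIT(1)

static int claim_device(void __iomem *base)
{
	/* Advertise self-hosted use first ... */
	writel_relaxed(CLAIM_SELF_HOSTED, base + CLAIMSET);
	/* ... then back off if an external agent already holds a claim. */
	if (readl_relaxed(base + CLAIMCLR) & CLAIM_EXTERNAL) {
		writel_relaxed(CLAIM_SELF_HOSTED, base + CLAIMCLR);
		return -EBUSY;
	}
	return 0;
}

static void disclaim_device(void __iomem *base)
{
	writel_relaxed(CLAIM_SELF_HOSTED, base + CLAIMCLR);
}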
Tested with a hacked coresight driver which modifies the external
claim tag via a sysfs handle.
Applies on coresight/next in Mathieu's tree.
Changes since V1:
- Handle errors in the enabling path and disable only the components
  that were enabled in the iteration.
- Fix build break on arm32 (etm3x)
- Update commit description for "coresight: Add support for CLAIM tag protocol"
Suzuki K Poulose (14):
coresight: Handle failures in enabling a trace path
coresight: tmc-etr: Refactor for handling errors
coresight: tmc-etr: Handle errors enabling CATU
coresight: tmc-etb/etf: Prepare to handle errors enabling
coresight: etm4x: Add support for handling errors
coresight: etm3: Add support for handling errors
coresight: etb10: Handle errors enabling the device
coresight: dynamic-replicator: Handle multiple connections
coresight: Add support for CLAIM tag protocol
coresight: etmx: Claim devices before use
coresight: funnel: Claim devices before use
coresight: catu: Claim device before use
coresight: dynamic-replicator: Claim device for use
coresight: tmc: Claim device before use
drivers/hwtracing/coresight/coresight-catu.c | 6 ++
.../coresight/coresight-dynamic-replicator.c | 79 ++++++++++----
drivers/hwtracing/coresight/coresight-etb10.c | 18 +++-
drivers/hwtracing/coresight/coresight-etm3x.c | 56 +++++++---
drivers/hwtracing/coresight/coresight-etm4x.c | 51 ++++++---
drivers/hwtracing/coresight/coresight-funnel.c | 26 ++++-
drivers/hwtracing/coresight/coresight-priv.h | 7 ++
drivers/hwtracing/coresight/coresight-tmc-etf.c | 95 +++++++++++------
drivers/hwtracing/coresight/coresight-tmc-etr.c | 80 +++++++++-----
drivers/hwtracing/coresight/coresight.c | 118 +++++++++++++++++++--
include/linux/coresight.h | 20 ++++
11 files changed, 434 insertions(+), 122 deletions(-)
--
2.7.4
This patch series contains two fixes for updating the ring buffer in the
tmc-etf driver. The first patch fixes the byte-address alignment setting
for RRP; the second patch fixes an issue where trace data was discarded
because barrier packets were written over it in place, and instead keeps
the complete trace data by inserting extra barrier packets.
This patch series has been rebased on the CoreSight next branch:
https://git.linaro.org/kernel/coresight.git/log/?h=next with latest
commit 3733ca5a6578 ("coresight: tmc: Refactor loops in etb dump").
Changes from v1:
* Rebased on CoreSight next branch (Sept 11th, 2018);
* Added checking 'lost || to_read > handle->size' to set 'barrier_sz'.
Leo Yan (2):
coresight: tmc: Fix byte-address alignment for RRP
coresight: tmc: Fix writing barrier packets for ring buffer
drivers/hwtracing/coresight/coresight-tmc-etf.c | 41 +++++++++++++++++--------
1 file changed, 29 insertions(+), 12 deletions(-)
--
2.7.4
We do not enable scatter-gather mode in the TMC-ETR by default
to prevent malfunctioning of systems where the ETR may not be
properly connected to the memory subsystem to allow for simultaneous
READ/WRITE transactions when used in SG mode. Instead we whitelist
the platforms where we know that it is safe to use the mode.
All revisions of Juno have a proper ETR connection and hence we
whitelist them.
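For context, a minimal sketch of how the driver side might gate SG mode on
this property, using the generic device-property API; the 'etr_can_use_sg'
field name is made up for illustration and the actual check in the TMC
driver may differ:

	/* Use ETR scatter-gather mode only where the DT explicitly opts in. */
	if (device_property_present(dev, "arm,scatter-gather"))
		drvdata->etr_can_use_sg = true;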
Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Mike Leach <mike.leach(a)linaro.org>
Cc: Sudeep Holla <sudeep.holla(a)arm.com>
Cc: Liviu Dudau <liviu.dudau(a)arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi(a)arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
---
arch/arm64/boot/dts/arm/juno-base.dtsi | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/boot/dts/arm/juno-base.dtsi b/arch/arm64/boot/dts/arm/juno-base.dtsi
index ce56a4a..3596e5d 100644
--- a/arch/arm64/boot/dts/arm/juno-base.dtsi
+++ b/arch/arm64/boot/dts/arm/juno-base.dtsi
@@ -199,6 +199,7 @@
clocks = <&soc_smc50mhz>;
clock-names = "apb_pclk";
power-domains = <&scpi_devpd 0>;
+ arm,scatter-gather;
port {
etr_in_port: endpoint {
slave-mode;
--
2.7.4
From the comment in the code, it claims the requirement for byte-address
alignment for the RRP register: 'for 32-bit, 64-bit and 128-bit wide trace
memory, the four LSBs must be 0s. For 256-bit wide trace memory, the
five LSBs must be 0s'. The code isn't consistent with this: it sets the
five LSBs to zero for 32/64/128-bit wide trace memory and the six LSBs
to zero for 256-bit wide trace memory.
Checking the CoreSight Trace Memory Controller technical reference
manual (ARM DDI 0461B, section 3.3.4 RAM Read Pointer Register) confirms
that the comment is right and the code sets the wrong alignment.
This patch fixes the byte-address alignment for RRP by following the
definition in the technical reference manual.
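For reference, GENMASK(h, l) produces a mask with bits h..l set, so the
corrected masks clear exactly the LSBs the TRM asks for:

  GENMASK(31, 4) == 0xfffffff0   /* clears four LSBs: 32/64/128-bit memory */
  GENMASK(31, 5) == 0xffffffe0   /* clears five LSBs: 256-bit memory */

For example, a read pointer value of 0x0000127b masked with GENMASK(31, 4)
becomes 0x00001270 (16-byte aligned), whereas the old GENMASK(31, 5) mask
would have rounded it down to 0x00001260.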
Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Cc: Mike Leach <mike.leach(a)linaro.org>
Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
---
drivers/hwtracing/coresight/coresight-tmc-etf.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 0549249..e310613 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -438,10 +438,10 @@ static void tmc_update_etf_buffer(struct coresight_device *csdev,
case TMC_MEM_INTF_WIDTH_32BITS:
case TMC_MEM_INTF_WIDTH_64BITS:
case TMC_MEM_INTF_WIDTH_128BITS:
- mask = GENMASK(31, 5);
+ mask = GENMASK(31, 4);
break;
case TMC_MEM_INTF_WIDTH_256BITS:
- mask = GENMASK(31, 6);
+ mask = GENMASK(31, 5);
break;
}
--
2.7.4
The ETB dump function tmc_etb_dump_hw() has nested loops. The inner
loop iterates an index over the range [0 .. drvdata->memwidth), but the
index isn't actually used in the loop body, so the inner loop is useless.
This patch removes the inner loop; the refactoring also reduces
indentation and lets us replace the 'goto' label with a 'break'.
Cc: Mathieu Poirier <mathieu.poirier(a)linaro.org>
Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
---
drivers/hwtracing/coresight/coresight-tmc-etf.c | 17 +++++++----------
1 file changed, 7 insertions(+), 10 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-tmc-etf.c b/drivers/hwtracing/coresight/coresight-tmc-etf.c
index 9c599c9..8b34161 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etf.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etf.c
@@ -34,23 +34,20 @@ static void tmc_etb_dump_hw(struct tmc_drvdata *drvdata)
{
char *bufp;
u32 read_data, lost;
- int i;
/* Check if the buffer wrapped around. */
lost = readl_relaxed(drvdata->base + TMC_STS) & TMC_STS_FULL;
bufp = drvdata->buf;
drvdata->len = 0;
while (1) {
- for (i = 0; i < drvdata->memwidth; i++) {
- read_data = readl_relaxed(drvdata->base + TMC_RRD);
- if (read_data == 0xFFFFFFFF)
- goto done;
- memcpy(bufp, &read_data, 4);
- bufp += 4;
- drvdata->len += 4;
- }
+ read_data = readl_relaxed(drvdata->base + TMC_RRD);
+ if (read_data == 0xFFFFFFFF)
+ break;
+ memcpy(bufp, &read_data, 4);
+ bufp += 4;
+ drvdata->len += 4;
}
-done:
+
if (lost)
coresight_insert_barrier_packet(drvdata->buf);
return;
--
2.7.4
Coresight uses DT graph bindings to describe the connections of the
components. However we have some undocumented usage of the bindings
to describe some of the properties of the connections.
The coresight driver needs to know the hardware ports involved
in the connection and the direction of data flow to effectively
manage the trace sessions. So far we have relied on the "port"
address (as described by the generic graph bindings) to represent
the hardware port of the component for a connection.
The hardware uses separate numbering schemes for input and output
ports, which implies we could have two different (input and output)
ports with the same port number. This could create problems in the
graph bindings where the label of the port wouldn't match the address.
e.g, with the existing bindings we get :
port@0{ // Output port 0
reg = <0>;
...
};
port@1{
reg = <0>; // Input port 0
endpoint {
slave-mode;
...
};
};
With the new enforcement in the DT rules, mismatches in label and address
are not allowed (as seen in the case of port@1). So, we need a new mechanism
to describe the hardware port number reliably.
Also, we relied on an undocumented "slave-mode" property (see the above
example) to indicate if the port is an input port. Let us formalise and
switch to a new property to describe the direction of data flow.
There were three options considered for the hardware port number scheme:
1) Use natural ordering in the DT to infer the hardware port number.
i.e., mandate that all ports are listed in the DT in ascending
order for each class (input and output respectively).
Pros:
- We don't need new properties, and if the existing DTS files list them
  in order (which most of them do), they work out of the box.
Cons:
- We must list all the ports even if the system cannot/shouldn't use
  them.
- It is prone to human errors (if the order is not kept).
2) Use an explicit property to list both the hw port number and the
direction. Define "coresight,hwid" as a 2-member array of u32, where
the members are the port number and the direction respectively.
e.g
port@0{
reg = <0>;
endpoint {
coresight,hwid = <0 1>; // Port # 0, Output
};
};
port@1{
reg = <1>;
endpoint {
coresight,hwid = <0 0>; // Port # 0, Input
};
};
Pros:
- The bindings are formal.
Cons:
- Not so reader friendly and could potentially lead to human errors.
- Backward compatibility is lost.
3) Use explicit properties (implemented in an earlier version of this
series) for the hardware
port id and direction. We define a new property "coresight,hwid" for
each endpoint in coresight devices to specify the hardware port number
explicitly. Also use a separate property "direction" to specify the
direction of the data flow.
e.g,
port@0{
reg = <0>;
endpoint {
direction = <1>; // Output
coresight,hwid = <0>; // Port # 0
};
};
port@1{
reg = <1>;
endpoint {
direction = <0>; // Input
coresight,hwid = <0>; // Port # 0
};
};
Pros:
- The bindings are formal and reader friendly, and less prone to errors.
Cons:
- Backward compatibility is lost.
After a round of discussions [1], the following option (4) was adopted:
4) Group ports based on the directions under a dedicated node. This has been
checked with the upstream DTC tool to resolve the "address mismatch" issue.
e.g,
out-ports { // Output ports for this component
port@0 { // Outport 0
reg = <0>;
endpoint { ... };
};
port@1 { // Outport 1
reg = <1>;
endpoint { ... };
};
};
in-ports { // Input ports for this component
port@0 { // Inport 0
reg = <0>;
endpoint { ... };
};
port@1 { // Inport 1
reg = <1>;
endpoint { ... };
};
};
This series implements Option (4) listed above and falls back to the old
bindings if the new bindings are not available. This allows systems
with the old bindings to work with the new driver. The driver now issues a
warning (once) when it encounters the old bindings. The series contains the
DT update for the Juno platform. The remaining in-kernel sources could be
updated once we agree on the proposal.
It also cleans up the platform parsing code to reduce the memory usage by
reusing the platform description.
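For reference, a minimal sketch of how a driver could walk the new grouped
bindings with the standard OF helpers is shown below; the 'parse_endpoint'
helper name and its boolean argument are illustrative only, not the actual
code in this series:

	struct device_node *ports, *port;

	/* Output connections live under the "out-ports" container node. */
	ports = of_get_child_by_name(node, "out-ports");
	if (ports) {
		for_each_child_of_node(ports, port)
			parse_endpoint(port, false);	/* is_input = false */
		of_node_put(ports);
	}

	/* Input connections live under the "in-ports" container node. */
	ports = of_get_child_by_name(node, "in-ports");
	if (ports) {
		for_each_child_of_node(ports, port)
			parse_endpoint(port, true);	/* is_input = true */
		of_node_put(ports);
	}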
Applies on coresight/next
Changes since V2:
- Clean up of_coresight_parse_endpoint() to return 1 to indicate a
  connection record was updated.
- Drop documentation for old bindings
Changes since V1:
- Implement the proposal by Rob.
- Drop the DTS updates for all platforms except Juno
- Drop the incorrect fix in coresight_register. Instead document the code
to prevent people trying to un-fix it again.
- Add a patch to drop remote device references in DT graph parsing
- Split of_node refcount fixing patch, fix a typo in the comment.
- Add Reviewed-by tags from Mathieu.
- Drop patches picked up for 4.18-rc series
Changes since RFC:
- Fixed style issues
- Fix an existing memory leak coresight_register (Found in code update)
- Fix missing of_node_put() in the existing driver (Reported-by Mathieu)
- Update the existing dts in kernel tree.
Suzuki K Poulose (9):
coresight: Document error handling in coresight_register
coresight: platform: Refactor graph endpoint parsing
coresight: platform: Fix refcounting for graph nodes
coresight: platform: Fix leaking device reference
coresight: Fix remote endpoint parsing
coresight: Add helper to check if the endpoint is input
coresight: platform: Cleanup coresight connection handling
coresight: Cleanup coresight DT bindings
dts: juno: Update coresight bindings
.../devicetree/bindings/arm/coresight.txt | 95 +++++---
arch/arm64/boot/dts/arm/juno-base.dtsi | 161 ++++++------
arch/arm64/boot/dts/arm/juno-cs-r1r2.dtsi | 52 ++--
arch/arm64/boot/dts/arm/juno.dts | 13 +-
drivers/hwtracing/coresight/coresight.c | 35 +--
drivers/hwtracing/coresight/of_coresight.c | 269 ++++++++++++++-------
include/linux/coresight.h | 9 +-
7 files changed, 359 insertions(+), 275 deletions(-)
--
2.7.4
On Wed, 8 Aug 2018 at 01:59, Tomasz Nowicki <tnowicki(a)caviumnetworks.com> wrote:
>
> Hi Mathieu,
>
> It's been a while but I am back to Coresight.
>
> Let me remind my setup and the issue I am struggling with now.
>
> Kernel baseline:
> https://github.com/Linaro/perf-opencsd (perf-opencsd-v4.16)
> OpenCSD:
> https://github.com/Linaro/OpenCSD.git (master)
>
> The simplest Coresight components path I used as a start point:
> ETMv4.1 -> TDR -> FUNNEL -> ETF
>
> As I mentioned TDR is built by Cavium and it was added to aggregate 128
> inputs into one output rather than cascading funnels. TDR has its own
> driver just to keep path connected in Linux Coresight framework.
>
> Here is how I catch some trace data:
> sudo perf record -C 0 -e cs_etm/@etf0/ --per-thread test_app
The above command line tells perf to trace everything that is
happening on CPU0 for as long as "test_app" is executing. In this
case the "--per-thread" option is ignored. This is called a CPU-wide
trace scenario and is currently not supported for CS (I am currently
working on it).
If you want to make sure "test_app" executes on CPU0 and that you
trace just that you will need to use the "taskset" utility:
sudo perf record -e cs_etm/@etf0/ --per-thread taskset 0x1 test_app
An alternative to the above would be to CPU-hotplug out CPU128-255
while you are testing.
Let's start with that before going further.
Thanks,
Mathieu
>
> I need to use -C because my machine has 2 nodes, 32 cores (128 threads)
> each, and each node has a different ETF. So I have to specify which CPU is
> the source for the specified ETF sink (ETF0 can be a sink for
> CPU0-CPU127, ETF1 can be a sink for CPU128-CPU255). Otherwise Linux
> cannot find a path for the ETMs related to CPU128-CPU255 if I specify
> ETF0 as a sink.
>
> Overall, I can see some data using:
> # sudo perf report --stdio --dump
> [...]
> . ... CoreSight ETM Trace data: size 16384 bytes
> Frame deformatter: Found 4 FSYNCS
> ID:12 RESET operation on trace decode path
> Idx:108; ID:12; I_NOT_SYNC : I Stream not synchronised
> Idx:455; ID:12; I_ASYNC : Alignment Synchronisation.
> Idx:468; ID:12; I_TRACE_INFO : Trace Info.; INFO=0x0
> Idx:470; ID:12; I_TRACE_ON : Trace On.
> Idx:471; ID:12; I_CTXT : Context Packet.; Ctxt: AArch64,EL0, NS;
> Idx:473; ID:12; I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.;
> Addr=0x0000AAABE0B09584;
> Idx:483; ID:12; I_ATOM_F1 : Atom format 1.; N
> Idx:484; ID:12; I_TIMESTAMP : Timestamp.; Updated val =
> 0x1b6a5d937cc1
> Idx:492; ID:12; I_ATOM_F3 : Atom format 3.; NNE
> Idx:493; ID:12; I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.;
> Addr=0x0000AAABE0B0D210;
> Idx:504; ID:12; I_ATOM_F3 : Atom format 3.; NEE
> Idx:505; ID:12; I_ATOM_F3 : Atom format 3.; NEN
> Idx:506; ID:12; I_ATOM_F6 : Atom format 6.; EEEN
> Idx:507; ID:12; I_ATOM_F3 : Atom format 3.; NNE
> Idx:508; ID:12; I_ATOM_F1 : Atom format 1.; N
> Idx:509; ID:12; I_ATOM_F3 : Atom format 3.; NNN
> Idx:510; ID:12; I_ATOM_F3 : Atom format 3.; EEN
> Idx:512; ID:12; I_ATOM_F1 : Atom format 1.; E
> [...]
>
> However, I still see errors while using:
> # sudo perf report --stdio
> 0x1e8 [0x60]: failed to process type: 1
> Error:
> failed to process sample
> # To display the perf.data header info, please use
> --header/--header-only options.
>
> The reason is that cs_etm__process_event() is failing on:
> if (!etm->timeless_decoding)
> return -EINVAL;
>
> and etm->timeless_decoding is set up in cs_etm__is_timeless_decoding().
> For some events the time bit is set, and so far I have failed to figure
> out what is going on. Have you met a similar issue so far? Any pointers
> or hints are very much appreciated.
>
> One more comment below.
>
> On 10.01.2018 21:10, Mathieu Poirier wrote:
> > On 10 January 2018 at 06:57, Tomasz Nowicki <tnowicki(a)caviumnetworks.com> wrote:
> >> Hello Mathieu,
> >>
> >> Thank you for your response. Please see comments below.
> >>
> >> On 08.01.2018 17:53, Mathieu Poirier wrote:
> >>>
> >>> Good day Tomasz,
> >>>
> >>>
> >>> On 5 January 2018 at 05:51, tn <Tomasz.Nowicki(a)caviumnetworks.com> wrote:
> >>>>
> >>>> Hi Mathieu,
> >>>>
> >>>> I am bringing up Coresight functionality on ThunderX2. While
> >>>> ramping up I
> >>>> come across your Connect session:
> >>>>
> >>>> which I found very helpful.
> >>>
> >>>
> >>> Perfect - a few things have changed since then, see below.
> >>>
> >>>>
> >>>> During my research I had to create new Coresight component driver for
> >>>> Linux,
> >>>> here is the story. For ThunderX2, we aggregate data trace from all 128
> >>>> ETMs
> >>>> into one funnel inport using so called TDR (Trace Data Ring) component.
> >>>> This
> >>>> should be transparent to software and does not require configuration at
> >>>> all.
> >>>> However, the Linux Coresight framework requires components to be
> >>>> connected to each other, so we cannot leave the funnel and ETMs
> >>>> disconnected in the DT. I decided to create a pure software component,
> >>>> i.e. the TDR, which is meant to connect the chain only, with no
> >>>> actions on registers.
> >>>
> >>>
> >>> Is this TDR an ARM IP or built in-house by Cavium?
> >>
> >>
> >> This is Cavium specific component which I am going to upstream once I test
> >> the whole functionality.
> >>
> >> And I suppose it
> >>>
> >>> was added there to aggregate 128 input into one output rather than
> >>> cascading funnels?
> >>
> >>
> >> Correct.
> >>
> >>>>
> >>>> Now I am able to enable ETF sink and path from ETM via TDR via FUNNEL up
> >>>> to
> >>>> ETF and gather some data. To be sure things work properly I want to
> >>>> decode
> >>>> data using Linaro OpenCSD library following instructions from here:
> >>>>
> >>>> https://community.arm.com/tools/b/blog/posts/do-a-coresight-trace-on-linux-…
> >>>
> >>>
> >>> Thanks for pointing this out, I didn't know about it.
> >>>
> >>>> but I still got an error at the 'perf report' step. The kernel perf
> >>>> tool support
> >>>> for OpenCSD is out of tree for now, so I may be missing some patches.
> >>>
> >>>
> >>> Can you get me a pastebin of the errors you're getting?
> >>
> >>
> >> Sure, see:
> >> https://pastebin.com/6YDq8KfC
> >> As you see there is not much info about error cause.
> >>
> >>>
> >>>>
> >>>> Here is my setup:
> >>>> https://github.com/Linaro/perf-opencsd/commits/upstream-v1 (+ ThunderX2
> >>>> specific patches)
> >>>
> >>>
> >>> Oh boy... I wasn't expecting people to use that but I suppose it is
> >>> the right thing to do. Keep going with that code.
> >>>
> >>>> https://github.com/Linaro/OpenCSD/commits/master
> >>>
> >>>
> >>> This, in combination with the upstream-v1 branch should work properly.
> >>> That's how I test things on my Juno and Dragon board.
> >>>
> >>>>
> >>>> # echo 1 > etf0/enable_sink
> >>>> # perf record -C 0 -e cs_etm// sleep 2
> >>>
> >>>
> >>> Ok, that won't work as the -C option is currently not supported (I am
> >>> working on it). I also suggest to make sure you have the very latest
> >>> TIP [1] on branch [2] and to carefully read the README.md. We
> >>> recently updated the instructions to fit the newest development.
> >>> Lastly we have deprecated enabling the sink from the sysFS interface -
> >>> it can still work but no guarantees are provided. It is better to
> >>> specify the sink as part of the perf record command line, as shown in
> >>> the most recent HOWTO.md.
> >>
> >>
> >> I am able to specify sink as part of the perf record command line only for
> >> Linux Perf master branch:
> >> https://github.com/Linaro/perf-opencsd/commits/master
> >>
> >> For upstream-v1 branch I am getting:
> >> $ perf record -vvv -e cs_etm/@etf0/ --per-thread uname
> >> Using CPUID 0x00000000420f5160
> >> perf: util/evsel.c:783: apply_config_terms: Assertion `!(1)' failed.
> >> Aborted (core dumped)
> >
> >
> > Ok, I've uploaded upstream-v2. With that branch everything works fine
> > on my side, no changes needed. I added a fix for a regression in the
> > perf tip tree and the code required to use the ETR from the perf
> > interface.
> >
> > One thing about the above: "@etf0". Is this really the name you gave
> > to the device in the DT? Look under /sys/bus/coresight/devices/ for
> > an etf entry. What is listed there should be the name of the ETF as
> > it is known to the system.
>
> Indeed, the name is different, but for clarity in the perf command I use a shortcut.
>
> Thanks,
> Tomasz
+CoreSight ML and Mathieu
---------- Forwarded message ----------
From: Mike Leach <mike.leach(a)linaro.org>
Date: 3 September 2018 at 17:39
Subject: Re: Failed for ETM decoding with db410c snapshot mode
To: Leo Yan <leo.yan(a)linaro.org>
Hi Leo,
Short summary - there is a problem with the trace collected - not the
decoder. See below for details
On 3 September 2018 at 08:06, <leo.yan(a)linaro.org> wrote:
> Hi Mike, Mathieu,
>
> [ + CoreSight ML ]
>
> While working on the CoreSight + perf tool, I used the crash extension
> program to extract the tracing data from the perf aux buffer; finally I
> can get about 1.6MB of trace data from the ETF sink on the DB410c board.
>
> To verify the extracted trace data, I used 'snapshot' mode under the
> OpenCSD code base; you can see the tar file for this [1]. After
> you download this file, you can place it under the OpenCSD folder:
>
> $ cp db410c_snapshot_kdump.tgz my_opencsd/decoder/tests/snapshots
> $ cd my_opencsd/decoder/tests/snapshots
> $ tar zxvf db410c_snapshot_kdump.tgz
> $ cd db410c_snapshot_kdump
>
> $ ../../bin/builddir/trc_pkt_lister
This will print raw trace packets as it finds them without attempting
any sort of interpretation.
> $ ../../bin/builddir/trc_pkt_lister -decode
This will try to decode the raw trace packets into a sequence of
instructions executed (alongside the raw packets)
This is where the packets are being flagged as incorrect.
>
> If I use the command 'trc_pkt_lister' without any extra options, it
> can print out trace packets successfully; but if I add the extra
> option '-decode' it uses 'decode all' mode and it reports the errors as:
>
> 483710 Idx:53086; ID:10; [0xf8 ]; I_ATOM_F3 : Atom format 3.; NNN
> 483711 Idx:53086; ID:10; OCSD_GEN_TRC_ELEM_ADDR_NACC( 0xffff000008abc9f0 )
> 483712 Idx:53088; ID:10; [0xdb ]; I_ATOM_F2 : Atom format 2.; EE
> 483713 Idx:53194; ID:10; [0x6b 0x8c 0x08 0xfa 0xdc 0x95 0x5c ]; I_COND_RES_F1 : Conditional Result, format 1.
This is a conditional result trace packet - however as far as I am
aware the trace unit on an A53 (i.e. DB410 core) cannot produce these.
Additionally in the entire file I see 2 I_COND packets and 1
I_NUM_DS_MKR - a data synchronisation marker packet.
Now Data sync can only ever occur if data trace is supported and
enabled. Data trace is architecturally prohibited for A class v8 cores
(and unimplemented on most A class v7 cores).
If there were tracing of conditional elements occurring, and it were
enabled, then the packets should match up - a cond instruction should
match with one cond result element.
But in the end - even without these inconsistencies - the TRACE_INFO
element at the top of the listing tells me that conditional
instruction trace is disabled.
Thus you are seeing what I believe is the effect of concatenating
trace data buffers together (you mention you have 1.6MB of data from
the ETF - which is not that large), without inserting barrier packets
in between.
The decoder cannot spot the boundaries, so it will carry on out of sync
and can misread trace packet payload data as header data, which will
throw off the decode process.
When I look at the raw byte data I am seeing this at the top of the listing:-
Frame Data; Index 0; ID_DATA[????]; ff
Frame Data; Index 0; ID_DATA[0x7f]; 7f ff 7f ff 7f ff
This does not look valid at all to me.
> 483714 DCD_ETMV4_0016 : 0x0018 (OCSD_ERR_BAD_DECODE_PKT) [Reserved or unknown packet in decoder.]; Unsupported packet type.Trace Packet Lister : Data Path fatal error
> 483715 0x0018 (OCSD_ERR_BAD_DECODE_PKT) [Reserved or unknown packet in decoder.]; Unsupported packet type.Trace Packet Lister : Trace buffer done, processed 53216 bytes.
>
> You can also check the detailed log trc_pkt_lister.ppl in the shared
> tar package; after searching the OpenCSD code, I found this error is
> because it cannot support some types of packets [2].
>
> So I want to check what's the best approach for this issue; it seems to
> me we need to fix this so that the decoding can complete?
>
The reason we have not implemented support for these packets is that
we have never seen an implementation that generates them.
regards
Mike
> Thanks in advance for any suggestions.
> Leo Yan
>
> [1] http://people.linaro.org/~leo.yan/opencsd_db410c/db410c_snapshot_kdump.tgz
> [2] https://github.com/Linaro/OpenCSD/blob/master/decoder/source/etmv4/trc_pkt_…
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK