Hi, this is an incomplete patch for an issue with EL2 kernels, and I'm looking for feedback on how to complete it.
The background is that to support tracing multiple address spaces we get ETM to embed the context id in the trace, and we build with CONFIG_PID_IN_CONTEXTIDR to get the scheduler to put the thread id in CONTEXTIDR_EL1. This is a known technique, it's what context id tracing is designed for.
The problem is when the kernel is running not at EL1 (OS level) but at EL2 (hypervisor level), which is now becoming common. With HCR_EL2.E2H set, the kernel's writes to CONTEXTIDR_EL1 actually change a different physical register, CONTEXTIDR_EL2. However, ETM still traces CONTEXTIDR_EL1. So the context ids in the trace are zero, and trace cannot be reconstructed.
ETM 4.1 has an option VMIDOPT to cause CONTEXTIDR_EL2 to be output in trace, in the VMID field replacing the value of VTTBR.VMID. So we can use that, but the trace follower, collecting events from OpenCSD, needs to be aware it needs to check the VMID field not the CID field. OpenCSD doesn't need to change but perf does. TRCCONFIGR is already in the metadata, so perf consumers can check it to see what's going on.
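To make the bit-level picture concrete, here is a minimal, self-contained sketch of the selection described above. It is not the driver code: the TRCCONFIGR bit positions (6, 7, 15) and the HCR_EL2.E2H position (bit 34) are taken from the patch below, while the helper names and the `kernel_at_el2` parameter are hypothetical stand-ins for what the kernel would determine itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* TRCCONFIGR bit positions, as used in the patch. */
#define CFG_BIT_CTXTID   6   /* trace CONTEXTIDR_EL1 in the CID field */
#define CFG_BIT_VMID     7   /* enable the VMID field in trace */
#define CFG_BIT_VMIDOPT  15  /* VMID field carries CONTEXTIDR_EL2 */

#define HCR_EL2_E2H      (UINT64_C(1) << 34)

static_assert(CFG_BIT_VMIDOPT < 32, "TRCCONFIGR is a 32-bit register");

/*
 * Hypothetical helper: choose the TRCCONFIGR bits that make the thread id
 * visible in trace. kernel_at_el2 and hcr_el2 are passed in here; a real
 * driver would have to query them itself.
 */
static uint32_t ctxtid_config_bits(bool kernel_at_el2, uint64_t hcr_el2)
{
	if (kernel_at_el2 && (hcr_el2 & HCR_EL2_E2H))
		return (UINT32_C(1) << CFG_BIT_VMID) |
		       (UINT32_C(1) << CFG_BIT_VMIDOPT);
	return UINT32_C(1) << CFG_BIT_CTXTID;
}

/* Consumer side: the thread id is in the VMID field only if both bits are set. */
static bool thread_id_in_vmid(uint32_t configr)
{
	const uint32_t mask = (UINT32_C(1) << CFG_BIT_VMID) |
			      (UINT32_C(1) << CFG_BIT_VMIDOPT);

	return (configr & mask) == mask;
}
```

With these definitions, an EL2 kernel with E2H set yields 0x8080 and everything else yields 0x40, which is the same mask the perf side of the patch tests for.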
The patch below does the kernel and userspace side but is not complete. The problem is that userspace perf creates the metadata copy of TRCCONFIGR based on its request (and fills in the other id registers by reading sysfs), but the detection of EL2/E2H happens in the kernel which adjusts TRCCONFIGR, and it's this config which is needed for decode. I see three ways round this:
- have userspace test to see if the kernel is EL2 (somehow) and adjust the metadata to mirror what the kernel is doing
- have the kernel pass the adjusted TRCCONFIGR back so perf can put it in the metadata
- have the perf decoder get the thread id from whichever of VMID and CONTEXTID is available in a PE_CONTEXT element
Obviously, the last is simplest, but it's a bodge, and means that OpenCSD will see VMIDs when its TRCCONFIGR says it won't. It's kind of cleanest to get the real TRCCONFIGR somehow, but how do we do that?
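For reference, the third option could look roughly like the sketch below. The struct is a hypothetical stand-in for the context part of a PE_CONTEXT element (the field names mirror those used in the patch), and `tid_from_any_field` is an illustrative name, not an existing perf function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the context fields of a PE_CONTEXT element. */
struct pe_context {
	bool ctxt_id_valid;
	uint32_t context_id;
	bool vmid_valid;
	uint32_t vmid;
};

/*
 * Option 3: take the thread id from whichever field is present, without
 * consulting TRCCONFIGR. Returns false if the element carries neither.
 */
static bool tid_from_any_field(const struct pe_context *ctx, uint32_t *tid)
{
	assert(ctx != NULL && tid != NULL);

	if (ctx->ctxt_id_valid) {
		*tid = ctx->context_id;
		return true;
	}
	if (ctx->vmid_valid) {
		*tid = ctx->vmid;
		return true;
	}
	return false;
}
```

Note that this sketch has to pick an arbitrary preference when both fields are valid, which is part of why guessing from the element alone is a bodge.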
Al
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c
index a128b5063f46..96488a0cfdcf 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x.c
@@ -353,8 +353,32 @@ static int etm4_parse_event_config(struct etmv4_drvdata *drvdata,
 	}
 
 	if (attr->config & BIT(ETM_OPT_CTXTID))
-		/* bit[6], Context ID tracing bit */
-		config->cfg |= BIT(ETM4_CFG_BIT_CTXTID);
+	{
+		/*
+		 * Enable context-id tracing. The assumption is that this
+		 * will work with CONFIG_PID_IN_CONTEXTIDR to trace process
+		 * id changes and support decode of multiple processes.
+		 * But ETM's context id trace traces physical CONTEXTIDR_EL1,
+		 * while the logical CONTEXTIDR_EL1 that is written to on
+		 * process switch is either physical CONTEXTIDR_EL1 or
+		 * CONTEXTIDR_EL2 depending on HCR_EL2.E2H. On principle
+		 * we should continue to use logical CONTEXTIDR_EL1.
+		 * In order to trace physical CONTEXTIDR_EL2, we need to
+		 * enable VMID tracing and use the VMIDOPT flag to trace
+		 * CONTEXTIDR_EL2 rather than VTTBR.VMID in the VMID field.
+		 * Trace decoders will need to inspect TRCCONFIGR and use
+		 * either the CID or the VMID field from the trace packet.
+		 */
+		if (!(is_kernel_in_hyp_mode() &&
+		      (read_sysreg(hcr_el2) & BIT(34)) != 0)) {
+			/* bit[6], Context ID tracing bit */
+			config->cfg |= BIT(ETM4_CFG_BIT_CTXTID);
+		} else {
+			/* bits[7,15], trace CONTEXTID_EL2 in VMID field */
+			config->cfg |= (BIT(ETM4_CFG_BIT_VMID) |
+					BIT(ETM4_CFG_BIT_VMIDOPT));
+		}
+	}
 
 	/* return stack - enable if selected and supported */
 	if ((attr->config & BIT(ETM_OPT_RETSTK)) && drvdata->retstack)
diff --git a/include/linux/coresight-pmu.h b/include/linux/coresight-pmu.h
index b0e35eec6499..c2f47b25daab 100644
--- a/include/linux/coresight-pmu.h
+++ b/include/linux/coresight-pmu.h
@@ -19,8 +19,10 @@
 /* ETMv4 CONFIGR programming bits for the ETM OPTs */
 #define ETM4_CFG_BIT_CYCACC	4
 #define ETM4_CFG_BIT_CTXTID	6
+#define ETM4_CFG_BIT_VMID	7
 #define ETM4_CFG_BIT_TS		11
 #define ETM4_CFG_BIT_RETSTK	12
+#define ETM4_CFG_BIT_VMIDOPT	15
 
 static inline int coresight_get_trace_id(int cpu)
 {
diff --git a/tools/include/linux/coresight-pmu.h b/tools/include/linux/coresight-pmu.h
index b0e35eec6499..c2f47b25daab 100644
--- a/tools/include/linux/coresight-pmu.h
+++ b/tools/include/linux/coresight-pmu.h
@@ -19,8 +19,10 @@
 /* ETMv4 CONFIGR programming bits for the ETM OPTs */
 #define ETM4_CFG_BIT_CYCACC	4
 #define ETM4_CFG_BIT_CTXTID	6
+#define ETM4_CFG_BIT_VMID	7
 #define ETM4_CFG_BIT_TS		11
 #define ETM4_CFG_BIT_RETSTK	12
+#define ETM4_CFG_BIT_VMIDOPT	15
 
 static inline int coresight_get_trace_id(int cpu)
 {
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
index cd92a99eb89d..a54cad778841 100644
--- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
@@ -35,6 +35,7 @@ struct cs_etm_decoder {
 	dcd_tree_handle_t dcd_tree;
 	cs_etm_mem_cb_type mem_access;
 	ocsd_datapath_resp_t prev_return;
+	uint32 thread_id_in_vmid:1;
 };
 
 static u32
@@ -496,17 +497,24 @@ cs_etm_decoder__buffer_exception_ret(struct cs_etm_packet_queue *queue,
 
 static ocsd_datapath_resp_t
 cs_etm_decoder__set_tid(struct cs_etm_queue *etmq,
+			struct cs_etm_decoder *decoder,
 			struct cs_etm_packet_queue *packet_queue,
 			const ocsd_generic_trace_elem *elem,
 			const uint8_t trace_chan_id)
 {
 	pid_t tid;
 
-	/* Ignore PE_CONTEXT packets that don't have a valid contextID */
-	if (!elem->context.ctxt_id_valid)
-		return OCSD_RESP_CONT;
+	if (!decoder->thread_id_in_vmid) {
+		/* Ignore PE_CONTEXT packets that don't have a valid contextID */
+		if (!elem->context.ctxt_id_valid)
+			return OCSD_RESP_CONT;
+		tid = elem->context.context_id;
+	} else {
+		if (!elem->context.vmid_valid)
+			return OCSD_RESP_CONT;
+		tid = elem->context.vmid;
+	}
 
-	tid = elem->context.context_id;
 	if (cs_etm__etmq_set_tid(etmq, tid, trace_chan_id))
 		return OCSD_RESP_FATAL_SYS_ERR;
 
@@ -561,7 +569,7 @@ static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
 						trace_chan_id);
 		break;
 	case OCSD_GEN_TRC_ELEM_PE_CONTEXT:
-		resp = cs_etm_decoder__set_tid(etmq, packet_queue,
+		resp = cs_etm_decoder__set_tid(etmq, decoder, packet_queue,
 					       elem, trace_chan_id);
 		break;
 	case OCSD_GEN_TRC_ELEM_ADDR_NACC:
@@ -595,11 +603,15 @@ static int cs_etm_decoder__create_etm_packet_decoder(
 					OCSD_BUILTIN_DCD_ETMV3 :
 						OCSD_BUILTIN_DCD_PTM;
 		trace_config = &config_etmv3;
+		decoder->thread_id_in_vmid = 0;
 		break;
 	case CS_ETM_PROTO_ETMV4i:
 		cs_etm_decoder__gen_etmv4_config(t_params, &trace_config_etmv4);
 		decoder_name = OCSD_BUILTIN_DCD_ETMV4I;
 		trace_config = &trace_config_etmv4;
+		/* If VMID and VMIDOPT are set, thread id is in VMID not CID */
+		decoder->thread_id_in_vmid =
+			((trace_config_etmv4.reg.configr & 0x8080) == 0x8080);
 		break;
 	default:
 		return -1;

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Hi Al,
On Wed, 22 Apr 2020 at 22:33, Al Grant <Al.Grant@arm.com> wrote:
> Hi, this is an incomplete patch for an issue with EL2 kernels, and I'm looking for feedback on how to complete it.
> The background is that to support tracing multiple address spaces we get ETM to embed the context id in the trace, and we build with CONFIG_PID_IN_CONTEXTIDR to get the scheduler to put the thread id in CONTEXTIDR_EL1. This is a known technique, it's what context id tracing is designed for.
> The problem is when the kernel is running not at EL1 (OS level) but at EL2 (hypervisor level), which is now becoming common. With HCR_EL2.E2H set, the kernel's writes to CONTEXTIDR_EL1 actually change a different physical register, CONTEXTIDR_EL2. However, ETM still traces CONTEXTIDR_EL1. So the context ids in the trace are zero, and trace cannot be reconstructed.
> ETM 4.1 has an option VMIDOPT to cause CONTEXTIDR_EL2 to be output in trace, in the VMID field replacing the value of VTTBR.VMID. So we can use that, but the trace follower, collecting events from OpenCSD, needs to be aware it needs to check the VMID field not the CID field. OpenCSD doesn't need to change but perf does. TRCCONFIGR is already in the metadata, so perf consumers can check it to see what's going on.
> The patch below does the kernel and userspace side but is not complete. The problem is that userspace perf creates the metadata copy of TRCCONFIGR based on its request (and fills in the other id registers by reading sysfs), but the detection of EL2/E2H happens in the kernel which adjusts TRCCONFIGR, and it's this config which is needed for decode. I see three ways round this:
> - have userspace test to see if the kernel is EL2 (somehow) and adjust the
> metadata to mirror what the kernel is doing
> - have the kernel pass the adjusted TRCCONFIGR back so perf can put it in the
> metadata
Not necessarily the entire configr but at least a flag to indicate VMID contexts are in play.
> - have the perf decoder get the thread id from whichever of VMID and
> CONTEXTID is available in a PE_CONTEXT element
> Obviously, the last is simplest, but it's a bodge, and means that OpenCSD will see VMIDs when its TRCCONFIGR says it won't. It's kind of cleanest to get
OpenCSD won't care. VMID/CID are pass-through for the decoder - it's up to the client to use them or not. We use the trace CONFIGR value to spot things like return stack. The ETMv4 protocol tells us if VMID / CID is present and we just believe that. In general, the decoder doesn't validate what it gets in trace against what is set up in the config / id registers when the content is self-describing. (We do use the architecture version and D / I settings to do a broad-brush rejection of certain packet types, but that is about it.)
> the real TRCCONFIGR somehow, but how do we do that?
As I mentioned on the other thread, I think this problem ties in with the sink information issue - both want to get kernel selections into the AUX DATA headers or something similar. A common approach will be of benefit here.
Regards
Mike
_______________________________________________
CoreSight mailing list
CoreSight@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/coresight
Hi Al, Mike,
On Thu, Apr 23, 2020 at 03:26:06PM +0100, Mike Leach wrote:
[...]
> > The patch below does the kernel and userspace side but is not complete. The problem is that userspace perf creates the metadata copy of TRCCONFIGR based on its request (and fills in the other id registers by reading sysfs), but the detection of EL2/E2H happens in the kernel which adjusts TRCCONFIGR, and it's this config which is needed for decode. I see three ways round this:
> > - have userspace test to see if the kernel is EL2 (somehow) and adjust the
> > metadata to mirror what the kernel is doing
> > - have the kernel pass the adjusted TRCCONFIGR back so perf can put it in the
> > metadata
> Not necessarily the entire configr but at least a flag to indicate VMID contexts are in play.
> > - have the perf decoder get the thread id from whichever of VMID and
> > CONTEXTID is available in a PE_CONTEXT element
> > Obviously, the last is simplest, but it's a bodge, and means that OpenCSD will see VMIDs when its TRCCONFIGR says it won't. It's kind of cleanest to get
> OpenCSD won't care. VMID/CID are pass through for the decoder - it's up to the client to use them or not. We use the trace CONFIGR value to spot things like return stack. The ETMv4 protocol tells us if VMID / CID is present and just believe that. The decode doesn't validate what it gets in trace from what is set up in the config / id registers in general when the content self describes. (we do use the architecture version and D / I settings to do a broad brush rejection of certain packet types, but that is about it).
> > the real TRCCONFIGR somehow, but how do we do that?
> As I mentioned on the other thread, I think this problem ties in with the sink information issue - both want to get kernel selections into the AUX DATA headers or something similar. A common approach will be of benefit here.
I did a quick check and found that OpenCSD's header defines a field to indicate the exception level, so I'm curious whether we can simply check the exception level in OpenCSD's element: if the exception level is EL2, then the perf tool uses VMID rather than CID for the process ID.
If this is feasible, then we no longer depend on TRCCONFIGR here. Does this make sense? Sorry if this introduces noise.
Thanks, Leo
> I did a quick check and found that OpenCSD's header defines a field to indicate the exception level, so I'm curious whether we can simply check the exception level in OpenCSD's element: if the exception level is EL2, then the perf tool uses VMID rather than CID for the process ID.
> If this is feasible, then we no longer depend on TRCCONFIGR here. Does this make sense? Sorry if this introduces noise.
It would be nice if it worked, but the exception level shows what state the CPU was in when generating trace. So if we're tracing EL0, the elements we get from OpenCSD will say EL0. We might not have permissions to see beyond EL0 but we still want to see thread switches. We need to distinguish EL0 contexts where the thread id was placed in VMID (because the kernel is at EL2+VHE) from EL0 contexts where the thread id is in CID.
Al
On 04/22/2020 10:33 PM, Al Grant wrote:
> Hi, this is an incomplete patch for an issue with EL2 kernels, and I'm looking for feedback on how to complete it.
> The background is that to support tracing multiple address spaces we get ETM to embed the context id in the trace, and we build with CONFIG_PID_IN_CONTEXTIDR to get the scheduler to put the thread id in CONTEXTIDR_EL1. This is a known technique, it's what context id tracing is designed for.
> The problem is when the kernel is running not at EL1 (OS level) but at EL2 (hypervisor level), which is now becoming common. With HCR_EL2.E2H set, the kernel's writes to CONTEXTIDR_EL1 actually change a different physical register, CONTEXTIDR_EL2. However, ETM still traces CONTEXTIDR_EL1. So the context ids in the trace are zero, and trace cannot be reconstructed.
> ETM 4.1 has an option VMIDOPT to cause CONTEXTIDR_EL2 to be output in trace, in the VMID field replacing the value of VTTBR.VMID. So we can use that, but the trace follower, collecting events from OpenCSD, needs to be aware it needs to check the VMID field not the CID field. OpenCSD doesn't need to change but perf does. TRCCONFIGR is already in the metadata, so perf consumers can check it to see what's going on.
> The patch below does the kernel and userspace side but is not complete. The problem is that userspace perf creates the metadata copy of TRCCONFIGR based on its request (and fills in the other id registers by reading sysfs), but the detection of EL2/E2H happens in the kernel which adjusts TRCCONFIGR, and it's this config which is needed for decode. I see three ways round this:
> - have userspace test to see if the kernel is EL2 (somehow) and adjust the
> metadata to mirror what the kernel is doing
> - have the kernel pass the adjusted TRCCONFIGR back so perf can put it in the
> metadata
> - have the perf decoder get the thread id from whichever of VMID and
> CONTEXTID is available in a PE_CONTEXT element
> Obviously, the last is simplest, but it's a bodge, and means that OpenCSD will see VMIDs when its TRCCONFIGR says it won't. It's kind of cleanest to get the real TRCCONFIGR somehow, but how do we do that?
We do get TRCCONFIGR in the perf records. We should simply make sure we get the up-to-date value (wherever we are getting it from).
> Al
> diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c
> index a128b5063f46..96488a0cfdcf 100644
> --- a/drivers/hwtracing/coresight/coresight-etm4x.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x.c
> @@ -353,8 +353,32 @@ static int etm4_parse_event_config(struct etmv4_drvdata *drvdata,
>  	}
>
>  	if (attr->config & BIT(ETM_OPT_CTXTID))
> -		/* bit[6], Context ID tracing bit */
> -		config->cfg |= BIT(ETM4_CFG_BIT_CTXTID);
> +	{
> +		/*
> +		 * Enable context-id tracing. The assumption is that this
> +		 * will work with CONFIG_PID_IN_CONTEXTIDR to trace process
> +		 * id changes and support decode of multiple processes.
> +		 * But ETM's context id trace traces physical CONTEXTIDR_EL1,
> +		 * while the logical CONTEXTIDR_EL1 that is written to on
> +		 * process switch is either physical CONTEXTIDR_EL1 or
> +		 * CONTEXTIDR_EL2 depending on HCR_EL2.E2H. On principle
> +		 * we should continue to use logical CONTEXTIDR_EL1.
> +		 * In order to trace physical CONTEXTIDR_EL2, we need to
> +		 * enable VMID tracing and use the VMIDOPT flag to trace
> +		 * CONTEXTIDR_EL2 rather than VTTBR.VMID in the VMID field.
> +		 * Trace decoders will need to inspect TRCCONFIGR and use
> +		 * either the CID or the VMID field from the trace packet.
> +		 */
> +		if (!(is_kernel_in_hyp_mode() &&
> +		      (read_sysreg(hcr_el2) & BIT(34)) != 0)) {
The patch looks good to me. We should be careful about the register access, as we assume above that the kernel is running in AArch64 mode, which may not necessarily be true.
Suzuki
> > The patch below does the kernel and userspace side but is not complete. The problem is that userspace perf creates the metadata copy of TRCCONFIGR based on its request (and fills in the other id registers by reading sysfs), but the detection of EL2/E2H happens in the kernel which adjusts TRCCONFIGR, and it's this config which is needed for decode. I
> > see three ways round this:
> > - have userspace test to see if the kernel is EL2 (somehow) and adjust
> > the metadata to mirror what the kernel is doing
> > - have the kernel pass the adjusted TRCCONFIGR back so perf can put it
> > in the metadata
> > - have the perf decoder get the thread id from whichever of VMID and
> > CONTEXTID is available in a PE_CONTEXT element
> > Obviously, the last is simplest, but it's a bodge, and means that OpenCSD will see VMIDs when its TRCCONFIGR says it won't. It's kind of cleanest to get the real TRCCONFIGR somehow, but how do we do that?
> We do get TRCCONFIGR in the perf records. We should simply make sure we get the up-to-date value (wherever we are getting it from).
The copy in PERF_RECORD_AUXINFO (which is a synthetic record created by userspace perf) is, I believe, as I said:
"userspace perf creates the metadata copy of TRCCONFIGR based on its request".
So if the kernel modifies it based on information only the kernel knows, there's no current way to get the actual value. That was what I was trying to address with my suggestions.
Have I missed some place the actual TRCCONFIGR is already being returned in other perf records?
Al
> > > The patch below does the kernel and userspace side but is not complete. The problem is that userspace perf creates the metadata copy of TRCCONFIGR based on its request (and fills in the other id registers by reading sysfs), but the detection of EL2/E2H happens in the kernel which adjusts TRCCONFIGR, and it's this config which is needed for decode. I
> > > see three ways round this:
> > > - have userspace test to see if the kernel is EL2 (somehow) and
> > > adjust the metadata to mirror what the kernel is doing
> > > - have the kernel pass the adjusted TRCCONFIGR back so perf can put
> > > it in the metadata
> > > - have the perf decoder get the thread id from whichever of VMID and
> > > CONTEXTID is available in a PE_CONTEXT element
> > > Obviously, the last is simplest, but it's a bodge, and means that OpenCSD will see VMIDs when its TRCCONFIGR says it won't. It's kind of cleanest to get the real TRCCONFIGR somehow, but how do we do that?
> > We do get TRCCONFIGR in the perf records. We should simply make sure we get the up-to-date value (wherever we are getting it from).
> The copy in PERF_RECORD_AUXINFO (which is a synthetic record created by userspace perf) is, I believe, as I said:
This should say PERF_RECORD_AUXTRACE_INFO. It's still a synthesized record (70) created in userspace, not the kernel.
> "userspace perf creates the metadata copy of TRCCONFIGR based on its request".
> So if the kernel modifies it based on information only the kernel knows, there's no current way to get the actual value. That was what I was trying to address with my suggestions.
> Have I missed some place the actual TRCCONFIGR is already being returned in other perf records?
> Al
On 05/05/2020 05:00 PM, Al Grant wrote:
> > > The patch below does the kernel and userspace side but is not complete. The problem is that userspace perf creates the metadata copy of TRCCONFIGR based on its request (and fills in the other id registers by reading sysfs), but the detection of EL2/E2H happens in the kernel which adjusts TRCCONFIGR, and it's this config which is needed for decode. I
> > > see three ways round this:
> > > - have userspace test to see if the kernel is EL2 (somehow) and adjust
> > > the metadata to mirror what the kernel is doing
> > > - have the kernel pass the adjusted TRCCONFIGR back so perf can put it
> > > in the metadata
> > > - have the perf decoder get the thread id from whichever of VMID and
> > > CONTEXTID is available in a PE_CONTEXT element
> > > Obviously, the last is simplest, but it's a bodge, and means that OpenCSD will see VMIDs when its TRCCONFIGR says it won't. It's kind of cleanest to get the real TRCCONFIGR somehow, but how do we do that?
> > We do get TRCCONFIGR in the perf records. We should simply make sure we get the up-to-date value (wherever we are getting it from).
> The copy in PERF_RECORD_AUXINFO (which is a synthetic record created by userspace perf) is, I believe, as I said:
> "userspace perf creates the metadata copy of TRCCONFIGR based on its request".
> So if the kernel modifies it based on information only the kernel knows, there's no current way to get the actual value. That was what I was trying to address with my suggestions.
> Have I missed some place the actual TRCCONFIGR is already being returned in other perf records?
Sysfs is one place where we could expose this, and it could then be consumed by the perf tool while creating the TRCCONFIGR. Since the EL can't change, this could be a one-time probe activity.
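As a rough illustration of such a one-time probe, perf could read a flag from a sysfs attribute when it builds the metadata. No such attribute is defined in the patch; the helper name `read_flag_file` and whatever path is passed to it are assumptions made purely for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Read a single 0/1 flag from a sysfs-style text file. The attribute this
 * would point at is hypothetical; the caller supplies the path.
 * Returns 0 on success, -1 if the file is missing or malformed.
 */
static int read_flag_file(const char *path, bool *flag)
{
	FILE *f;
	unsigned int val;
	int n;

	assert(path != NULL && flag != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;

	n = fscanf(f, "%u", &val);
	fclose(f);

	if (n != 1 || val > 1)
		return -1;

	*flag = (val == 1);
	return 0;
}
```

If the read succeeds and the flag is set, perf would set the VMID and VMIDOPT bits in its metadata copy of TRCCONFIGR instead of the context ID bit; if the file is absent it would fall back to today's behaviour.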
Cheers Suzuki
Al
Al
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c
index a128b5063f46..96488a0cfdcf 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x.c
@@ -353,8 +353,32 @@ static int etm4_parse_event_config(struct etmv4_drvdata *drvdata,
 	}
 	if (attr->config & BIT(ETM_OPT_CTXTID))
-		/* bit[6], Context ID tracing bit */
-		config->cfg |= BIT(ETM4_CFG_BIT_CTXTID);
+	{
+		/*
+		 * Enable context-id tracing. The assumption is that this
+		 * will work with CONFIG_PID_IN_CONTEXTIDR to trace process
+		 * id changes and support decode of multiple processes.
+		 * But ETM's context id trace traces physical CONTEXTIDR_EL1,
+		 * while the logical CONTEXTIDR_EL1 that is written to on
+		 * process switch is either physical CONTEXTIDR_EL1 or
+		 * CONTEXTIDR_EL2 depending on HCR_EL2.E2H. On principle
+		 * we should continue to use logical CONTEXTIDR_EL1.
+		 * In order to trace physical CONTEXTIDR_EL2, we need to
+		 * enable VMID tracing and use the VMIDOPT flag to trace
+		 * CONTEXTIDR_EL2 rather than VTTBR.VMID in the VMID field.
+		 * Trace decoders will need to inspect TRCCONFIGR and use
+		 * either the CID or the VMID field from the trace packet.
+		 */
+		if (!(is_kernel_in_hyp_mode() &&
+		      (read_sysreg(hcr_el2) & BIT(34)) != 0)) {
The patch looks good to me. We should be careful about the register access, though: the code above assumes that the kernel is running in AArch64 mode, which may not necessarily be true.
Suzuki
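A minimal sketch of the selection logic the patch comment describes, written as plain userspace C so it can be compiled and checked on its own. The helper name `ctxtid_trace_bits()` and its parameters are hypothetical: the execution state, EL2 status and HCR_EL2 value are passed in as arguments rather than read from the system, which also makes the AArch64 caveat explicit. The TRCCONFIGR bit positions (CID bit 6, VMID bit 7, VMIDOPT bit 15) are the ones quoted elsewhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* TRCCONFIGR bits used for thread id tracing (positions as quoted in the thread). */
#define TRCCONFIGR_CID      (UINT32_C(1) << 6)   /* trace CONTEXTIDR_EL1 in the CID field */
#define TRCCONFIGR_VMID     (UINT32_C(1) << 7)   /* enable the VMID field */
#define TRCCONFIGR_VMIDOPT  (UINT32_C(1) << 15)  /* VMID field carries CONTEXTIDR_EL2 */

/* HCR_EL2.E2H is bit 34, matching BIT(34) in the patch. */
#define HCR_EL2_E2H         (UINT64_C(1) << 34)

/*
 * Hypothetical helper: pick the TRCCONFIGR bits for "contextid" tracing.
 * hcr_el2 is only consulted when the kernel is AArch64 and running at EL2,
 * so a caller never needs to read the register in any other configuration.
 */
uint32_t ctxtid_trace_bits(bool aarch64, bool in_hyp, uint64_t hcr_el2)
{
	uint32_t bits;

	if (aarch64 && in_hyp && (hcr_el2 & HCR_EL2_E2H))
		bits = TRCCONFIGR_VMID | TRCCONFIGR_VMIDOPT;
	else
		bits = TRCCONFIGR_CID;

	/* Exactly one of the CID and VMID fields carries the thread id. */
	assert(!(bits & TRCCONFIGR_CID) != !(bits & TRCCONFIGR_VMID));
	return bits;
}
```

In the driver itself the first two arguments would presumably come from the kernel's own configuration and `is_kernel_in_hyp_mode()`; how best to guard the register read is left open here, as it is in the review comment.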
Hi Al, Suzuki,
On Wed, May 06, 2020 at 10:17:15AM +0100, Suzuki Kuruppassery Poulose wrote:
On 05/05/2020 05:00 PM, Al Grant wrote:
> The sysfs is one place where we could expose this and this could be then consumed by the perf tool while creating the TRCCONFIGR. Since the EL can't change, this could be a onetime probe activity.
I spent some time looking into this issue. One thing to note: the metadata is captured at the beginning, when the perf tool starts, which is before the CoreSight driver sets TRCCONFIGR when the perf event is enabled. In particular, in the per-thread case, if the traced task has not yet run on a specific CPU (let's say CPU_a), then CPU_a's TRCCONFIGR will not be set properly until the traced task is migrated to CPU_a.
In the perf tool, the function cs_etmv4_get_config() does not read any value for TRCCONFIGR from sysfs; instead it generates a value for TRCCONFIGR based on perf's 'attr.config'. On the kernel side, the function etm4_parse_event_config() has to maintain the same logic: it also parses 'attr.config' and sets TRCCONFIGR accordingly.
To fix this issue, I think one potential direction is to change the function etm_event_init() in the kernel so that it invokes a function like etm4_parse_event_config(); this would allow TRCCONFIGR to be ready in the initialisation phase. Then, as Suzuki suggested, the perf tool can use a sysfs node to read TRCCONFIGR. To be honest, I haven't verified whether this is feasible, but from reading the code it looks feasible with the flow below:
record__open()
  `-> evsel__open()
    `-> evsel__open_cpu()
      `-> perf_event_open()
        `-> perf_init_event()
          `-> perf_try_init_event()
            `-> etm_event_init()
              `-> Set TRCCONFIGR

record__synthesize()
  `-> perf_event__synthesize_auxtrace_info()
    `-> auxtrace_record__info_fill()
      `-> cs_etm_info_fill()
        `-> Use sysfs to read out TRCCONFIGR
Thanks, Leo
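The duplicated translation logic described above, and the way it drifts apart once the kernel applies the EL2 adjustment, can be sketched as follows. This is an illustration only: `user_trcconfigr()` and `kernel_trcconfigr()` are hypothetical stand-ins for cs_etmv4_get_config() and etm4_parse_event_config(), and the bit positions are the ones given in the format and trcconfigr listings later in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT64(n) (UINT64_C(1) << (n))

/* attr.config bit positions, from the cs_etm "format" listing in this thread. */
#define OPT_CYCACC  12
#define OPT_CTXTID  14
#define OPT_TS      28
#define OPT_RETSTK  29

/* Stand-in for the userspace side: translates attr.config, unaware of EL2. */
uint64_t user_trcconfigr(uint64_t attr_config)
{
	uint64_t cfg = 0;

	if (attr_config & BIT64(OPT_CYCACC))
		cfg |= BIT64(4);   /* cycle counting */
	if (attr_config & BIT64(OPT_TS))
		cfg |= BIT64(11);  /* timestamps */
	if (attr_config & BIT64(OPT_RETSTK))
		cfg |= BIT64(12);  /* return stack */
	if (attr_config & BIT64(OPT_CTXTID))
		cfg |= BIT64(6);   /* context id in the CID field */
	return cfg;
}

/* Stand-in for the kernel side with the proposed EL2/E2H adjustment. */
uint64_t kernel_trcconfigr(uint64_t attr_config, bool el2_e2h)
{
	uint64_t cfg = user_trcconfigr(attr_config);

	if ((attr_config & BIT64(OPT_CTXTID)) && el2_e2h) {
		cfg &= ~BIT64(6);             /* drop CID tracing */
		cfg |= BIT64(7) | BIT64(15);  /* VMID + VMIDOPT instead */
	}
	/* CID tracing and VMIDOPT are never both selected by this sketch. */
	assert(!((cfg & BIT64(6)) && (cfg & BIT64(15))));
	return cfg;
}
```

For contextid plus timestamp, the userspace-style value is 0x840 while the kernel-style value on an EL2/E2H kernel is 0x8880; that difference is the metadata mismatch under discussion.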
On Fri, 8 May 2020 at 00:11, Leo Yan leo.yan@linaro.org wrote:
> The sysfs is one place where we could expose this and this could be then consumed by the perf tool while creating the TRCCONFIGR. Since the EL can't change, this could be a onetime probe activity.
I'm joining this thread late and as such don't have all the context. But sysfs is never a good place to export run-time configuration registers, as there could be multiple perf sessions active at the same time. This isn't a big issue for N:1 topologies but is very real for 1:1.
_______________________________________________
CoreSight mailing list
CoreSight@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/coresight
From: CoreSight coresight-bounces@lists.linaro.org On Behalf Of Mathieu Poirier Sent: 08 May 2020 16:44 To: leo.yan@linaro.org Cc: Coresight ML coresight@lists.linaro.org Subject: Re: Context id tracing with kernel at EL2
> I'm joining this thread on the late and as such don't have all the context. But sysfs is never a good place to export run-time configuration registers as there could be multiple perf sessions active at the same time. This isn't a big issue for N:1 topologies but very real for 1:1.
I agree. Event-specific data (e.g. actual TRCCONFIGR) should be returned through some mechanism specific to the event. For example, we could have a new ioctl() specific to CoreSight events (“return trace config”). Or we could use one of the flag bits in the PERF_RECORD_AUX record. There are 64 bits, of which 4 are currently used as generic flags (PERF_AUX_FLAG_TRUNCATED etc.). If we could reserve some of those as event-specific flags (say PERF_AUX_FLAG_EVENT0..7), then for ETM AUX buffers we could use a flag to indicate if the thread id is in CID or VMID in the trace. That would be my preference.
The suggestion might have been to use sysfs to indicate if the kernel is at EL1 or EL2. That’s not runtime varying, so sysfs should be fine for that. However it then forces perf userspace to make the inference that at EL2 the thread id is always in VMID. That’s currently the case, but it’s the sort of heuristic rule we really don’t want to be pushing into perf.
And if we do that, it will need to be captured at “perf record” time and stored in a new field in PERF_RECORD_AUXTRACE_INFO. The advantage of using an AUX flag is that we don’t change any formats: it’s just using a flag in an existing flags word to help us decode traces that are currently undecodable.
Al
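A minimal sketch of how a decoder might consume such an event-specific AUX flag, assuming one were reserved. `HYP_AUX_FLAG_ETM_TID_IN_VMID` and its bit position are purely hypothetical (no such flag is proposed in concrete form above), as are the helper names; only PERF_AUX_FLAG_TRUNCATED is an existing generic flag.

```c
#include <assert.h>
#include <stdint.h>

/* Existing generic AUX flag, shown only to illustrate coexistence. */
#define PERF_AUX_FLAG_TRUNCATED       UINT64_C(0x01)

/* HYPOTHETICAL event-specific flag: "thread id is in the VMID field". */
#define HYP_AUX_FLAG_ETM_TID_IN_VMID  (UINT64_C(1) << 8)

enum tid_source { TID_IN_CID, TID_IN_VMID };

/* Decide which PE_CONTEXT field holds the thread id for this AUX buffer. */
enum tid_source etm_tid_source(uint64_t aux_flags)
{
	return (aux_flags & HYP_AUX_FLAG_ETM_TID_IN_VMID) ? TID_IN_VMID
							   : TID_IN_CID;
}

/* Pick the thread id out of the two candidate fields of a context element. */
uint32_t etm_thread_id(uint64_t aux_flags, uint32_t cid, uint32_t vmid)
{
	return etm_tid_source(aux_flags) == TID_IN_VMID ? vmid : cid;
}

/* Usage example: generic flags do not disturb the event-specific one. */
void etm_tid_example(void)
{
	uint64_t flags = PERF_AUX_FLAG_TRUNCATED | HYP_AUX_FLAG_ETM_TID_IN_VMID;

	assert(etm_thread_id(flags, 111, 222) == 222);
	assert(etm_thread_id(PERF_AUX_FLAG_TRUNCATED, 111, 222) == 111);
}
```

The appeal of this shape is that the decision travels with each AUX record, so the decoder does not have to infer anything from the exception level.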
On 05/08/2020 04:43 PM, Mathieu Poirier wrote:
> I'm joining this thread on the late and as such don't have all the context. But sysfs is never a good place to export run-time configuration registers as there could be multiple perf sessions active at the same time. This isn't a big issue for N:1 topologies but very real for 1:1.
I don't see how this is affected by the sink topology, as this is an ETM configuration.
By sysfs, I don't necessarily mean under /sys/bus/coresight/.../etmN/
but under:
/sys/bus/event_source/.../cs_etm/
We already expose various information there, e.g. nr_addr_filters, sinks/
So we could expose something like:
trcconfigr/
and describe how each of the individual event configs affects the TRCCONFIGR register, so that the perf tool can construct the TRCCONFIGR value from scratch just by looking at the trcconfigr directory.
e.g.,
# cd /sys/bus/event_source/devices/cs_etm/
# for f in $(find format -type f); do echo $(basename $f) - $(cat $f); done
cycacc - config:12
sinkid - config2:0-31
contextid - config:14
retstack - config:29
timestamp - config:28

$ for f in $(find trcconfigr -type f); do echo $(basename $f) - $(cat $f); done
cycacc - 0x10            // Bit 4
timestamp - 0x800        // Bit 11
contextid - 0x8080       // Bit 15 | Bit 7 on EL2 kernels
          - 0x40         // Bit 6 on EL1 kernels
retstack - 0x1000        // Bit 12
That way, the perf tool can scale automatically to cover future config changes that affect TRCCONFIGR. It simply needs to map the "config-name" to the trace config using the kernel-provided bits. This would also solve the etm3 vs etm4 differences in bit positions, and hopefully cover future IPs.
Suzuki
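The lookup proposed above can be sketched as a small table-driven function. Everything here is an assumption made for illustration: the trcconfigr/ directory does not exist, so the two tables simply hard-code the values from the example listing (one table as an EL1 kernel would publish them, one as an EL2 kernel would), and `build_trcconfigr()` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One entry per file in the proposed (not yet existing) trcconfigr/ directory. */
struct trcconfigr_attr {
	const char *name;  /* option name, e.g. "cycacc" */
	uint32_t mask;     /* TRCCONFIGR bits the kernel would publish for it */
};

/* Values from the example listing, as an EL1 kernel would report them. */
const struct trcconfigr_attr etm4_attrs_el1[] = {
	{ "cycacc", 0x10 }, { "timestamp", 0x800 },
	{ "contextid", 0x40 }, { "retstack", 0x1000 },
};

/* Same options as an EL2 kernel would report them. */
const struct trcconfigr_attr etm4_attrs_el2[] = {
	{ "cycacc", 0x10 }, { "timestamp", 0x800 },
	{ "contextid", 0x8080 }, { "retstack", 0x1000 },
};

/* OR together the masks of every selected option; unknown names are ignored. */
uint32_t build_trcconfigr(const struct trcconfigr_attr *attrs, size_t nattrs,
			  const char *const *opts, size_t nopts)
{
	uint32_t cfg = 0;

	assert(attrs != NULL && opts != NULL);
	for (size_t i = 0; i < nopts; i++)
		for (size_t j = 0; j < nattrs; j++)
			if (strcmp(opts[i], attrs[j].name) == 0)
				cfg |= attrs[j].mask;
	return cfg;
}
```

With the options cycacc and contextid, the EL1 table yields 0x50 and the EL2 table yields 0x8090, without the tool needing to know which exception level the kernel runs at.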
On Fri, 22 May 2020 at 09:30, Suzuki K Poulose suzuki.poulose@arm.com wrote:
> $ for f in $(find trcconfigr -type f); do echo $(basename $f) - $(cat $f); done
> cycacc - 0x10            // Bit 4
> timestamp - 0x800        // Bit 11
> contextid - 0x8080       // Bit 15 | Bit 7 on EL2 kernels
>           - 0x40         // Bit 6 on EL1 kernels
> retstack - 0x1000        // Bit 12
So those are configurations that apply to specific trace sessions. When dealing with concurrent trace sessions, there is no way to guarantee that the information exported to sysfs applies to the trace session that reads it.
On 05/25/2020 04:56 PM, Mathieu Poirier wrote:
> So those are configurations that apply to specific trace sessions. When dealing with concurrent trace sessions there is no way to guarantee the information exported to sysfs applies to the trace session that would read the information.
I think there is a bit of confusion here. The idea is that the perf tool will construct the value of TRCCONFIGR based on the "options" selected by the user, just like it does now for "config1"/"config2",
i.e.,
perf record -e cs_etm/cycacc,timestamp/ blah-blah
perf tool will:
map cycacc -> config |= BIT(12) and trcconfigr |= $(cat /sys/.../trcconfigr/cycacc)
and similarly for all the options. The values under "trcconfigr" are static and only map a given attribute to the TRCCONFIGR register bits.
Thus any session can now create TRCCONFIGR accurately based on the events, by looking the bits up dynamically under sysfs rather than relying on a "compile time" static assumption about the ETM model.
Cheers Suzuki
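For comparison, the existing half of that mapping (option name to attr config bits) is already driven by the format strings shown earlier, such as "config:12" and "config2:0-31". A minimal sketch of parsing such a string is below; `parse_format_spec()` and `format_field_mask()` are hypothetical helpers written for this illustration, not the perf tool's own parser.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Parsed form of a format string such as "config:12" or "config2:0-31". */
struct format_field {
	unsigned int word;  /* 0 = attr.config, 1 = config1, 2 = config2 */
	unsigned int lo;    /* lowest bit of the field */
	unsigned int hi;    /* highest bit of the field */
};

/* Returns 0 on success, -1 if the string is not a recognised format spec. */
int parse_format_spec(const char *spec, struct format_field *out)
{
	unsigned int word = 0, lo, hi;
	const char *p;
	int n;

	if (strncmp(spec, "config", 6) != 0)
		return -1;
	p = spec + 6;
	if (*p == '1' || *p == '2')
		word = (unsigned int)(*p++ - '0');
	if (*p++ != ':')
		return -1;

	/* Either a single bit ("12") or an inclusive range ("0-31"). */
	n = sscanf(p, "%u-%u", &lo, &hi);
	if (n == 1)
		hi = lo;
	else if (n != 2)
		return -1;
	if (lo > hi || hi > 63)
		return -1;

	out->word = word;
	out->lo = lo;
	out->hi = hi;
	return 0;
}

/* Bit mask covering the field within its 64-bit config word. */
uint64_t format_field_mask(const struct format_field *f)
{
	unsigned int width;

	assert(f->lo <= f->hi && f->hi <= 63);
	width = f->hi - f->lo + 1;
	return ((width == 64) ? ~UINT64_C(0) : ((UINT64_C(1) << width) - 1)) << f->lo;
}
```

A trcconfigr/ entry would then supply the second half of the mapping, from the same option name to TRCCONFIGR bits.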
Hi,
On Tue, 26 May 2020 at 17:21, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 05/25/2020 04:56 PM, Mathieu Poirier wrote:
On Fri, 22 May 2020 at 09:30, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 05/08/2020 04:43 PM, Mathieu Poirier wrote:
On Fri, 8 May 2020 at 00:11, Leo Yan <leo.yan@linaro.org mailto:leo.yan@linaro.org> wrote:
Hi Al, Suzuki,
On Wed, May 06, 2020 at 10:17:15AM +0100, Suzuki Kuruppassery Poulose wrote:
> On 05/05/2020 05:00 PM, Al Grant wrote:
> > > > The patch below does the kernel and userspace side but is not complete.
> > > > The problem is that userspace perf creates the metadata copy of
> > > > TRCCONFIGR based on its request (and fills in the other id registers
> > > > by reading sysfs), but the detection of EL2/E2H happens in the kernel
> > > > which adjusts TRCCONFIGR, and it's this config which is needed for
> > > > decode. I see three ways round this:
> > > >
> > > > - have userspace test to see if the kernel is EL2 (somehow) and adjust
> > > > the metadata to mirror what the kernel is doing
> > > >
> > > > - have the kernel pass the adjusted TRCCONFIGR back so perf can put it
> > > > in the metadata
> > > >
> > > > - have the perf decoder get the thread id from whichever of VMID and
> > > > CONTEXTID is available in a PE_CONTEXT element
> > > >
> > > > Obviously, the last is simplest, but it's a bodge, and means that
> > > > OpenCSD will see VMIDs when its TRCCONFIGR says it won't. It's kind of
> > > > cleanest to get the real TRCCONFIGR somehow, but how do we do that?
> > >
> > > We do get TRCCONFIGR in the perf records. We should simply make sure we get
> > > the uptodate value (wherever we are getting it from).
> >
> > The copy in PERF_RECORD_AUXINFO (which is a synthetic record created
> > by userspace perf) is, I believe, as I said:
> >
> > "userspace perf creates the metadata copy of TRCCONFIGR based on
> > its request".
> >
> > So if the kernel modifies it based on information only the kernel knows,
> > there's no current way to get the actual value. That was what I was trying
> > to address with my suggestions.
> >
> > Have I missed some place the actual TRCCONFIGR is already being returned
> > in other perf records?
>
> The sysfs is one place where we could expose this and this could be then
> consumed by the perf tool while creating the TRCCONFIGR. Since the EL
> can't change, this could be a onetime probe activity.
I'm joining this thread late and as such don't have all the context. But sysfs is never a good place to export run-time configuration registers as there could be multiple perf sessions active at the same time. This isn't a big issue for N:1 topologies but is very real for 1:1.
I don't see how this is affected by the sink topology, as this is an ETM configuration.
By sysfs, I mean, not necessarily under /sys/bus/coresight/.../etmN/
but, under :
/sys/bus/event_source/.../cs_etm/
We already expose various different information there. e.g, nr_addr_filters, sinks/
So we could expose something like :
trcconfigr/
and could describe how each of the individual event configs affect the TRCCONFIGR register. So that the perf tool can construct the TRCCONFIGR register from scratch just by looking at the trcconfigr directory.
e.g,
# cd /sys/bus/event_source/devices/cs_etm/
# for f in $(find format -type f); do echo $(basename $f) - $(cat $f); done
cycacc - config:12
sinkid - config2:0-31
contextid - config:14
retstack - config:29
timestamp - config:28

$ for f in $(find trcconfigr -type f); do echo $(basename $f) - $(cat $f); done
cycacc - 0x10 // Bit 4
timestamp - 0x800 // Bit 11
contextid - 0x8080 // Bit 15 | Bit 7 on EL2 kernels
          - 0x40 // Bit 6 on EL1 kernels
retstack - 0x1000 // Bit 12
So those are configurations that apply to specific trace sessions. When dealing with concurrent trace sessions there is no way to guarantee the information exported to sysfs applies to the trace session that would read the information.
I think there is a bit of confusion here; The idea is the perf tool will construct the value of the traceconfigr based on the "options" selected by the user, just like it does now for "config1"/"config2"
i.e,
perf record -e cs_etm/cycacc,timestamp/ blah-blah
perf tool will :
map cycacc -> config |= BIT(12) and trcconfigr |= $(cat /sys/.../trcconfigr/cycacc)
and similarly for all the options. The values under "trcconfigr" are static and only map a given attribute to the trace config register bits.
Thus, any session can now construct the TRCCONFIGR accurately based on the events, by dynamically looking up the values in sysfs, rather than relying on a "compile time" static assumption about the ETM model.
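To make the lookup concrete, here is a minimal sketch in C of how a tool could build the value, assuming the per-option masks have already been read from the proposed trcconfigr directory. The struct opt_mask type, the build_trcconfigr() helper and the two example tables are hypothetical names used only for illustration; the mask values are the ones from the listing above, and none of this is existing perf code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* perf "format" bit positions, as listed in the format/ directory above. */
enum { OPT_CYCACC = 12, OPT_CTXTID = 14, OPT_TS = 28, OPT_RETSTK = 29 };

/* Hypothetical in-memory copy of one trcconfigr/<option> file. */
struct opt_mask {
    unsigned int config_bit;   /* bit position in perf_event_attr.config */
    uint32_t trcconfigr_mask;  /* mask the kernel would export for it */
};

/* OR together the masks of every option that is set in config. */
static uint32_t build_trcconfigr(uint64_t config,
                                 const struct opt_mask *map, size_t n)
{
    uint32_t trcconfigr = 0;

    assert(map != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (config & (UINT64_C(1) << map[i].config_bit))
            trcconfigr |= map[i].trcconfigr_mask;
    }
    return trcconfigr;
}

/* Example masks from the listing above: an EL1 kernel and an EL2 kernel. */
static const struct opt_mask el1_map[] = {
    { OPT_CYCACC, 0x10 }, { OPT_CTXTID, 0x40 },
    { OPT_TS, 0x800 },    { OPT_RETSTK, 0x1000 },
};
static const struct opt_mask el2_map[] = {
    { OPT_CYCACC, 0x10 }, { OPT_CTXTID, 0x8080 },
    { OPT_TS, 0x800 },    { OPT_RETSTK, 0x1000 },
};
```

With these tables, cs_etm/cycacc,timestamp/ gives 0x810 on either kernel, while contextid alone gives 0x40 with the EL1 table and 0x8080 with the EL2 table. The session-specific part is the config word; the table is the static, kernel-specific part.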
So how does this work when I copy the captured data off my target and try to decode offline?
Regards
Mike
Cheers Suzuki _______________________________________________ CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight
-----Original Message----- From: CoreSight coresight-bounces@lists.linaro.org On Behalf Of Mike Leach Sent: 26 May 2020 17:33 To: Suzuki Poulose Suzuki.Poulose@arm.com Cc: Coresight ML coresight@lists.linaro.org Subject: Re: Context id tracing with kernel at EL2
Hi,
On Tue, 26 May 2020 at 17:21, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 05/25/2020 04:56 PM, Mathieu Poirier wrote:
On Fri, 22 May 2020 at 09:30, Suzuki K Poulose suzuki.poulose@arm.com
wrote:
On 05/08/2020 04:43 PM, Mathieu Poirier wrote:
So how does this work when I copy the captured data off my target and try to decode offline?
Same as it did before, but this time TRCCONFIGR in the perf.data's PERF_RECORD_AUXTRACE_INFO will correctly indicate that the thread id is in vmid.
The problem we have now is that when "perf record" is constructing PERF_RECORD_AUXTRACE_INFO, which is a synthetic record that the kernel has nothing to do with, the session-specific value of TRCCONFIGR perf uses is one that it synthesizes based on the options it is passing into the kernel, and this doesn't take account of any adjustments made by the kernel, e.g. to track CONTEXTIDR_EL2 in the vmid field. The idea is to have sysfs provide enough (non-session-specific) info about the kernel, so that "perf record" can synthesize a TRCCONFIGR that is both session-specific and adjusted to take account of what the kernel is doing. In effect "perf record" is mirroring the kernel's own logic.
The alternative is to have the kernel pass something back in a record (e.g. a flag in PERF_RECORD_AUX), again to help "perf record" construct a correct TRCCONFIGR.
It would be much cleaner if the kernel generated all metadata needed to decode the AUX buffer, but that’s not how perf is designed.
Al
-- Mike Leach Principal Engineer, ARM Ltd. Manchester Design Centre. UK _______________________________________________ CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Hi Al,
On Tue, 26 May 2020 at 19:53, Al Grant Al.Grant@arm.com wrote:
Same as it did before, but this time TRCCONFIGR in the perf.data's PERF_RECORD_AUXTRACE_INFO will correctly indicate that the thread id is in vmid.
Yeah - that makes sense - I skimmed Suzuki's code and somehow assumed it was grabbing the data live on decode rather than doing the thing that makes most sense - putting it in the INFO structures after the record has finished.
It would be much cleaner if the kernel generated all metadata needed to decode the AUX buffer, but that’s not how perf is designed.
I did think that perf was designed for exactly the task of extracting data from the kernel to allow user side analysis!
However I would agree there doesn't appear to be an obvious route out for the specifics we need other than embed it in the data we are outputting - i.e. the raw trace buffer. sysfs is simpler as long as it is read at the end of the record session.
Regards
Mike
-----Original Message----- From: Mike Leach mike.leach@linaro.org Sent: 27 May 2020 10:49 To: Al Grant Al.Grant@arm.com Cc: Suzuki Poulose Suzuki.Poulose@arm.com; Coresight ML coresight@lists.linaro.org Subject: Re: Context id tracing with kernel at EL2
However I would agree there doesn't appear to be an obvious route out for the specifics we need other than embed it in the data we are outputting - i.e. the raw trace buffer. sysfs is simpler as long as it is read at the end of the record session.
It should be read at the beginning, when "perf record" creates PERF_RECORD_AUXTRACE_INFO. "perf record" can be sent to a pipe and by the time whatever's reading it sees an AUX buffer it should already have seen the metadata on how to decode it.
Al
On 05/26/2020 05:33 PM, Mike Leach wrote:
So how does this work when I copy the captured data off my target and try to decode offline?
Just like it works now. There is no change in flow for the perf tool. The only difference is how the perf tool would emit the TRCCONFIGR values in the perf.data samples: instead of assuming that perf_event.attr.config1.CID => TRCCONFIGR[bit6], it would use the sysfs exported values to figure out the TRCCONFIGR setting.
Suzuki
Hi all,
Resurrecting an old thread. See more below. Short summary:
With the kernel running at EL2, the PID is in CONTEXTIDR_EL2 and the normal CONTEXTID tracing (which uses CONTEXTIDR_EL1) is not useful. The solution is to use VMID (with CONTEXTIDR_EL2 emitted as VMID) for the PID on an EL2 kernel.
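On the decode side, the rule in this summary could look roughly like the sketch below. It assumes the bit assignments mentioned earlier in the thread (bit 6 for context ID tracing, bit 7 together with bit 15 for CONTEXTIDR_EL2 in the VMID field). struct pe_context and pe_context_tid() are made-up names for illustration and are not the OpenCSD or perf API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* TRCCONFIGR bits as used earlier in this thread. */
#define TRCCONFIGR_CID     (UINT32_C(1) << 6)   /* context ID tracing */
#define TRCCONFIGR_VMID    (UINT32_C(1) << 7)   /* VMID tracing */
#define TRCCONFIGR_VMIDOPT (UINT32_C(1) << 15)  /* VMID field carries CONTEXTIDR_EL2 */

/* Hypothetical, simplified view of a decoded PE_CONTEXT element. */
struct pe_context {
    bool ctxt_id_valid;
    bool vmid_valid;
    uint32_t context_id;
    uint32_t vmid;
};

/*
 * Pick the thread id from the field that TRCCONFIGR says holds it.
 * Returns true and stores the id in *tid if the element carries one.
 */
static bool pe_context_tid(uint32_t trcconfigr,
                           const struct pe_context *ctx, uint32_t *tid)
{
    const uint32_t el2_bits = TRCCONFIGR_VMID | TRCCONFIGR_VMIDOPT;

    assert(ctx != NULL && tid != NULL);

    /* EL2 kernel: the PID was traced from CONTEXTIDR_EL2 via the VMID field. */
    if ((trcconfigr & el2_bits) == el2_bits) {
        if (!ctx->vmid_valid)
            return false;
        *tid = ctx->vmid;
        return true;
    }

    /* EL1 kernel: the PID is in the ordinary context ID field. */
    if (trcconfigr & TRCCONFIGR_CID) {
        if (!ctx->ctxt_id_valid)
            return false;
        *tid = ctx->context_id;
        return true;
    }

    return false;
}
```

The point of keying on TRCCONFIGR rather than on "whichever field is present" is that the decoder then never sees a VMID it was told not to expect, which was the objection to the third approach in the original mail.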
This needs to be communicated to the perf tool somehow, so that it can decode the trace appropriately. One of the proposals was to expose the TRCCONFIGR bits for each of the individual event->attr.config options. I have hacked up a patch [0] to do this, and here are two ways to expose this:
Option 1: Trace Config exposed matching the format name:
-----
$ pwd
/sys/bus/event_source/devices/cs_etm

$ ls trace_config/
contextid cycacc retstack timestamp

cs_etm# for f in $(find trace_config -type f)
do
echo $(basename $f) : $(cat $f)
done

timestamp : 0x800
contextid : 0x40
cycacc : 0x10
retstack : 0x1000
Disadvantages: It is really tricky for the perf tool to figure out the name of the "config" option from the event->attr.config bits, as the parsing of the format is not exported elsewhere (or I didn't look deep enough).
Option 2: Trace Config exposed as the "bit" positions.

$ ls trace_config/
12 14 28 29

$ for f in $(find . -type f)
do
echo $(basename $f) : $(cat $f)
done

28 : 0x800
14 : 0x40
12 : 0x10
29 : 0x1000
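Advantage: The perf tool can iterate over the event->attr.config bits and do something like:

	unsigned long config = event->attr.config;
	unsigned long trcconfigr = 0;
	int i;

	for_each_set_bit(i, &config, BITS_PER_LONG)
		trcconfigr |= sysfs_read_trace_config(i);

where:

	/* Read the TRCCONFIGR mask for config bit i; 0 if it is not exported. */
	static unsigned long sysfs_read_trace_config(int i)
	{
		char path[MAX_LEN];
		FILE *fp;
		unsigned long trcfg = 0;

		/* TRACE_CONFIG_PATH stands for the sysfs trace_config directory. */
		snprintf(path, sizeof(path), "%s/%d", TRACE_CONFIG_PATH, i);
		fp = fopen(path, "r");
		if (!fp)
			return 0;
		if (fscanf(fp, "%lx", &trcfg) != 1)
			trcfg = 0;
		fclose(fp);
		return trcfg;
	}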
Disadvantage: It is not human reader friendly. But it doesn't have to be, as it is for the tools to construct the trcconfigr.
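The pseudocode for Option 2 can be turned into a small self-contained C sketch. It assumes a directory whose files are named after the perf config bit positions and each hold one hexadecimal TRCCONFIGR mask, as in the listing above. The function names and the `dir` parameter are hypothetical, and the `trace_config` directory itself exists only with the proposed patch applied.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Read one hexadecimal mask from <dir>/<bit>; a missing file contributes 0. */
static uint64_t read_trace_config(const char *dir, int bit)
{
	char path[512];
	unsigned long long val = 0;
	FILE *fp;

	assert(dir != NULL);
	snprintf(path, sizeof(path), "%s/%d", dir, bit);
	fp = fopen(path, "r");
	if (!fp)
		return 0;
	/* "%llx" accepts an optional "0x" prefix, matching the sysfs output. */
	if (fscanf(fp, "%llx", &val) != 1)
		val = 0;
	fclose(fp);
	return (uint64_t)val;
}

/* OR together the masks for every bit set in the event's attr.config. */
static uint64_t build_trcconfigr(const char *dir, uint64_t config)
{
	uint64_t trcconfigr = 0;

	for (int bit = 0; bit < 64; bit++)
		if (config & (UINT64_C(1) << bit))
			trcconfigr |= read_trace_config(dir, bit);
	return trcconfigr;
}
```

Compared with the pseudocode, this version closes the file, bounds the path with snprintf() and treats an unreadable entry as contributing no bits.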
Option 3:
Another, completely different option is to fix up TRCIDR2.CID = 0 via sysfs while running in VHE. This would indicate that CONTEXTID tracing is not supported by the ETM, while the VMID support remains visible. The perf tool can use this to detect that the VMID is used as the PID.
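A rough sketch of how a tool might act on Option 3. It assumes that "TRCIDR2.CID" refers to the CIDSIZE field in bits [9:5] and that the VMID size is reported in VMIDSIZE, bits [14:10]; that layout is my reading of ETMv4 and should be checked against the architecture specification. All names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed TRCIDR2 field layout (verify against the ETMv4 specification). */
#define TRCIDR2_CIDSIZE(v)  (((v) >> 5) & 0x1f)
#define TRCIDR2_VMIDSIZE(v) (((v) >> 10) & 0x1f)

enum pid_source { PID_NONE, PID_IN_CONTEXTID, PID_IN_VMID };

/*
 * With the proposed fix-up, a VHE kernel would report CIDSIZE == 0, so a
 * non-zero VMIDSIZE is the only remaining place a PID could be traced.
 */
static enum pid_source pid_source_from_trcidr2(uint32_t trcidr2)
{
	if (TRCIDR2_CIDSIZE(trcidr2))
		return PID_IN_CONTEXTID;
	if (TRCIDR2_VMIDSIZE(trcidr2))
		return PID_IN_VMID;
	return PID_NONE;
}
```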
Thoughts ?
Suzuki
[0] ---- Expose trace config via sysfs

HACK: coresight: etm4x: Expose trace config via sysfs

Not-yet-signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 9d61a71da96f..58e86267fa37 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -35,6 +35,55 @@ PMU_FORMAT_ATTR(retstack, "config:" __stringify(ETM_OPT_RETSTK));
 /* Sink ID - same for all ETMs */
 PMU_FORMAT_ATTR(sinkid, "config2:0-31");
 
+static struct etm_trace_config_map *etm_cfg_map;
+
+#define ETM_TRCONFIGR_ATTR(_config) \
+	&((struct dev_ext_attribute []) { \
+		{ \
+			__ATTR(_config, S_IRUGO, etm_trace_config_show, NULL), \
+			(void *)_config \
+		} \
+	})[0].attr.attr
+
+static unsigned long etm_trace_config_map_cfg(unsigned long cfg)
+{
+	struct etm_trace_config_map *map = etm_cfg_map;
+
+	if (!map)
+		return 0;
+
+	for (; map->cfg; map++)
+		if (map->cfg == cfg)
+			return map->trcfg;
+	return 0;
+}
+
+static ssize_t etm_trace_config_show(struct device *dev,
+				     struct device_attribute *attr,
+				     char *buf)
+{
+	struct dev_ext_attribute *eattr;
+	unsigned long cfg;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+	cfg = (unsigned long)eattr->var;
+
+	return snprintf(buf, PAGE_SIZE, "0x%lx\n", etm_trace_config_map_cfg(cfg));
+}
+
+static struct attribute *etm_pmu_trace_config_attr[] = {
+	ETM_TRCONFIGR_ATTR(ETM_OPT_CYCACC),
+	ETM_TRCONFIGR_ATTR(ETM_OPT_CTXTID),
+	ETM_TRCONFIGR_ATTR(ETM_OPT_TS),
+	ETM_TRCONFIGR_ATTR(ETM_OPT_RETSTK),
+	NULL,
+};
+
+static const struct attribute_group etm_pmu_trace_config_group = {
+	.name = "trace_config",
+	.attrs = etm_pmu_trace_config_attr,
+};
+
 static struct attribute *etm_config_formats_attr[] = {
 	&format_attr_cycacc.attr,
 	&format_attr_contextid.attr,
@@ -61,6 +110,7 @@ static const struct attribute_group etm_pmu_sinks_group = {
 static const struct attribute_group *etm_pmu_attr_groups[] = {
 	&etm_pmu_format_group,
 	&etm_pmu_sinks_group,
+	&etm_pmu_trace_config_group,
 	NULL,
 };
 
@@ -600,6 +650,12 @@ void etm_perf_del_symlink_sink(struct coresight_device *csdev)
 	csdev->ea = NULL;
 }
 
+void etm_install_trace_config_map(struct etm_trace_config_map *cfg_map)
+{
+	if (!etm_cfg_map && cfg_map)
+		etm_cfg_map = cfg_map;
+}
+
 static int __init etm_perf_init(void)
 {
 	int ret;
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.h b/drivers/hwtracing/coresight/coresight-etm-perf.h
index 015213abe00a..01679988debb 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.h
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.h
@@ -57,6 +57,16 @@ struct etm_event_data {
 	struct list_head * __percpu *path;
 };
 
+/**
+ * struct etm_trace_config_map - Map perf config to TRCONFIGR bits for userspace
+ * @cfg: event->attr.config value
+ * @trcfg: TRCONFIGR setting for @cfg
+ */
+struct etm_trace_config_map {
+	unsigned long cfg;
+	unsigned long trcfg;
+};
+
 #ifdef CONFIG_CORESIGHT
 int etm_perf_symlink(struct coresight_device *csdev, bool link);
 int etm_perf_add_symlink_sink(struct coresight_device *csdev);
@@ -69,6 +79,8 @@ static inline void *etm_perf_sink_config(struct perf_output_handle *handle)
 		return data->snk_config;
 	return NULL;
 }
+
+void etm_install_trace_config_map(struct etm_trace_config_map *cfg_map);
 #else
 static inline int etm_perf_symlink(struct coresight_device *csdev, bool link)
 { return -EINVAL; }
@@ -80,6 +92,7 @@ static inline void *etm_perf_sink_config(struct perf_output_handle *handle)
 	return NULL;
 }
 
+static void etm_install_trace_config_map(struct etm_trace_config_map *cfg_map) {}
 #endif /* CONFIG_CORESIGHT */
 
 #endif
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c
index 00c9f0bb8b1a..e528afc9a56e 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x.c
@@ -1591,6 +1591,15 @@ static struct amba_driver etm4x_driver = {
 	.id_table = etm4_ids,
 };
 
+#define ETM4x_TRACE_CONFIG_MAP_SIZE 5
+static struct etm_trace_config_map etm4x_map[ETM4x_TRACE_CONFIG_MAP_SIZE] = {
+	{ ETM_OPT_CTXTID, 0 }, /* filled in dynamically for ETM_OPT_CTXTID */
+	{ ETM_OPT_CYCACC, BIT(4) },
+	{ ETM_OPT_TS, BIT(11) },
+	{ ETM_OPT_RETSTK, BIT(12) },
+	{ 0, 0 },
+};
+
 static int __init etm4x_init(void)
 {
 	int ret;
@@ -1605,6 +1614,10 @@ static int __init etm4x_init(void)
 	if (ret) {
 		pr_err("Error registering etm4x driver\n");
 		etm4_pm_clear();
+	} else {
+		etm4x_map[0].trcfg = is_kernel_in_hyp_mode() ?
+				(BIT(7) | BIT(15)) : BIT(6);
+		etm_install_trace_config_map(&etm4x_map[0]);
 	}
 
 	return ret;
---
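To make the table logic in the patch easier to follow outside the kernel, here is a user-space model of it. It is only a sketch: the `OPT_*` values are the config bit positions shown in the format listing (12, 14, 28, 29), `kernel_at_el2` stands in for is_kernel_in_hyp_mode(), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BIT(n) (1UL << (n))
#define MAP_SIZE 5

/* Perf config bit numbers, as listed in the cs_etm "format" directory. */
enum { OPT_CYCACC = 12, OPT_CTXTID = 14, OPT_TS = 28, OPT_RETSTK = 29 };

struct trace_config_map {
	unsigned long cfg;    /* perf config bit number; 0 terminates the table */
	unsigned long trcfg;  /* TRCCONFIGR bits for that option */
};

/* Fill a MAP_SIZE-entry table; only the context ID entry depends on the EL. */
static void fill_map(struct trace_config_map *map, bool kernel_at_el2)
{
	assert(map != NULL);
	map[0] = (struct trace_config_map){ OPT_CTXTID,
		kernel_at_el2 ? (BIT(7) | BIT(15)) : BIT(6) };
	map[1] = (struct trace_config_map){ OPT_CYCACC, BIT(4) };
	map[2] = (struct trace_config_map){ OPT_TS, BIT(11) };
	map[3] = (struct trace_config_map){ OPT_RETSTK, BIT(12) };
	map[4] = (struct trace_config_map){ 0, 0 };
}

/* Same linear search as etm_trace_config_map_cfg(): unknown options give 0. */
static unsigned long map_lookup(const struct trace_config_map *map,
				unsigned long cfg)
{
	assert(map != NULL);
	for (; map->cfg; map++)
		if (map->cfg == cfg)
			return map->trcfg;
	return 0;
}
```

On an EL2 kernel the contextid entry reads back as 0x8080, on an EL1 kernel as 0x40, which is the only value that differs between the two cases.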
On 05/27/2020 10:20 AM, Suzuki K Poulose wrote:
On 05/26/2020 05:33 PM, Mike Leach wrote:
Hi,
On Tue, 26 May 2020 at 17:21, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 05/25/2020 04:56 PM, Mathieu Poirier wrote:
On Fri, 22 May 2020 at 09:30, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 05/08/2020 04:43 PM, Mathieu Poirier wrote:
On Fri, 8 May 2020 at 00:11, Leo Yan <leo.yan@linaro.org
I'm joining this thread late and as such don't have all the context. But sysfs is never a good place to export run-time configuration registers, as there could be multiple perf sessions active at the same time. This isn't a big issue for N:1 topologies but very real for 1:1.
I don't see how this is affected by the sink topology, as this is an ETM configuration.
By sysfs, I mean, not necessarily under /sys/bus/coresight/.../etmN/
but, under :
/sys/bus/event_source/.../cs_etm/
We already expose various different information there. e.g, nr_addr_filters, sinks/
So we could expose something like :
trcconfigr/
and could describe how each of the individual event configs affect the TRCCONFIGR register. So that the perf tool can construct the TRCCONFIGR register from scratch just by looking at the trcconfigr directory.
e.g,
# cd /sys/bus/event_source/devices/cs_etm/
# for f in $(find format -type f); do echo $(basename $f) - $(cat $f); done
cycacc - config:12
sinkid - config2:0-31
contextid - config:14
retstack - config:29
timestamp - config:28

$ for f in $(find trcconfigr -type f); do echo $(basename $f) - $(cat $f); done
cycacc - 0x10 // Bit 4
timestamp - 0x800 // Bit 11
contextid - 0x8080 // Bit15 | Bit 7 On EL2 kernels
          - 0x40 // Bit 6 on EL1 kernels
retstack - 0x1000 // Bit12
So those are configurations that apply to specific trace sessions. When dealing with concurrent trace sessions there is no way to guarantee the information exported to sysfs applies to the trace session that would read the information.
I think there is a bit of confusion here; the idea is that the perf tool will construct the value of the traceconfigr based on the "options" selected by the user, just like it does now for "config"/"config2"
i.e,
perf record -e cs_etm/cycacc,timestamp/ blah-blah
perf tool will :
map cycacc -> config |= BIT(12) and trcconfigr |= $(cat /sys/.../trcconfigr/cycacc)
and similarly for all the options. The values under the "trcconfigr" are static and only map a given attribute to the traceconfig register bits.
Thus, any session can now construct the TRCCONFIGR accurately from its events by looking the values up dynamically in sysfs, rather than relying on a "compile time" static assumption about the ETM model.
So how does this work when I copy the captured data off my target and try to decode offline?
Just like it works now. There is no change in flow for the perf tool. The only difference is how the perf tool would emit the TRCCONFIGR values in the perf.data samples. Instead of assuming that perf_event.attr.config1.CID => TRCCONFIGR[bit6], it uses the sysfs-exported values to figure out the TRCCONFIGR setting.
Suzuki
Cc: Mathieu, Al, Leo, Anshuman
Please let me know what you think about the proposal below.
On 09/16/2020 10:19 AM, Suzuki K Poulose wrote:
Hi all,
Resurrecting an old thread. See more below. Short summary:
With the kernel running at EL2, the PID is in CONTEXTIDR_EL2 and the normal CONTEXTID tracing (which uses CONTEXTIDR_EL1) is not useful. The solution is to use VMID (with CONTEXTIDR_EL2 emitted as VMID) for the PID on an EL2 kernel.
This needs to be communicated to the perf tool somehow, so that it can decode the trace appropriately. One of the proposals was to expose the TRCCONFIGR bits for each of the individual event->attr.config options. I have hacked up a patch [0] to do this and here are two ways to expose this:

Option 1: Trace Config exposed matching the format name:

$ pwd
/sys/bus/event_source/devices/cs_etm

$ ls trace_config/
contextid  cycacc  retstack  timestamp

cs_etm# for f in $(find trace_config -type f)
do
	echo $(basename $f) : $(cat $f)
done
timestamp : 0x800
contextid : 0x40
cycacc : 0x10
retstack : 0x1000

Disadvantage: It is really tricky for the perf tool to figure out the name of the "config" option from the event->attr.config bits, as the parsing of the format is not exported elsewhere (or I didn't look deep enough).

Adding a bit more clarity here: the perf tool needs to know the name of the "config" option for a given bit position to decode the traceconfig. I.e., if event->attr.config == BIT(12) | BIT(14), the perf tool now needs to go back and look at

/sys/bus/event_source/.../cs_etm/format/

and figure out the names corresponding to BIT(12) and BIT(14). Since the initial conversion of the name to the config bit is performed by the perf tool's parsing logic, a reverse mapping is not available to the cs_etm code, unless we hard-code BIT(12) => cycacc in the perf tool, which is something we are trying to avoid so that future changes are supported seamlessly.
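To illustrate the reverse mapping that Option 1 would require, here is a minimal sketch that finds the option name for a given config bit by scanning format entries such as "config:12". The `format_entry` struct and the function are hypothetical; a real tool would fill the table by reading the files under the format directory, and this sketch deliberately ignores multi-bit ranges and fields other than "config".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One entry of the PMU "format" directory: file name and file contents. */
struct format_entry {
	const char *name;  /* e.g. "cycacc" */
	const char *spec;  /* e.g. "config:12" */
};

/*
 * Return the option name whose spec is exactly "config:<bit>", or NULL.
 * Specs on other fields ("config2:...") and bit ranges ("0-31") are skipped.
 */
static const char *format_name_for_bit(const struct format_entry *fmt,
				       size_t n, int bit)
{
	assert(fmt != NULL);
	for (size_t i = 0; i < n; i++) {
		int parsed = -1;
		char tail = 0;
		/* The trailing "%c" detects extra characters such as "-31". */
		int got = sscanf(fmt[i].spec, "config:%d%c", &parsed, &tail);

		if ((got == 1 || (got == 2 && tail == '\n')) && parsed == bit)
			return fmt[i].name;
	}
	return NULL;
}
```

The sketch shows why Option 1 is awkward for the tool: every lookup has to go back through the format strings, which the perf parser has already consumed once in the other direction.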
CoreSight mailing list CoreSight@lists.linaro.org https://lists.linaro.org/mailman/listinfo/coresight
Option 1: Trace Config exposed matching the format name:

$ pwd
/sys/bus/event_source/devices/cs_etm

$ ls trace_config/
contextid  cycacc  retstack  timestamp

cs_etm# for f in $(find trace_config -type f)
do
	echo $(basename $f) : $(cat $f)
done
timestamp : 0x800
contextid : 0x40
cycacc : 0x10
retstack : 0x1000

Disadvantage: It is really tricky for the perf tool to figure out the name of the "config" option from the event->attr.config bits, as the parsing of the format is not exported elsewhere (or I didn't look deep enough).
I don't like this. TRCCONFIG is a dynamic register whose value at any given time is specific to a perf session. Having it exposed in sysfs means that sysfs now exposes data about any number of perf sessions, just so that each instance of "perf record" can read data about its own session - and hopefully read the right data at the right time. It might work but it's not very clean.
Option 2: Trace Config exposed as the "bit" positions.

$ ls trace_config/
12  14  28  29

$ for f in $(find . -type f)
do
	echo $(basename $f) : $(cat $f)
done
28 : 0x800
14 : 0x40
12 : 0x10
29 : 0x1000

Advantage: The perf tool can iterate over the event->attr.config bits and do something like:

trconfigr = 0
for_each_set_bit(i, event->attr.config)
	trconfigr |= sysfs_read_trace_config(i)

where:

sysfs_read_trace_config(int i)
{
	char path[MAX_LEN];
	FILE *fp;
	unsigned long trcfg = 0;

	sprintf(path, "%s/%d", $TRACE_CONFIG_PATH, i);
	fp = fopen(path, "r");
	if (fp)
		fscanf(fp, "%lx", &trcfg);
	return trcfg;
}

Disadvantage: It is not human-reader friendly. But it doesn't have to be, as it is for the tools to construct the trcconfigr.
Same problem as above.
Option 3:
Another completely different option is to fixup the TRCIDR2.CID = 0, while running in VHE, via sysfs. This will indicate that the CONTEXTID tracing is not supported by the ETM, but the VMID support should be visible. This can be used by the perf-tool to detect that the VMID is used as the PID.
I don't like misleading the user about ID registers, but that does kind of work.
Thoughts ?
Another solution would be a specific immutable sysfs entry for the ETM, for the sole purpose of saying "I'm running at EL2 and will put the process id in VMID". Then "perf record" can read that. The question is then how it should put that in perf.data. It could either use TRCCONFIGR (which it's already faking) or TRCIDR2 as you suggest, but a more direct solution would be to use one of the spare header flags in PERF_RECORD_AUXDATA.
Would it be possible to reserve a few flags in the capabilities area of the main mmap buffer, to indicate "flags that tell you about the AUX data"? That would be another way for the kernel to tell userspace what it's doing.
Al
On 09/16/2020 10:58 AM, Al Grant wrote:
Option 1: Trace Config exposed matching the format name:

$ pwd
/sys/bus/event_source/devices/cs_etm

$ ls trace_config/
contextid  cycacc  retstack  timestamp

cs_etm# for f in $(find trace_config -type f)
do
	echo $(basename $f) : $(cat $f)
done
timestamp : 0x800
contextid : 0x40
cycacc : 0x10
retstack : 0x1000

Disadvantage: It is really tricky for the perf tool to figure out the name of the "config" option from the event->attr.config bits, as the parsing of the format is not exported elsewhere (or I didn't look deep enough).
I don't like this. TRCCONFIG is a dynamic register whose value at any given time is specific to a perf session. Having it exposed in sysfs means that sysfs
Right, we are not exposing the TRCCONFIGR for the *current session* (irrespective of what is running). This is a way for the perf tool to compute the TRCCONFIGR for a given event. The tool will consult the exposed files to construct the TRCCONFIGR (see the code below, for example). None of the values exposed are complete TRCCONFIGR values anyway.
now exposes data about any number of perf sessions, just so that each instance of "perf record" can read data about its own session - and hopefully read the right data at the right time. It might work but it's not very clean.
Option 2: Trace Config exposed as the "bit" positions.

$ ls trace_config/
12  14  28  29

$ for f in $(find . -type f)
do
	echo $(basename $f) : $(cat $f)
done
28 : 0x800
14 : 0x40
12 : 0x10
29 : 0x1000

Advantage: The perf tool can iterate over the event->attr.config bits and do something like:

trconfigr = 0
for_each_set_bit(i, event->attr.config)
	trconfigr |= sysfs_read_trace_config(i)

where:

sysfs_read_trace_config(int i)
{
	char path[MAX_LEN];
	FILE *fp;
	unsigned long trcfg = 0;

	sprintf(path, "%s/%d", $TRACE_CONFIG_PATH, i);
	fp = fopen(path, "r");
	if (fp)
		fscanf(fp, "%lx", &trcfg);
	return trcfg;
}
That is how this is going to be consumed. Again, the values exposed are not the ETM current configuration.
Suzuki
On 09/16/2020 11:16 AM, Suzuki K Poulose wrote:
On 09/16/2020 10:58 AM, Al Grant wrote:
Option 1: Trace Config exposed matching the format name:
$ pwd /sys/bus/event_source/devices/cs_etm
Please see ^^^
$ ls trace_config/ contextid cycacc retstack timestamp cs_etm# for f in $(find trace_config -type f) do echo $(basename $f) : $(cat $f) done
timestamp : 0x800 contextid : 0x40 cycacc : 0x10 retstack : 0x1000
Disadvantages: It is really tricky for the perf tool to figure out the name of the "config" option from the event->attr.config bits, as the parsing of the format is not exported elsewhere (or I didn't look deep enough).
I don't like this. TRCCONFIG is a dynamic register whose value at any given time is specific to a perf session. Having it exposed in sysfs means that sysfs
Right, we are not exposing the TRCCONFIG for the *current session* (irrespective of what is running). This is a way for the perf tool to compute the TRCCONFIGR for a given event. The tool will consult the exposed files to construct the TRCCONFIGR (see the code below for e.g,). None of the values exposed are complete TRCCONFIGR values anyways.
Please note that these new files are *not exposed* per ETM. Rather this is exposed via the cs_etm PMU node under sysfs (there is only *one* on the system). Also these values are only specific to the perf mode, where we *map a single* attribute config to the *TRCCONFIGR* value.
now exposes data about any number of perf sessions, just so that each instance of "perf record" can read data about its own session - and hopefully read the right data at the right time. It might work but it's not very clean.
Option 2: Trace Config exposed as the "bit" positions.
$ ls trace_config/
12 14 28 29

$ for f in $(find . -type f)
  do
	echo $(basename $f) : $(cat $f)
  done
28 : 0x800
14 : 0x40
12 : 0x10
29 : 0x1000
Advantage: The perf tool can iterate over the event->attr.config bits and do something like:
	trconfigr = 0
	for_each_set_bit(i, event->attr.config)
		trconfigr |= sysfs_read_trace_config(i)

where:

	unsigned long sysfs_read_trace_config(int i)
	{
		char path[MAX_LEN];
		FILE *fp;
		unsigned long trcfg = 0;

		/* One file per config bit, e.g. $TRACE_CONFIG_PATH/14 */
		snprintf(path, sizeof(path), "%s/%d", TRACE_CONFIG_PATH, i);
		fp = fopen(path, "r");
		if (fp) {
			fscanf(fp, "%lx", &trcfg);
			fclose(fp);
		}
		return trcfg;
	}
That is how this is going to be consumed. Again, the values exposed are not the ETM current configuration.
Suzuki
Suzuki
On Wed, Sep 16, 2020 at 09:58:52AM +0000, Al Grant wrote:
[...]
Thoughts ?
Another solution would be a specific immutable sysfs entry for the ETM, for the sole purpose of saying "I'm running at EL2 and will put the process id in VMID". Then "perf record" can read that. The question is then how it should put that in perf.data. It could either use TRCCONFIGR (which it's already faking) or TRCIDR2 as you suggest, but a more direct solution would be to use one of the spare header flags in PERF_RECORD_AUXDATA.
How about adding a new node (e.g. "kern_el") under sysfs?
/sys/bus/coresight/devices/etmX/mgmt/kern_el
The perf tool can add a new metadata entry for "kern_el" when recording. When decoding the CoreSight trace data, we can first read the kernel exception level from the metadata and then, based on this, decide whether to take the PID value from "context.context_id" or "context.vmid".
This approach seems to me more direct than the other options.
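To make the decode-side decision concrete, here is a rough sketch of what the selection could look like. The struct below is a hypothetical stand-in for the context part of a PE_CONTEXT element and kern_el is the proposed (not yet existing) metadata value, so treat all names as assumptions rather than actual perf or OpenCSD code.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the context part of a decoded PE_CONTEXT
 * element; the field names are illustrative, not a real library type.
 */
struct pe_context_sketch {
	uint32_t context_id;	/* traced CONTEXTIDR_EL1 value */
	uint32_t vmid;		/* traced VMID, or CONTEXTIDR_EL2 on an EL2 kernel */
	int ctxt_id_valid;	/* non-zero if context_id was present in the trace */
	int vmid_valid;		/* non-zero if vmid was present in the trace */
};

/*
 * Pick the thread id according to the kernel exception level recorded
 * in the metadata (1 or 2). Returns -1 when the needed field is absent.
 */
static int64_t pid_from_context(const struct pe_context_sketch *ctx, int kern_el)
{
	if (kern_el == 2)
		return ctx->vmid_valid ? (int64_t)ctx->vmid : -1;
	return ctx->ctxt_id_valid ? (int64_t)ctx->context_id : -1;
}
```

With something like this, the decoder never has to guess from the trace alone; the only input beyond the element itself is the one value read from the metadata.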
Would it be possible to reserve a few flags in the capabilities area of the main mmap buffer, to indicate "flags that tell you about the AUX data"? That would be another way for the kernel to tell userspace what it's doing.
I think Al suggests to change the data structure perf_event_mmap_page like below:
struct perf_event_mmap_page {
	union {
		__u64	capabilities;
		struct {
			__u64	cap_bit0		: 1, /* Always 0, deprecated, see commit 860f085b74e9 */
				cap_bit0_is_deprecated	: 1, /* Always 1, signals that bit 0 is zero */

				cap_user_rdpmc		: 1, /* The RDPMC instruction can be used to read counts */
				cap_user_time		: 1, /* The time_{shift,mult,offset} fields are used */
				cap_user_time_zero	: 1, /* The time_zero field is used */
				cap_user_time_short	: 1, /* the time_{cycle,mask} fields are used */

				cap_vhe			: 1, /* System supports VHE or not */
				cap_____res		: 57;
		};
	};
IMHO, I have a concern about this, since the capability 'cap_vhe' is specific to the Arm64 arch and cannot be commonly applied to other archs.
Thanks, Leo
-----Original Message----- From: Leo Yan leo.yan@linaro.org Sent: 18 September 2020 11:52 To: Al Grant Al.Grant@arm.com; Mathieu Poirier mathieu.poirier@linaro.org Cc: Suzuki Poulose Suzuki.Poulose@arm.com; mike.leach@linaro.org; coresight@lists.linaro.org Subject: Re: Context id tracing with kernel at EL2
I was thinking more that the mmap header would have something like
cap_eventflag1:1 cap_eventflag2:1
allocated for event-specific flags. Then those flags can be used for meanings specific to event. The same goes for flags in the perf record headers - if a handful of flags are reserved for event-specific use, their meanings can be different for different event types.
Al
Thanks, Leo
On Fri, Sep 18, 2020 at 10:29:01PM +0000, Al Grant wrote:
[...]
This approach might be difficult to extend, especially if one perf session enables multiple events (and the number of events is very flexible), so it leads to an unlimited number of event flags.
Another potential issue is how to set these event flags in the kernel. One way is to use a central place to set them, such as the function arch_perf_update_userpage(), but this means the function needs to know every event's ID and its associated event flags. Another way is to spread the event flag setting across the individual PMU drivers, which brings a requirement on program flow: the PMU driver must set the event flags before the perf tool reads the mmap page.
Thanks Leo
On 09/21/2020 04:36 AM, Leo Yan wrote:
On Fri, Sep 18, 2020 at 10:29:01PM +0000, Al Grant wrote:
[...]
I don't prefer this approach either. Moreover, this particular change is static for a booted kernel and has nothing to do with a single event configuration.
Another option might be to flip the config bit for contextid based on the kernel's EL: use a new flag for EL2 kernels and the existing one for EL1 kernels.
root@kernel-at-EL1:~# cat /sys/bus/event_source/devices/cs_etm/format/contextid
config:14

root@kernel-at-EL2:~# cat /sys/bus/event_source/devices/cs_etm/format/contextid
config:15

#define ETM_OPT_CONTEXTID		14
#define ETM_OPT_CONTEXTID_IN_VMID	15
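As an illustration of how a tool could consume this, a minimal sketch follows. It assumes the format file content is exactly "config:<bit>" as shown above; the helper names are made up for this example and are not existing perf functions.

```c
#include <stdio.h>

#define ETM_OPT_CONTEXTID		14
#define ETM_OPT_CONTEXTID_IN_VMID	15

/*
 * Parse the text of a format file such as "config:14".
 * Returns the bit number, or -1 if the text has another shape
 * (for example "config2:0-31").
 */
static int parse_format_bit(const char *text)
{
	int bit;

	if (sscanf(text, "config:%d", &bit) != 1)
		return -1;
	return bit;
}

/* Non-zero when the contextid option maps to the "PID in VMID" bit. */
static int pid_is_in_vmid(const char *format_text)
{
	return parse_format_bit(format_text) == ETM_OPT_CONTEXTID_IN_VMID;
}
```

Since attr.config is already stored in perf.data, the decoder could check the same bit there without needing a new metadata field.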
Suzuki
Good day Suzuki,
On Wed, 16 Sep 2020 at 03:15, Suzuki K Poulose suzuki.poulose@arm.com wrote:
Hi all,
Resurrecting an old thread. See more below. Short summary:
I would rather address ongoing "big ticket" items such as Mike's complex configuration patchset and your work on ETMv4.4 system instructions before looking into another thorny issue. Needless to say that Tingwei's modularisation has to be dealt with before anything else but that one is pretty much done now.
Thanks, Mathieu
With the kernel running at EL2, the PID is in CONTEXTIDR_EL2 and the normal CONTEXTID tracing (which uses CONTEXTIDR_EL1) is not useful. The solution is to use VMID (with CONTEXTIDR_EL2 emitted as VMID) for the PID on an EL2 kernel.
This needs to be communicated to the perf tool somehow, so that it can decode the trace appropriately. One of the proposals was to expose the TRCCONFIGR bits for each of the individual event->attr.config options. I have hacked up a patch [0] to do this and here are two ways to expose this:
Option 1: Trace Config exposed matching the format name:
$ pwd
/sys/bus/event_source/devices/cs_etm

$ ls trace_config/
contextid cycacc retstack timestamp

cs_etm# for f in $(find trace_config -type f)
  do
	echo $(basename $f) : $(cat $f)
  done
timestamp : 0x800
contextid : 0x40
cycacc : 0x10
retstack : 0x1000
Disadvantages: It is really tricky for the perf tool to figure out the name of the "config" option from the event->attr.config bits, as the parsing of the format is not exported elsewhere (or I didn't look deep enough).
Option 2: Trace Config exposed as the "bit" positions.
$ ls trace_config/
12 14 28 29

$ for f in $(find . -type f)
  do
	echo $(basename $f) : $(cat $f)
  done
28 : 0x800
14 : 0x40
12 : 0x10
29 : 0x1000
Advantage: The perf tool can iterate over the event->attr.config bits and do something like:
	trconfigr = 0
	for_each_set_bit(i, event->attr.config)
		trconfigr |= sysfs_read_trace_config(i)

where:

	unsigned long sysfs_read_trace_config(int i)
	{
		char path[MAX_LEN];
		FILE *fp;
		unsigned long trcfg = 0;

		/* One file per config bit, e.g. $TRACE_CONFIG_PATH/14 */
		snprintf(path, sizeof(path), "%s/%d", TRACE_CONFIG_PATH, i);
		fp = fopen(path, "r");
		if (fp) {
			fscanf(fp, "%lx", &trcfg);
			fclose(fp);
		}
		return trcfg;
	}
Disadvantage: It is not human reader friendly. But it doesn't have to be, as it is for the tools to construct the trcconfigr.
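For reference, a self-contained userspace version of the Option 2 lookup could look roughly like the sketch below. It assumes one file per config bit, named by its decimal bit number and holding a single hex value, as in the listing above; the function names and the directory argument are illustrative only and not part of the patch.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Read one hex value from "<dir>/<bit>". Returns 0 if the file is
 * missing or unparsable, so an unknown config bit contributes nothing.
 */
static unsigned long read_trace_config_bit(const char *dir, int bit)
{
	char path[4096];
	unsigned long val = 0;
	FILE *fp;

	if (snprintf(path, sizeof(path), "%s/%d", dir, bit) >= (int)sizeof(path))
		return 0;
	fp = fopen(path, "r");
	if (!fp)
		return 0;
	if (fscanf(fp, "%lx", &val) != 1)
		val = 0;
	fclose(fp);
	return val;
}

/* OR together the TRCCONFIGR contribution of every bit set in attr_config. */
static unsigned long build_trcconfigr(const char *dir, uint64_t attr_config)
{
	unsigned long trcconfigr = 0;
	int bit;

	for (bit = 0; bit < 64; bit++)
		if (attr_config & (1ULL << bit))
			trcconfigr |= read_trace_config_bit(dir, bit);
	return trcconfigr;
}
```

Passing the directory in as a parameter, rather than hard-coding the sysfs path, keeps the helper easy to exercise against a scratch directory.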
Option 3:
Another completely different option is to fixup the TRCIDR2.CID = 0, while running in VHE, via sysfs. This will indicate that the CONTEXTID tracing is not supported by the ETM, but the VMID support should be visible. This can be used by the perf-tool to detect that the VMID is used as the PID.
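A sketch of the check the perf tool could make under Option 3 is below. The field positions are my reading of the ETMv4 TRCIDR2 layout (CIDSIZE in bits [9:5], VMIDSIZE in bits [14:10]) and should be verified against the architecture specification; the helper name is hypothetical.

```c
#include <stdint.h>

/* Assumed TRCIDR2 field positions; verify against the ETMv4 specification. */
#define TRCIDR2_CIDSIZE(v)	(((v) >> 5) & 0x1f)
#define TRCIDR2_VMIDSIZE(v)	(((v) >> 10) & 0x1f)

/*
 * With the proposed fixup, a VHE kernel would report a zero context id
 * size while still advertising VMID support. Treat that combination as
 * "the PID is traced in the VMID field".
 */
static int pid_is_traced_in_vmid(uint32_t trcidr2)
{
	return TRCIDR2_CIDSIZE(trcidr2) == 0 && TRCIDR2_VMIDSIZE(trcidr2) != 0;
}
```

One caveat with this scheme is that an ETM that genuinely lacks context id tracing but supports VMID would look the same to the tool.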
Thoughts ?
Suzuki
[0] ---- Expose trace config via sysfs
HACK: coresight: etm4x: Expose trace config via sysfs
Not-yet-signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 9d61a71da96f..58e86267fa37 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -35,6 +35,55 @@ PMU_FORMAT_ATTR(retstack,	"config:" __stringify(ETM_OPT_RETSTK));
 /* Sink ID - same for all ETMs */
 PMU_FORMAT_ATTR(sinkid,		"config2:0-31");
 
+static struct etm_trace_config_map *etm_cfg_map;
+
+#define ETM_TRCONFIGR_ATTR(_config)					\
+	&((struct dev_ext_attribute []) {				\
+		{							\
+			__ATTR(_config, S_IRUGO, etm_trace_config_show, NULL), \
+			(void *)_config					\
+		}							\
+	})[0].attr.attr
+
+static unsigned long etm_trace_config_map_cfg(unsigned long cfg)
+{
+	struct etm_trace_config_map *map = etm_cfg_map;
+
+	if (!map)
+		return 0;
+
+	for (; map->cfg; map++)
+		if (map->cfg == cfg)
+			return map->trcfg;
+	return 0;
+}
+
+static ssize_t etm_trace_config_show(struct device *dev,
+				     struct device_attribute *attr,
+				     char *buf)
+{
+	struct dev_ext_attribute *eattr;
+	unsigned long cfg;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+	cfg = (unsigned long)eattr->var;
+
+	return snprintf(buf, PAGE_SIZE, "0x%lx\n", etm_trace_config_map_cfg(cfg));
+}
+
+static struct attribute *etm_pmu_trace_config_attr[] = {
+	ETM_TRCONFIGR_ATTR(ETM_OPT_CYCACC),
+	ETM_TRCONFIGR_ATTR(ETM_OPT_CTXTID),
+	ETM_TRCONFIGR_ATTR(ETM_OPT_TS),
+	ETM_TRCONFIGR_ATTR(ETM_OPT_RETSTK),
+	NULL,
+};
+
+static const struct attribute_group etm_pmu_trace_config_group = {
+	.name = "trace_config",
+	.attrs = etm_pmu_trace_config_attr,
+};
+
 static struct attribute *etm_config_formats_attr[] = {
 	&format_attr_cycacc.attr,
 	&format_attr_contextid.attr,
@@ -61,6 +110,7 @@ static const struct attribute_group etm_pmu_sinks_group = {
 static const struct attribute_group *etm_pmu_attr_groups[] = {
 	&etm_pmu_format_group,
 	&etm_pmu_sinks_group,
+	&etm_pmu_trace_config_group,
 	NULL,
 };
@@ -600,6 +650,12 @@ void etm_perf_del_symlink_sink(struct coresight_device *csdev)
 	csdev->ea = NULL;
 }
 
+void etm_install_trace_config_map(struct etm_trace_config_map *cfg_map)
+{
+	if (!etm_cfg_map && cfg_map)
+		etm_cfg_map = cfg_map;
+}
+
 static int __init etm_perf_init(void)
 {
 	int ret;
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.h b/drivers/hwtracing/coresight/coresight-etm-perf.h
index 015213abe00a..01679988debb 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.h
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.h
@@ -57,6 +57,16 @@ struct etm_event_data {
 	struct list_head * __percpu *path;
 };
 
+/**
+ * struct etm_trace_config_map - Map perf config to TRCONFIGR bits for userspace
+ * @cfg:	event->attr.config value
+ * @trcfg:	TRCONFIGR setting for @cfg
+ */
+struct etm_trace_config_map {
+	unsigned long cfg;
+	unsigned long trcfg;
+};
+
 #ifdef CONFIG_CORESIGHT
 int etm_perf_symlink(struct coresight_device *csdev, bool link);
 int etm_perf_add_symlink_sink(struct coresight_device *csdev);
@@ -69,6 +79,8 @@ static inline void *etm_perf_sink_config(struct perf_output_handle *handle)
 		return data->snk_config;
 	return NULL;
 }
+
+void etm_install_trace_config_map(struct etm_trace_config_map *cfg_map);
 #else
 static inline int etm_perf_symlink(struct coresight_device *csdev, bool link)
 { return -EINVAL; }
@@ -80,6 +92,7 @@ static inline void *etm_perf_sink_config(struct perf_output_handle *handle)
 	return NULL;
 }
 
+static void etm_install_trace_config_map(struct etm_trace_config_map *cfg_map) {}
 #endif /* CONFIG_CORESIGHT */
 
 #endif
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c
index 00c9f0bb8b1a..e528afc9a56e 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x.c
@@ -1591,6 +1591,15 @@ static struct amba_driver etm4x_driver = {
 	.id_table	= etm4_ids,
 };
 
+#define ETM4x_TRACE_CONFIG_MAP_SIZE	5
+static struct etm_trace_config_map etm4x_map[ETM4x_TRACE_CONFIG_MAP_SIZE] = {
+	{ ETM_OPT_CTXTID, 0 },	/* filled in dynamically for ETM_OPT_CTXTID */
+	{ ETM_OPT_CYCACC, BIT(4) },
+	{ ETM_OPT_TS, BIT(11) },
+	{ ETM_OPT_RETSTK, BIT(12) },
+	{ 0, 0 },
+};
+
 static int __init etm4x_init(void)
 {
 	int ret;
@@ -1605,6 +1614,10 @@ static int __init etm4x_init(void)
 	if (ret) {
 		pr_err("Error registering etm4x driver\n");
 		etm4_pm_clear();
+	} else {
+		etm4x_map[0].trcfg = is_kernel_in_hyp_mode() ?
+				     (BIT(7) | BIT(15)) : BIT(6);
+		etm_install_trace_config_map(&etm4x_map[0]);
 	}
 
 	return ret;
On 05/27/2020 10:20 AM, Suzuki K Poulose wrote:
On 05/26/2020 05:33 PM, Mike Leach wrote:
Hi,
On Tue, 26 May 2020 at 17:21, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 05/25/2020 04:56 PM, Mathieu Poirier wrote:
On Fri, 22 May 2020 at 09:30, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 05/08/2020 04:43 PM, Mathieu Poirier wrote:
On Fri, 8 May 2020 at 00:11, Leo Yan <leo.yan@linaro.org
I'm joining this thread late and as such don't have all the context. But sysfs is never a good place to export run-time configuration registers, as there could be multiple perf sessions active at the same time. This isn't a big issue for N:1 topologies but it is very real for 1:1.
I don't see how this is affected by the sink topology, as this is an ETM configuration.
By sysfs, I mean, not necessarily under /sys/bus/coresight/.../etmN/
but, under :
/sys/bus/event_source/.../cs_etm/
We already expose various different information there. e.g, nr_addr_filters, sinks/
So we could expose something like :
trcconfigr/
and could describe how each of the individual event configs affect the TRCCONFIGR register. So that the perf tool can construct the TRCCONFIGR register from scratch just by looking at the trcconfigr directory.
e.g,
# cd /sys/bus/event_source/devices/cs_etm/
# for f in $(find format -type f); do echo $(basename $f) - $(cat $f); done
cycacc - config:12
sinkid - config2:0-31
contextid - config:14
retstack - config:29
timestamp - config:28

$ for f in $(find trcconfigr -type f); do echo $(basename $f) - $(cat $f); done
cycacc - 0x10		// Bit 4
timestamp - 0x800	// Bit 11
contextid - 0x8080	// Bit 15 | Bit 7 on EL2 kernels
	  - 0x40	// Bit 6 on EL1 kernels
retstack - 0x1000	// Bit 12
So those are configurations that apply to specific trace sessions. When dealing with concurrent trace sessions there is no way to guarantee the information exported to sysfs applies to the trace session that would read the information.
I think there is a bit of confusion here; The idea is the perf tool will construct the value of the traceconfigr based on the "options" selected by the user, just like it does now for "config1"/"config2"
i.e,
perf record -e cs_etm/cycacc,timestamp/ blah-blah
perf tool will :
map cycacc -> config1 |= BIT(12) and trcconfigr |= BIT($cat /sys/.../trcconfigr/cycacc)
and similarly for all the options. The values under the "trcconfigr" are static and only maps a given attribute to the traceconfig register bit.
Thus, any session can now create the TRCCONFIGR accurately based on the events, by dynamically looking it up under sysfs, rather than relying on a "compile time" static assumption about the ETM model.
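As a worked example of that mapping, the sketch below hard-codes the values listed in the trcconfigr/ example earlier in this mail instead of reading them from sysfs, purely to show the arithmetic. The table and function names are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Per-option TRCCONFIGR contribution, using the values from the example listing. */
struct trcconfigr_opt {
	const char *name;
	unsigned long el1;	/* value on an EL1 kernel */
	unsigned long el2;	/* value on an EL2 kernel */
};

static const struct trcconfigr_opt trcconfigr_opts[] = {
	{ "cycacc",    0x10,   0x10   },	/* bit 4 */
	{ "timestamp", 0x800,  0x800  },	/* bit 11 */
	{ "contextid", 0x40,   0x8080 },	/* bit 6, or bits 15 | 7 at EL2 */
	{ "retstack",  0x1000, 0x1000 },	/* bit 12 */
};

/* OR together the contributions of the named options; unknown names add nothing. */
static unsigned long trcconfigr_for(const char *const *names, size_t n,
				    int kernel_at_el2)
{
	unsigned long val = 0;
	size_t i, j;

	for (i = 0; i < n; i++)
		for (j = 0; j < sizeof(trcconfigr_opts) / sizeof(trcconfigr_opts[0]); j++)
			if (!strcmp(names[i], trcconfigr_opts[j].name))
				val |= kernel_at_el2 ? trcconfigr_opts[j].el2
						     : trcconfigr_opts[j].el1;
	return val;
}
```

So for "-e cs_etm/cycacc,timestamp/" the result is 0x10 | 0x800 = 0x810 on either kernel, and only the contextid option differs between EL1 (0x40) and EL2 (0x8080).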
So how does this work when I copy the captured data off my target and try to decode offline?
Just like it works now. There is no change in flow for the perf tool. The only difference is how the perf tool would emit the TRCCONFIGR values in the perf.data samples. Instead of assuming that perf_event.attr.config1.CID => TRCCONFIGR[bit6], it would use the sysfs exported values to figure out the TRCCONFIGR setting.
Suzuki
hi Mathieu,
On 09/17/2020 05:28 PM, Mathieu Poirier wrote:
Good day Suzuki,
On Wed, 16 Sep 2020 at 03:15, Suzuki K Poulose suzuki.poulose@arm.com wrote:
Hi all,
Resurrecting an old thread. See more below. Short summary:
I would rather address ongoing "big ticket" items such as Mike's complex configuration patchset and your work on ETMv4.4 system instructions before looking into another thorny issue. Needless to say that Tingwei's modularisation has to be dealt with before anything else but that one is pretty much done now.
I understand and agree that there are plenty of things for review. And apologies for missing Mike's series (I missed it in my vacation email piles). I will start looking at it next week.
One pressing reason for this is the broken PID tracing on the VHE systems that are out there (most of the server boards).
Btw, I don't mind if we prioritise this over ETMv4.4 (we don't have many systems out there and we cannot reliably test unless someone tests it on their hardware).
I am fine either way.
Suzuki
Thanks, Mathieu
Good morning,
On Fri, Sep 18, 2020 at 10:32:05AM +0100, Suzuki K Poulose wrote:
hi Mathieu,
On 09/17/2020 05:28 PM, Mathieu Poirier wrote:
Good day Suzuki,
On Wed, 16 Sep 2020 at 03:15, Suzuki K Poulose suzuki.poulose@arm.com wrote:
Hi all,
Resurrecting an old thread. See more below. Short summary:
I would rather address ongoing "big ticket" items such as Mike's complex configuration patchset and your work on ETMv4.4 system instructions before looking into another thorny issue. Needless to say that Tingwei's modularisation has to be dealt with before anything else but that one is pretty much done now.
I understand and agree that there are plenty of things for review. And apologies for missing Mike's series (I missed it in my vacation email piles). I will start looking at it next week.
Right, doing that is at the very top of my priority list for the week.
One pressing reason for this is the broken pid tracing on the VHE systems that are out there (most of the server boards).
Btw, I don't mind if we prioritise this over ETMv4.4 (we don't have many systems out there and we cannot reliably test unless someone tests it on their hardware).
It would certainly be nice to prioritise features that have an immediate need. I just looked through Mike's comments on your system instruction patchset and there seem to be a lot of positive points. I also noticed some areas to improve (as is expected with a patchset this big) but can't tell how much work is involved in addressing them since I have not had the opportunity to look at the set.
I am fine either way.
Same here - I need to ramp up on both topics (system instructions and contextID at EL2) anyway.
Priority #1 is to look at complex configuration. When that's done we can start the conversation on contextID.
Thanks, Mathieu
Suzuki
Thanks, Mathieu
With the kernel running at EL2, the PID is in CONTEXTIDR_EL2 and the normal CONTEXTID tracing (which uses CONTEXTIDR_EL1) is not useful. The solution is to use VMID (with CONTEXTIDR_EL2 emitted as VMID) for the PID on an EL2 kernel.
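As a rough illustration of what the decoder side has to do, here is a minimal sketch in C that picks the PID source from the TRCCONFIGR value in the metadata. The bit positions are the ones quoted in this thread (bit 6 on EL1 kernels, bit 15 | bit 7 on EL2 kernels); the macro and function names are made up for this example and are not perf or OpenCSD API.

```c
#include <stdint.h>

/* Bit positions as quoted in this thread; the names are illustrative. */
#define TRCCONFIGR_CID		(1u << 6)	/* CONTEXTID tracing */
#define TRCCONFIGR_VMID		(1u << 7)	/* VMID tracing */
#define TRCCONFIGR_VMIDOPT	(1u << 15)	/* VMID field carries CONTEXTIDR_EL2 */

/*
 * Return the thread id carried by a context element, or -1 if the
 * session's TRCCONFIGR enabled neither source.
 */
static int64_t tid_from_pe_context(uint32_t trcconfigr,
				   uint32_t context_id, uint32_t vmid)
{
	const uint32_t el2_bits = TRCCONFIGR_VMID | TRCCONFIGR_VMIDOPT;

	/* EL2 kernel: the PID was emitted in the VMID field. */
	if ((trcconfigr & el2_bits) == el2_bits)
		return vmid;

	/* EL1 kernel: the PID is the regular context ID. */
	if (trcconfigr & TRCCONFIGR_CID)
		return context_id;

	return -1;
}
```

This only works if the TRCCONFIGR stored in the metadata matches what the kernel actually programmed.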
This needs to be communicated to the perf tool somehow, so that it can decode the trace appropriately. One of the proposals was to expose the TRCCONFIGR bits for each of the individual event->attr.config options. I have hacked up a patch [0] to do this, and here are two ways to expose this:
Option 1: Trace Config exposed matching the format name:
$ pwd
/sys/bus/event_source/devices/cs_etm

$ ls trace_config/
contextid cycacc retstack timestamp

cs_etm# for f in $(find trace_config -type f)
do
	echo $(basename $f) : $(cat $f)
done

timestamp : 0x800
contextid : 0x40
cycacc : 0x10
retstack : 0x1000
Disadvantage: It is really tricky for the perf tool to figure out the name of the "config" option from the event->attr.config bits, as the parsed format is not exported elsewhere (or I didn't look deep enough).
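To give an idea of the extra work Option 1 implies, here is a minimal sketch in C of parsing one format string, such as "config:14" or "config2:0-31", back into a config word and bit range. It is a hypothetical helper, not the perf tool's own parser, and it assumes a single bit or a single lo-hi range per entry.

```c
#include <stdio.h>
#include <string.h>

/* Parsed form of one "format" entry; names are illustrative only. */
struct format_field {
	int config_idx;	/* 0 = config, 1 = config1, 2 = config2 */
	int lo;		/* lowest bit of the field */
	int hi;		/* highest bit of the field */
};

/* Parse "configN:bit" or "configN:lo-hi". Returns 0 on success, -1 on error. */
static int parse_format(const char *s, struct format_field *f)
{
	const char *colon = strchr(s, ':');
	size_t len;
	int n;

	if (!colon)
		return -1;

	len = (size_t)(colon - s);
	if (len == 6 && !strncmp(s, "config", 6))
		f->config_idx = 0;
	else if (len == 7 && !strncmp(s, "config1", 7))
		f->config_idx = 1;
	else if (len == 7 && !strncmp(s, "config2", 7))
		f->config_idx = 2;
	else
		return -1;

	/* sscanf stops after the first number when there is no "-hi" part. */
	n = sscanf(colon + 1, "%d-%d", &f->lo, &f->hi);
	if (n == 1)
		f->hi = f->lo;
	else if (n != 2)
		return -1;

	return 0;
}
```

The tool would have to run this over every file under format/ to learn that, say, bit 14 of config is "contextid" before it could look up trace_config/contextid.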
Option 2: Trace Config exposed as the "bit" positions.
$ ls trace_config/
12 14 28 29

$ for f in $(find . -type f)
do
	echo $(basename $f) : $(cat $f)
done

28 : 0x800
14 : 0x40
12 : 0x10
29 : 0x1000
Advantage: The perf tool can iterate over the event->attr.config bits and do something like:
	trcconfigr = 0;
	for_each_set_bit(i, event->attr.config)
		trcconfigr |= sysfs_read_trace_config(i);

where:

	unsigned long sysfs_read_trace_config(int i)
	{
		char path[MAX_LEN];
		FILE *fp;
		unsigned long trcfg = 0;

		snprintf(path, sizeof(path), "%s/%d", TRACE_CONFIG_PATH, i);
		fp = fopen(path, "r");
		if (fp) {
			/* A missing or unparsable file contributes no bits. */
			if (fscanf(fp, "%lx", &trcfg) != 1)
				trcfg = 0;
			fclose(fp);
		}
		return trcfg;
	}
Disadvantage: It is not human reader friendly. But it doesn't have to be, as it is for the tools to construct the trcconfigr.
Option 3:
Another completely different option is to fix up TRCIDR2.CIDSIZE = 0 via sysfs while running in VHE. This would indicate that CONTEXTID tracing is not supported by the ETM, while the VMID support remains visible. The perf tool can use this to detect that the VMID is used as the PID.
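For Option 3, the check on the tool side could look like the sketch below. It assumes the ETMv4 layout of TRCIDR2, with CIDSIZE in bits [9:5] and VMIDSIZE in bits [14:10]; please check those positions against the architecture specification. The helper name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field positions assumed from the ETMv4 TRCIDR2 layout. */
#define TRCIDR2_CIDSIZE(v)	(((v) >> 5) & 0x1f)
#define TRCIDR2_VMIDSIZE(v)	(((v) >> 10) & 0x1f)

/*
 * With the fixup, a VHE kernel would report no context ID support but
 * still report VMID support, so the PID must be in the VMID field.
 */
static bool pid_is_in_vmid(uint32_t trcidr2)
{
	return TRCIDR2_CIDSIZE(trcidr2) == 0 &&
	       TRCIDR2_VMIDSIZE(trcidr2) != 0;
}
```

One design consequence is that this overloads an ID register that otherwise describes the hardware, so a tool reading it can no longer tell "no context ID support" from "kernel at EL2".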
Thoughts ?
Suzuki
[0] ---- Expose trace config via sysfs
HACK: coresight: etm4x: Expose trace config via sysfs
Not-yet-signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
index 9d61a71da96f..58e86267fa37 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.c
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
@@ -35,6 +35,55 @@ PMU_FORMAT_ATTR(retstack, "config:" __stringify(ETM_OPT_RETSTK));
 /* Sink ID - same for all ETMs */
 PMU_FORMAT_ATTR(sinkid, "config2:0-31");
 
+static struct etm_trace_config_map *etm_cfg_map;
+
+#define ETM_TRCONFIGR_ATTR(_config)					\
+	&((struct dev_ext_attribute []) {				\
+		{							\
+			__ATTR(_config, S_IRUGO, etm_trace_config_show, NULL), \
+			(void *)_config					\
+		}							\
+	})[0].attr.attr
+
+static unsigned long etm_trace_config_map_cfg(unsigned long cfg)
+{
+	struct etm_trace_config_map *map = etm_cfg_map;
+
+	if (!map)
+		return 0;
+
+	for (; map->cfg; map++)
+		if (map->cfg == cfg)
+			return map->trcfg;
+	return 0;
+}
+
+static ssize_t etm_trace_config_show(struct device *dev,
+				     struct device_attribute *attr,
+				     char *buf)
+{
+	struct dev_ext_attribute *eattr;
+	unsigned long cfg;
+
+	eattr = container_of(attr, struct dev_ext_attribute, attr);
+	cfg = (unsigned long)eattr->var;
+	return snprintf(buf, PAGE_SIZE, "0x%lx\n", etm_trace_config_map_cfg(cfg));
+}
+
+static struct attribute *etm_pmu_trace_config_attr[] = {
+	ETM_TRCONFIGR_ATTR(ETM_OPT_CYCACC),
+	ETM_TRCONFIGR_ATTR(ETM_OPT_CTXTID),
+	ETM_TRCONFIGR_ATTR(ETM_OPT_TS),
+	ETM_TRCONFIGR_ATTR(ETM_OPT_RETSTK),
+	NULL,
+};
+
+static const struct attribute_group etm_pmu_trace_config_group = {
+	.name = "trace_config",
+	.attrs = etm_pmu_trace_config_attr,
+};
+
 static struct attribute *etm_config_formats_attr[] = {
 	&format_attr_cycacc.attr,
 	&format_attr_contextid.attr,
@@ -61,6 +110,7 @@ static const struct attribute_group etm_pmu_sinks_group = {
 static const struct attribute_group *etm_pmu_attr_groups[] = {
 	&etm_pmu_format_group,
 	&etm_pmu_sinks_group,
+	&etm_pmu_trace_config_group,
 	NULL,
 };
 
@@ -600,6 +650,12 @@ void etm_perf_del_symlink_sink(struct coresight_device *csdev)
 	csdev->ea = NULL;
 }
 
+void etm_install_trace_config_map(struct etm_trace_config_map *cfg_map)
+{
+	if (!etm_cfg_map && cfg_map)
+		etm_cfg_map = cfg_map;
+}
+
 static int __init etm_perf_init(void)
 {
 	int ret;
diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.h b/drivers/hwtracing/coresight/coresight-etm-perf.h
index 015213abe00a..01679988debb 100644
--- a/drivers/hwtracing/coresight/coresight-etm-perf.h
+++ b/drivers/hwtracing/coresight/coresight-etm-perf.h
@@ -57,6 +57,16 @@ struct etm_event_data {
 	struct list_head * __percpu *path;
 };
 
+/**
+ * struct etm_trace_config_map - Map perf config to TRCONFIGR bits for userspace
+ * @cfg:	event->attr.config value
+ * @trcfg:	TRCONFIGR setting for @cfg
+ */
+struct etm_trace_config_map {
+	unsigned long cfg;
+	unsigned long trcfg;
+};
+
 #ifdef CONFIG_CORESIGHT
 int etm_perf_symlink(struct coresight_device *csdev, bool link);
 int etm_perf_add_symlink_sink(struct coresight_device *csdev);
@@ -69,6 +79,8 @@ static inline void *etm_perf_sink_config(struct perf_output_handle *handle)
 		return data->snk_config;
 	return NULL;
 }
+
+void etm_install_trace_config_map(struct etm_trace_config_map *cfg_map);
 #else
 static inline int etm_perf_symlink(struct coresight_device *csdev, bool link)
 { return -EINVAL; }
@@ -80,6 +92,7 @@ static inline void *etm_perf_sink_config(struct perf_output_handle *handle)
 	return NULL;
 }
 
+static void etm_install_trace_config_map(struct etm_trace_config_map *cfg_map) {}
 #endif /* CONFIG_CORESIGHT */
 
 #endif
diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c
index 00c9f0bb8b1a..e528afc9a56e 100644
--- a/drivers/hwtracing/coresight/coresight-etm4x.c
+++ b/drivers/hwtracing/coresight/coresight-etm4x.c
@@ -1591,6 +1591,15 @@ static struct amba_driver etm4x_driver = {
 	.id_table = etm4_ids,
 };
 
+#define ETM4x_TRACE_CONFIG_MAP_SIZE	5
+static struct etm_trace_config_map etm4x_map[ETM4x_TRACE_CONFIG_MAP_SIZE] = {
+	{ ETM_OPT_CTXTID, 0 }, /* filled in dynamically for ETM_OPT_CTXTID */
+	{ ETM_OPT_CYCACC, BIT(4) },
+	{ ETM_OPT_TS, BIT(11) },
+	{ ETM_OPT_RETSTK, BIT(12) },
+	{ 0, 0 },
+};
+
 static int __init etm4x_init(void)
 {
 	int ret;
@@ -1605,6 +1614,10 @@ static int __init etm4x_init(void)
 	if (ret) {
 		pr_err("Error registering etm4x driver\n");
 		etm4_pm_clear();
+	} else {
+		etm4x_map[0].trcfg = is_kernel_in_hyp_mode() ?
+					(BIT(7) | BIT(15)) : BIT(6);
+		etm_install_trace_config_map(&etm4x_map[0]);
 	}
 
 	return ret;
On 05/27/2020 10:20 AM, Suzuki K Poulose wrote:
On 05/26/2020 05:33 PM, Mike Leach wrote:
Hi,
On Tue, 26 May 2020 at 17:21, Suzuki K Poulose suzuki.poulose@arm.com wrote:
On 05/25/2020 04:56 PM, Mathieu Poirier wrote: > On Fri, 22 May 2020 at 09:30, Suzuki K Poulose suzuki.poulose@arm.com wrote: > > > > On 05/08/2020 04:43 PM, Mathieu Poirier wrote: > > > > > > > > > On Fri, 8 May 2020 at 00:11, Leo Yan <leo.yan@linaro.org