The first commit contains a fix for a recently introduced regression,
but was always a shortcoming in the Coresight code anyway.
The following commits are a tidyup in preparation for the last commit,
which is a fairly major change to the decode logic that's also
indirectly related to the regression so I thought it would be good time
to fix that now.
Applies to perf/core (9be6ab181b7b)
James Clark (7):
perf: cs-etm: Fix timeless decode mode detection
perf tools: Add util function for overriding user set config values
perf: cs-etm: Don't test full_auxtrace because it's always set
perf: cs-etm: Validate options after applying them
perf: cs-etm: Allow user to override timestamp and contextid settings
perf: cs-etm: Use bool type for boolean values
perf: cs-etm: Add separate decode paths for timeless and per-thread
modes
tools/perf/arch/arm/util/cs-etm.c | 223 +++++++++---------
tools/perf/arch/arm/util/pmu.c | 2 +
tools/perf/arch/arm64/util/arm-spe.c | 26 +-
tools/perf/arch/x86/util/intel-pt.c | 22 +-
tools/perf/tests/shell/test_arm_coresight.sh | 24 ++
.../perf/util/cs-etm-decoder/cs-etm-decoder.h | 8 +-
tools/perf/util/cs-etm.c | 200 +++++++++++-----
tools/perf/util/cs-etm.h | 6 +-
tools/perf/util/evsel.c | 29 +++
tools/perf/util/evsel.h | 3 +
10 files changed, 325 insertions(+), 218 deletions(-)
--
2.34.1
Changes since v4:
* Update commit message on patch 1
* Drop previous change that added lockdep checks to coresight_add_helper()
because they are already in mutex_lock(). But keep extra comment.
* Check for allocation failure in coresight_add_out_conn()
------------------
Changes since v3:
* Put connection loss fix at the beginning so that it can be backported
* Replace coresight_find_link_{x}() with coresight_find_out_connection()
* Reorder coresight_enable_source() arguments for consistency
* Add source and destination reference counts so that two link devices
connected together don't clash
* Add coresight_is_helper()
* Fix overwriting csdev bug in coresight_orphan_match()
* Don't clear conns[i]->dest_fwnode in coresight_remove_conns() in case
it's used again
* Use dev instead of adev->dev for devmem allocation in
acpi_coresight_parse_graph() so that it's consistent with DT mode and
doesn't cause a warning on free.
* Rename coresight_add_helper_mutex() -> coresight_add_helper()
* Ensure coresight_mutex isn't already held in coresight_add_helper()
* Return new connection from coresight_add_out_conn()
* Comment and formatting improvements
------------------
Changes since v2:
* Make out_conns array contiguous instead of sparse which simplifies
filling and using it. New connections are always added to the end
* Store pointers to individual connection objects so that they can be
shared between inputs and outputs
* Fix an existing bug where connection info was lost when unloading a
device
* Simplify connection fixup functions. Now the orphan mechanism is used
for inputs in the same way as outputs to guarantee that all
connections have both an input and an output set
* Use input connections to disconnect devices on unload instead of
iterating through them all
* Make refcount a property of the connection rather than use it's own
array based on the number of inputs and outputs
* Fix a bug in v2 where helpers attached to the source device weren't
disabled because coresight-etm-perf.c was making a raw call to
disable rather than using a helper.
* Change names of connection members to make direction explicit now
that the connection is shared between input and outputs
------------------
Changes since v1:
* Don't dereference handle in tmc_etr_get_buffer() when not in perf mode.
* Fix some W=1 warnings
* Add a commit to rename child/output in terms of local/remote
-------------------
Currently there is a refcount leak in CTI when using system wide mode
or tracing multithreaded applications. See the last commit for a
reproducer. This prevents the module from being unloaded.
Historically there have been a few issues and fixes attempted around
here which have resulted in some extra logic and a member to keep
track of CTI being enabled 'struct coresight_device->ect_enabled'.
The fix in commit 665c157e0204 ("coresight: cti: Fix hang in
cti_disable_hw()") was also related to CTI having its own
enable/disable path which came later than other devices.
If we make CTI a helper device and enable helper devices adjacent to
the path we get very similar enable/disable behavior to now, but with
more reuse of the existing reference counting logic in the coresight
core code. This also affects CATU which can have a little bit of
its hard coded enable/disable code removed.
Enabling CATU on the generic path does require that input connections
are tracked so that it can get its associated ETR buffer.
Applies to coresight/next (197b6b60ae7b) but also requires the
realloc_array patch here [1].
Also available in full here [2].
[1]: https://lore.kernel.org/linux-arm-kernel/20230306152723.3090195-1-james.cla…
[2]: https://gitlab.arm.com/linux-arm/linux-jc/-/tree/james-cs-cti-module-refcou…
James Clark (13):
coresight: Fix loss of connection info when a module is unloaded
coresight: Use enum type for cs_mode wherever possible
coresight: Change name of pdata->conns
coresight: Rename nr_outports to nr_outconns
coresight: Rename connection members to make the direction explicit
coresight: Dynamically add connections
coresight: Store pointers to connections rather than an array of them
coresight: Simplify connection fixup mechanism
coresight: Store in-connections as well as out-connections
coresight: Make refcount a property of the connection
coresight: Refactor out buffer allocation function for ETR
coresight: Enable and disable helper devices adjacent to the path
coresight: Fix CTI module refcount leak by making it a helper device
drivers/hwtracing/coresight/coresight-catu.c | 21 +-
drivers/hwtracing/coresight/coresight-core.c | 593 ++++++++++--------
.../hwtracing/coresight/coresight-cti-core.c | 52 +-
.../hwtracing/coresight/coresight-cti-sysfs.c | 4 +-
drivers/hwtracing/coresight/coresight-cti.h | 4 +-
drivers/hwtracing/coresight/coresight-etb10.c | 13 +-
.../hwtracing/coresight/coresight-etm-perf.c | 4 +-
.../coresight/coresight-etm3x-core.c | 6 +-
.../coresight/coresight-etm4x-core.c | 6 +-
.../hwtracing/coresight/coresight-funnel.c | 26 +-
.../hwtracing/coresight/coresight-platform.c | 269 ++++----
drivers/hwtracing/coresight/coresight-priv.h | 17 +-
.../coresight/coresight-replicator.c | 23 +-
drivers/hwtracing/coresight/coresight-stm.c | 6 +-
drivers/hwtracing/coresight/coresight-sysfs.c | 17 +-
.../hwtracing/coresight/coresight-tmc-etf.c | 26 +-
.../hwtracing/coresight/coresight-tmc-etr.c | 110 ++--
drivers/hwtracing/coresight/coresight-tmc.h | 2 +
drivers/hwtracing/coresight/coresight-tpda.c | 23 +-
drivers/hwtracing/coresight/coresight-tpdm.c | 4 +-
drivers/hwtracing/coresight/coresight-tpiu.c | 7 +-
drivers/hwtracing/coresight/coresight-trbe.c | 3 +-
drivers/hwtracing/coresight/ultrasoc-smb.c | 11 +-
drivers/hwtracing/coresight/ultrasoc-smb.h | 2 +-
include/linux/coresight.h | 127 ++--
25 files changed, 708 insertions(+), 668 deletions(-)
--
2.34.1
Em Mon, Apr 24, 2023 at 06:36:14PM +0300, Adrian Hunter escreveu:
> On 24/04/23 16:47, James Clark wrote:
> > There is some duplicated code to only override config values if they
> > haven't already been set by the user so make a util function for this.
> >
> > Signed-off-by: James Clark <james.clark(a)arm.com>
>
> One minor comment, nevertheless:
>
> Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
I just moved to evsel__set_config_if_unset() to util/pmu.c, next to
some other evsel__ functions to not break the python.so binding, before
I was getting:
[acme@quaco perf-tools-next]$ perf test -v python
Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
19: 'import perf' in python :
--- start ---
test child forked, pid 500086
python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf-tools-next/python'); import perf" | '/usr/bin/python3' "
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /tmp/build/perf-tools-next/python/perf.cpython-311-x86_64-linux-gnu.so: undefined symbol: perf_pmu__format_bits
test child finished with -1
---- end ----
'import perf' in python: FAILED!
[acme@quaco perf-tools-next]$
Please run 'perf test' and 'make -C tools/perf build-test' prior to
sending pull requests,
Thanks, applied.
- Arnaldo
On 24/04/2023 16:36, Adrian Hunter wrote:
> On 24/04/23 16:47, James Clark wrote:
>> There is some duplicated code to only override config values if they
>> haven't already been set by the user so make a util function for this.
>>
>> Signed-off-by: James Clark <james.clark(a)arm.com>
>
> One minor comment, nevertheless:
>
> Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
>
Thanks for the review
>> ---
>> tools/perf/arch/arm64/util/arm-spe.c | 26 ++-----------------------
>> tools/perf/arch/x86/util/intel-pt.c | 22 ++-------------------
>> tools/perf/util/evsel.c | 29 ++++++++++++++++++++++++++++
>> tools/perf/util/evsel.h | 3 +++
>> 4 files changed, 36 insertions(+), 44 deletions(-)
>>
[...]
>> }
>> }
>> +
>> +/*
>> + * Set @config_name to @val as long as the user hasn't already set or cleared it
>> + * by passing a config term on the command line.
>> + *
>> + * @val is the value to put into the bits specified by @config_name rather than
>> + * the bit pattern. It is shifted into position by this function, so to set
>> + * something to true, pass 1 for val rather than a pre shifted value.
>> + */
>> +#define field_prep(_mask, _val) (((_val) << (ffsll(_mask) - 1)) & (_mask))
>
> I notice there is already tools/include/linux/bitfield.h
> so may FIELD_PREP from there could be used?
I tried that first, but it seems like quite a lot of effort went into it
to make it work on const only values so it doesn't work here. There is
this [1] change to make a non const one but it doesn't look like it went
anywhere:
[1]:
https://patchwork.kernel.org/project/linux-omap/patch/3a54a6703879d10f08cf0…
Hi Dan
On 21/04/2023 11:42, Dan Carpenter wrote:
> This code generates a Smatch warning:
>
> drivers/hwtracing/coresight/coresight-tmc-etr.c:947 tmc_etr_buf_insert_barrier_packet()
> error: uninitialized symbol 'bufp'.
>
> The problem is that if tmc_sg_table_get_data() returns -EINVAL, then
> when we test if "len < CORESIGHT_BARRIER_PKT_SIZE", the negative "len"
> value is type promoted to a high unsigned long value which is greater
> than CORESIGHT_BARRIER_PKT_SIZE. Fix this bug by adding an explicit
> check for error codes.
>
> Fixes: 75f4e3619fe2 ("coresight: tmc-etr: Add transparent buffer management")
> Signed-off-by: Dan Carpenter <dan.carpenter(a)linaro.org>
> ---
> drivers/hwtracing/coresight/coresight-tmc-etr.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> index 918d461fcf4a..eaa296ced167 100644
> --- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
> +++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
> @@ -942,7 +942,7 @@ tmc_etr_buf_insert_barrier_packet(struct etr_buf *etr_buf, u64 offset)
>
> len = tmc_etr_buf_get_data(etr_buf, offset,
> CORESIGHT_BARRIER_PKT_SIZE, &bufp);
> - if (WARN_ON(len < CORESIGHT_BARRIER_PKT_SIZE))
> + if (WARN_ON(len < 0 || len < CORESIGHT_BARRIER_PKT_SIZE))
> return -EINVAL;
> coresight_insert_barrier_packet(bufp);
> return offset + CORESIGHT_BARRIER_PKT_SIZE;
Thanks for the fix, I will send this as fixes at 6.4-rc1, as I have
already sent the PULL request for 6.4
Suzuki
On 20/04/2023 13:37, Ganapatrao Kulkarni wrote:
>
>
> On 20-04-2023 06:00 pm, James Clark wrote:
>>
>>
>> On 20/04/2023 12:47, Ganapatrao Kulkarni wrote:
>>>
...
>>> My patch is rebased on 6.3-RC7 codebase with Mike's 3 perf patches
>>> related to dynamic id [1] support(queued for 6.4).
>>>
>>> "perf report -D" works for me.
>>
>> I was referring to sparse CPU lists, which I think you mentioned above
>> doesn't work even with this patch.
>>
>>>
>>> [1] https://www.spinics.net/lists/linux-perf-users/msg27452.html
>>>
>>
>> It should be based on the next branch here:
>> git://git.kernel.org/pub/scm/linux/kernel/git/coresight/linux.git
>
> OK.
It need not be. Since this patch is purely perf tools patch and has
nothing to do with the kernel drivers, it should be beased on whatever
the tip of the perf tool tree is. Otherwise we risk rebasing to that
eventually.
Cheers
Suzuki
On 20/04/2023 12:47, Ganapatrao Kulkarni wrote:
>
> Hi James,
>
> On 20-04-2023 03:13 pm, James Clark wrote:
>>
>>
>> On 19/04/2023 18:21, Ganapatrao Kulkarni wrote:
>>> The current implementation supports coresight trace for a range of
>>> CPUs, if the first CPU is CPU0.
>>>
>>> Adding changes to enable coresight trace for any range of CPUs by
>>> decoding the first CPU also from the header.
>>> Later, first CPU id is used instead of CPU0 across the decoder
>>> functions.
>>>
>>> Signed-off-by: Ganapatrao Kulkarni <gankulkarni(a)os.amperecomputing.com>
>>> ---
>>> .../perf/util/cs-etm-decoder/cs-etm-decoder.c | 4 +-
>>> .../perf/util/cs-etm-decoder/cs-etm-decoder.h | 3 +-
>>> tools/perf/util/cs-etm.c | 62 ++++++++++++-------
>>> 3 files changed, 42 insertions(+), 27 deletions(-)
>>>
>>> diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>>> b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>>> index 82a27ab90c8b..41ab299b643b 100644
>>> --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>>> +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
>>> @@ -724,7 +724,7 @@ cs_etm_decoder__create_etm_decoder(struct
>>> cs_etm_decoder_params *d_params,
>>> }
>>> struct cs_etm_decoder *
>>> -cs_etm_decoder__new(int decoders, struct cs_etm_decoder_params
>>> *d_params,
>>> +cs_etm_decoder__new(int first_decoder, int decoders, struct
>>> cs_etm_decoder_params *d_params,
>>> struct cs_etm_trace_params t_params[])
>>> {
>>> struct cs_etm_decoder *decoder;
>>> @@ -769,7 +769,7 @@ cs_etm_decoder__new(int decoders, struct
>>> cs_etm_decoder_params *d_params,
>>> /* init raw frame logging if required */
>>> cs_etm_decoder__init_raw_frame_logging(d_params, decoder);
>>> - for (i = 0; i < decoders; i++) {
>>> + for (i = first_decoder; i < decoders; i++) {
>>> ret = cs_etm_decoder__create_etm_decoder(d_params,
>>> &t_params[i],
>>> decoder);
>>> diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
>>> b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
>>> index 92a855fbe5b8..b06193fc75b4 100644
>>> --- a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
>>> +++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
>>> @@ -90,7 +90,8 @@ int cs_etm_decoder__process_data_block(struct
>>> cs_etm_decoder *decoder,
>>> size_t len, size_t *consumed);
>>> struct cs_etm_decoder *
>>> -cs_etm_decoder__new(int num_cpu,
>>> +cs_etm_decoder__new(int first_decoder,
>>> + int decoders,
>>> struct cs_etm_decoder_params *d_params,
>>> struct cs_etm_trace_params t_params[]);
>>> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
>>> index 94e2d02009eb..2619513ae088 100644
>>> --- a/tools/perf/util/cs-etm.c
>>> +++ b/tools/perf/util/cs-etm.c
>>> @@ -55,6 +55,8 @@ struct cs_etm_auxtrace {
>>> u8 has_virtual_ts; /* Virtual/Kernel timestamps in the trace. */
>>> int num_cpu;
>>> + int first_cpu;
>>> + int last_cpu;
>>> u64 latest_kernel_timestamp;
>>> u32 auxtrace_type;
>>> u64 branches_sample_type;
>>> @@ -638,14 +640,13 @@ static void cs_etm__set_trace_param_ete(struct
>>> cs_etm_trace_params *t_params,
>>> }
>>> static int cs_etm__init_trace_params(struct cs_etm_trace_params
>>> *t_params,
>>> - struct cs_etm_auxtrace *etm,
>>> - int decoders)
>>> + struct cs_etm_auxtrace *etm)
>>> {
>>> int i;
>>> u32 etmidr;
>>> u64 architecture;
>>> - for (i = 0; i < decoders; i++) {
>>> + for (i = etm->first_cpu; i < etm->last_cpu; i++) {
>>> architecture = etm->metadata[i][CS_ETM_MAGIC];
>>> switch (architecture) {
>>> @@ -817,7 +818,7 @@ static void cs_etm__free(struct perf_session
>>> *session)
>>> /* Then the RB tree itself */
>>> intlist__delete(traceid_list);
>>> - for (i = 0; i < aux->num_cpu; i++)
>>> + for (i = aux->first_cpu; i < aux->last_cpu; i++)
>>> zfree(&aux->metadata[i]);
>>> thread__zput(aux->unknown_thread);
>>> @@ -921,7 +922,8 @@ static struct cs_etm_queue
>>> *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm,
>>> * Each queue can only contain data from one CPU when
>>> unformatted, so only one decoder is
>>> * needed.
>>> */
>>> - int decoders = formatted ? etm->num_cpu : 1;
>>> + int first_decoder = formatted ? etm->first_cpu : 0;
>>> + int decoders = first_decoder + (formatted ? etm->num_cpu : 1);
>>> etmq = zalloc(sizeof(*etmq));
>>> if (!etmq)
>>> @@ -937,7 +939,7 @@ static struct cs_etm_queue
>>> *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm,
>>> if (!t_params)
>>> goto out_free;
>>> - if (cs_etm__init_trace_params(t_params, etm, decoders))
>>> + if (cs_etm__init_trace_params(t_params, etm))
>>> goto out_free;
>>> /* Set decoder parameters to decode trace packets */
>>> @@ -947,8 +949,7 @@ static struct cs_etm_queue
>>> *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm,
>>> formatted))
>>> goto out_free;
>>> - etmq->decoder = cs_etm_decoder__new(decoders, &d_params,
>>> - t_params);
>>> + etmq->decoder = cs_etm_decoder__new(first_decoder, decoders,
>>> &d_params, t_params);
>>> if (!etmq->decoder)
>>> goto out_free;
>>> @@ -2959,11 +2960,11 @@ static int cs_etm__queue_aux_records(struct
>>> perf_session *session)
>>> * Loop through the ETMs and complain if we find at least one where
>>> ts_source != 1 (virtual
>>> * timestamps).
>>> */
>>> -static bool cs_etm__has_virtual_ts(u64 **metadata, int num_cpu)
>>> +static bool cs_etm__has_virtual_ts(u64 **metadata, struct
>>> cs_etm_auxtrace *etm)
>>> {
>>> int j;
>>> - for (j = 0; j < num_cpu; j++) {
>>> + for (j = etm->first_cpu; j < etm->last_cpu; j++) {
>>> switch (metadata[j][CS_ETM_MAGIC]) {
>>> case __perf_cs_etmv4_magic:
>>> if (HAS_PARAM(j, ETMV4, TS_SOURCE) ||
>>> metadata[j][CS_ETMV4_TS_SOURCE] != 1)
>>> @@ -2982,13 +2983,14 @@ static bool cs_etm__has_virtual_ts(u64
>>> **metadata, int num_cpu)
>>> }
>>> /* map trace ids to correct metadata block, from information in
>>> metadata */
>>> -static int cs_etm__map_trace_ids_metadata(int num_cpu, u64 **metadata)
>>> +static int cs_etm__map_trace_ids_metadata(struct cs_etm_auxtrace *etm)
>>> {
>>> u64 cs_etm_magic;
>>> + u64 **metadata = etm->metadata;
>>> u8 trace_chan_id;
>>> int i, err;
>>> - for (i = 0; i < num_cpu; i++) {
>>> + for (i = etm->first_cpu; i < etm->last_cpu; i++) {
>>> cs_etm_magic = metadata[i][CS_ETM_MAGIC];
>>> switch (cs_etm_magic) {
>>> case __perf_cs_etmv3_magic:
>>> @@ -3015,12 +3017,13 @@ static int cs_etm__map_trace_ids_metadata(int
>>> num_cpu, u64 **metadata)
>>> * If we found AUX_HW_ID packets, then set any metadata marked as
>>> unused to the
>>> * unused value to reduce the number of unneeded decoders created.
>>> */
>>> -static int cs_etm__clear_unused_trace_ids_metadata(int num_cpu, u64
>>> **metadata)
>>> +static int cs_etm__clear_unused_trace_ids_metadata(struct
>>> cs_etm_auxtrace *etm)
>>> {
>>> u64 cs_etm_magic;
>>> + u64 **metadata = etm->metadata;
>>> int i;
>>> - for (i = 0; i < num_cpu; i++) {
>>> + for (i = etm->first_cpu; i < etm->last_cpu; i++) {
>>> cs_etm_magic = metadata[i][CS_ETM_MAGIC];
>>> switch (cs_etm_magic) {
>>> case __perf_cs_etmv3_magic:
>>> @@ -3049,7 +3052,7 @@ int cs_etm__process_auxtrace_info_full(union
>>> perf_event *event,
>>> int event_header_size = sizeof(struct perf_event_header);
>>> int total_size = auxtrace_info->header.size;
>>> int priv_size = 0;
>>> - int num_cpu;
>>> + int num_cpu, first_cpu = 0, last_cpu;
>>> int err = 0;
>>> int aux_hw_id_found;
>>> int i, j;
>>> @@ -3068,22 +3071,31 @@ int cs_etm__process_auxtrace_info_full(union
>>> perf_event *event,
>>> /* First the global part */
>>> ptr = (u64 *) auxtrace_info->priv;
>>> num_cpu = ptr[CS_PMU_TYPE_CPUS] & 0xffffffff;
>>> - metadata = zalloc(sizeof(*metadata) * num_cpu);
>>> +
>>> + /* Start parsing after the common part of the header */
>>> + i = CS_HEADER_VERSION_MAX;
>>> +
>>> + /*Get CPU id of first event */
>>> + first_cpu = ptr[i + CS_ETM_CPU];
>>> + last_cpu = first_cpu + num_cpu;
>>> +
>>> + if (first_cpu > cpu__max_cpu().cpu ||
>>> + last_cpu > cpu__max_cpu().cpu)
>>> + return -EINVAL;
>>> +
>>> + metadata = zalloc(sizeof(*metadata) * last_cpu);
>>
>> Hi Ganapatrao,
>>
>> I think I see what the problem is, but I'm wondering if a better fix
>> would be to further decouple the CPU ID from the index in the array.
>>
>> With your change it's not clear what happens with sparse recordings, for
>> example 'perf record -e cs_etm// -C 1,3,5'. And it seems like there is
>
> This patch fixes for any range of CPUs.
> Record with sparse list of CPUs will not work with current code still.
>
Is there a major issue that means sparse can't be done? I'm thinking it
would be best to fix both issues with one change while we understand
this part rather than a half fix that might have do be completely
re-understood and re-done later anyway. Unless there is some big blocker
but I can't see it?
>> some wastage in the zalloc here for example if only CPU 256 is traced
>> then we'd still make 256 decoders but 255 of them would be unused?
>>
>> I tried to test sparse recordings, but your change doesn't apply to the
>> latest coresight/next branch. I did notice that 'perf report -D' doesn't
>> work with them on coresight/next (it just quits), but I couldn't see if
>> that's fixed with your change.
>
> My patch is rebased on 6.3-RC7 codebase with Mike's 3 perf patches
> related to dynamic id [1] support(queued for 6.4).
>
> "perf report -D" works for me.
I was referring to sparse CPU lists, which I think you mentioned above
doesn't work even with this patch.
>
> [1] https://www.spinics.net/lists/linux-perf-users/msg27452.html
>
It should be based on the next branch here:
git://git.kernel.org/pub/scm/linux/kernel/git/coresight/linux.git
>>
>> Would a better fix not be to keep the metadata loops from 0-N and
>> instead save the CPU ID in cs_etm_decoder_params or the decoder. That
>> way it would support both sparse and not starting from 0 cases? I think
>
> Yep, I though this initially, it got complicated due to for more
> for-loops. I will try again and post V2.
>
I can't imagine it would need any extra for loops off the top of my
head. Just saving the CPU ID in a few structs and using it wherever it's
needed instead of the loop index. I imagine most of the loops would
actually stay the same rather than be changed like you have in V1.
>> the code would be better if it's worded like "i < recorded_cpus" rather
>> than "i < cpu" to make it clear that i isn't actually the CPU ID it's
>> just an index.
>
> Yes, makes sense to call it "recorded_cpus".
>
>>
>> Also a new test for this scenario would probably be a good idea.
>>
>> Thanks
>> James
>>
> Thanks,
> Ganapat