On Thu, May 21, 2026 at 08:16:27PM +0800, Yingchao Deng wrote:
[...]
> @@ -231,6 +254,8 @@ struct cti_trig_con *cti_allocate_trig_con(struct device *dev, int in_sigs,
> {
> struct cti_trig_con *tc = NULL;
> struct cti_trig_grp *in = NULL, *out = NULL;
> + struct cti_drvdata *drvdata = dev_get_drvdata(dev);
> + int n_trigs = drvdata->config.nr_trig_max;
I don't mind it allocates bitmask with nr_trig_max, but AI review
suggests that when in_sigs / out_sigs bigger than nr_trig_max, it might
access memory out-of-boundary (see cti_plat_read_trig_group()).
It is good to add a check:
if (in_sigs > n_trigs || out_sigs > n_trigs) {
dev_err(dev, "trigger signal is out of range: in=%d out=%d nr_max=%d\n",
in_sigs, out_sigs, n_trigs\n");
return NULL;
}
With this:
Reviewed-by: Leo Yan <leo.yan(a)arm.com>
BTW, I have given my review tag on v8, please remember to update
patches with review / ack tags.
On Thu, May 21, 2026 at 08:16:29PM +0800, Yingchao Deng wrote:
[...]
> Qualcomm implements an extended variant of the ARM CoreSight CTI with a
> different register layout and vendor-specific behavior. While the
> programming model remains largely compatible, the register offsets differ
> from the standard ARM CTI and require explicit handling.
I cannot apply this patch successfuly. Please rebase on the latest
coresight-next branch.
> @@ -726,6 +734,22 @@ static int cti_probe(struct amba_device *adev, const struct amba_id *id)
>
> raw_spin_lock_init(&drvdata->spinlock);
>
> + devarch = readl_relaxed(drvdata->base + CORESIGHT_DEVARCH);
> + if (CTI_DEVARCH_ARCHITECT(devarch) == ARCHITECT_QCOM) {
> + drvdata->is_qcom_cti = true;
> + /*
> + * QCOM CTI does not implement Claimtag functionality as
> + * per CoreSight specification, but its CLAIMSET register
> + * is incorrectly initialized to 0xF. This can mislead
> + * tools or drivers into thinking the component is claimed.
> + *
> + * Reset CLAIMSET to 0 to reflect that no claims are active.
> + */
> + CS_UNLOCK(drvdata->base);
> + writel_relaxed(0, drvdata->base + CORESIGHT_CLAIMSET);
> + CS_LOCK(drvdata->base);
Sorry I missed this piece before.
Can you move this quirk into firmware? I don't think the CTI driver
should clear the external claim bit as this totally break the protocol
defined in PSCI. A clean way would clear the bits in firmware and then
CTI driver can use the CLAIM registers.
Or, another option is to create several helpers to bypass claim
operations for Qcom CTI:
static void cti_clear_self_claim_tag(cti_drvdata *drvdata,
struct csdev_access *csa)
{
if (drvdata->is_qcom_cti)
return;
coresight_clear_self_claim_tag(csa);
}
static int cti_claim_device(cti_drvdata *drvdata)
{
if (drvdata->is_qcom_cti)
return 0;
return coresight_claim_device(drvdata->csdev);
}
static void cti_unclaim_device_unlocked(cti_drvdata *drvdata)
{
if (drvdata->is_qcom_cti)
return;
return coresight_disclaim_device_unlocked(drvdata->csdev);
}
Otherwise, this patch is fine for me.
Thanks,
Leo
On Tue, May 19, 2026 at 04:48:04PM +0100, Yeoreum Yun wrote:
[...]
> @@ -571,11 +571,11 @@ static int etm4_enable_hw(struct etmv4_drvdata *drvdata)
> etm4x_relaxed_write32(csa, config->res_ctrl[i], TRCRSCTLRn(i));
>
> for (i = 0; i < caps->nr_ss_cmp; i++) {
> - /* always clear status bit on restart if using single-shot */
> + /* always clear status and pending bits on restart if using single-shot */
> if (config->ss_ctrl[i] || config->ss_pe_cmp[i])
> - config->ss_status[i] &= ~TRCSSCSRn_STATUS;
> + drvdata->ss_status[i] &= ~(TRCSSCSRn_STATUS | TRCSSCSRn_PENDING);
> etm4x_relaxed_write32(csa, config->ss_ctrl[i], TRCSSCCRn(i));
> - etm4x_relaxed_write32(csa, config->ss_status[i], TRCSSCSRn(i));
> + etm4x_relaxed_write32(csa, drvdata->ss_status[i], TRCSSCSRn(i));
> if (etm4x_sspcicrn_present(drvdata, i))
> etm4x_relaxed_write32(csa, config->ss_pe_cmp[i], TRCSSPCICRn(i));
> }
> @@ -772,6 +772,7 @@ static int etm4_parse_event_config(struct coresight_device *csdev,
> /* Clear configuration from previous run */
> memset(config, 0, sizeof(struct etmv4_config));
>
> +
Unexpected new line?
> @@ -1497,8 +1498,9 @@ static void etm4_init_arch_data(void *info)
> */
> caps->nr_ss_cmp = FIELD_GET(TRCIDR4_NUMSSCC_MASK, etmidr4);
> for (i = 0; i < caps->nr_ss_cmp; i++) {
> - drvdata->config.ss_status[i] =
> - etm4x_relaxed_read32(csa, TRCSSCSRn(i));
> + drvdata->ss_status[i] = etm4x_relaxed_read32(csa, TRCSSCSRn(i));
> + drvdata->ss_status[i] &= (TRCSSCSRn_PC | TRCSSCSRn_DV |
> + TRCSSCSRn_DA | TRCSSCSRn_INST);
Since etm4_enable_hw() clears the TRCSSCSRn_STATUS and
TRCSSCSRn_PENDING bits every time, here is no need to clear the
status bits during probe.
In the future, we may want to preserve the status within a session and
clear it only when starting a new session. Clearing the status bits here
still cannot handle stale status across multiple sessions, so we can
defer this improvement for later.
Thanks,
Leo
On Tue, May 19, 2026 at 04:48:02PM +0100, Yeoreum Yun wrote:
> According to IHI006H Embedded Trace Macrocell Architecture
> Specification [0], n could be 0-2 for TCRSEQEVR<n> when
> TCRIDR5.NUMSEQSTATE is 0b100.
>
> Therefore, introduce ETM_MAX_SEQ_TRANSITIONS macro and apply this
> in TCRSEQEVR<n> relevant fields.
>
> Link: https://developer.arm.com/documentation/ihi0064/latest/ [0]
> Suggestedby: Leo Yan <leo.yan(a)arm.com>
s/Suggestedby/Suggested-by
> Signed-off-by: Yeoreum Yun <yeoreum.yun(a)arm.com>
The change looks good to me:
Reviewed-by: Leo Yan <leo.yan(a)arm.com>
On 15/05/2026 3:11 am, Amir Ayupov wrote:
> In a system-wide `perf record -e cs_etm/.../u` capture on aarch64,
> synthesized samples emitted by `perf script --itrace=il64` are
> sometimes attributed to the WRONG sample.pid/tid (and to the wrong
> EL/cpumode) for the chunk of branches that straddle a context-switch
> boundary on a CPU. A branch actually retired by process A is emitted
> with sample.pid set to the thread that next ran on the same CPU.
>
> Mechanism:
> 1. ETM emits CONTEXTIDR/EL packets in-stream when the kernel updates
> CONTEXTIDR_EL1 on context switch / EL change. OpenCSD turns these
> into OCSD_GEN_TRC_ELEM_PE_CONTEXT elements interleaved with
> OCSD_GEN_TRC_ELEM_INSTR_RANGE elements for retired branch ranges.
> 2. cs_etm_decoder__buffer_range() queues each INSTR_RANGE into
> packet_queue->packet_buffer[]; packets carry start/end addrs,
> instr_count, last-instruction info, etc., but NO owner identity.
> 3. PE_CONTEXT goes through cs_etm_decoder__set_tid() ->
> cs_etm__set_thread(), which immediately mutates tidq->thread and
> tidq->el. Queued packets are not drained first; reset_timestamp()
> is called so the next TIMESTAMP triggers OCSD_RESP_WAIT and a
> drain.
> 4. By drain time in cs_etm__process_traceid_queue() ->
> cs_etm__sample(), sample.pid/tid is read from the now-mutated
> tidq->thread and sample.cpumode from the now-mutated tidq->el.
> Pre-context INSTR_RANGEs get the post-context owner.
>
> The same race affects branch samples via tidq->prev_packet_thread /
> tidq->prev_packet_el, captured at packet-swap time from
> tidq->thread / tidq->el (which may already have flipped).
>
> This is independent of PERF_RECORD_SWITCH_CPU_WIDE, which is
> deliberately not used to assign sample identity in this path. The
> bug applies to any cs_etm capture with in-stream CONTEXTIDR
> (PIDFMT_CTXTID or PIDFMT_CTXTID2).
>
> Effect on downstream tools: branches that should belong to the
> previous thread on the CPU get attributed to the next thread. When
> the two threads share a binary, leaked branches' VAs land in the
> wrong thread's mappings; samples whose IPs land in r-x mappings
> silently pollute that binary's profile, while samples landing in
> R-only/RW mappings show up as out-of-range / non-text samples.
> Either way, AutoFDO/BOLT profiles built from `perf script --itrace`
> output of system-wide cs_etm captures contain misattributed samples.
>
> Concrete example from `perf script --itrace=il64` of the same
> captured branch (same timestamp, same IP, same from/to addrs) before
> and after this fix:
>
> before: launcher_multia 2638146/2638146 705897.219172: \
> fffcda6b124c 0xfffcda641958/0xfffcda6b123c
> after: ws-tcf-sr-io13 2736581/2741587 705897.219172: \
> fffcda6b124c 0xfffcda641958/0xfffcda6b123c
>
> The branch was retired by ws-tcf-sr-io13 (tid 2741587) but, before
> the fix, was attributed to launcher_multia (the next thread to run on
> that CPU after the context switch). After the fix, it is correctly
> attributed to ws-tcf-sr-io13.
>
> Why not "drain on PE_CONTEXT then switch" (deferred-set_thread):
> tidq->thread has two consumers \u2014 sample emission needs the OUTGOING
> identity for queued packets, but cs_etm__mem_access() needs the
> CURRENT thread's maps to fetch instruction bytes for OpenCSD. The
> two needs are temporally inverted; a single tidq->thread cannot
> serve both. Keeping tidq->thread current and stamping owner identity
> per packet is the only design that decouples them cleanly.
>
> Fix: capture the owning pid/tid/EL on each buffered packet at
> cs_etm_decoder__buffer_packet() time (before any subsequent
> PE_CONTEXT can mutate tidq->thread / tidq->el), and read them at
> sample emission time.
>
> - struct cs_etm_packet gains pid_t pid, pid_t tid, int el (storing
> an ocsd_ex_level value; typed as int so the struct does not
> depend on OpenCSD headers, which are only included inside
> HAVE_CSTRACE_SUPPORT).
> - cs_etm__etmq_get_pid_tid_el() (formerly cs_etm__etmq_get_pid_tid)
> returns all three.
> - cs_etm__synth_instruction_sample() reads sample.pid / sample.tid
> from tidq->packet->{pid,tid} and derives sample.cpumode from
> tidq->packet->el.
> - cs_etm__synth_branch_sample() reads sample.pid / sample.tid /
> cpumode from tidq->prev_packet->{pid,tid,el}.
> - The separate prev_packet_thread / prev_packet_el bookkeeping in
> cs_etm__packet_swap() / cs_etm__init_traceid_queue() /
> cs_etm__free_traceid_queues() is removed; the per-packet stamp
> on prev_packet now carries that information.
>
> Cost: 12 bytes added to struct cs_etm_packet (~12-16 KB per
> packet_queue with CS_ETM_PACKET_MAX_BUFFER=1024), 16 bytes saved per
> cs_etm_traceid_queue (one struct thread * + one ocsd_ex_level).
>
> A residual gap: cs_etm__copy_insn() reads sample.insn bytes via
> cs_etm__mem_access(), which still uses tidq->thread (the current
> thread), so the inline insn bytes for an outgoing-thread sample may
> be looked up against the wrong address space. Fixing this requires
> threading the packet's owner pid through cs_etm__mem_access and is
> left for a follow-up. sample.ip / sample.pid attribution \u2014 what
> AutoFDO/BOLT consume \u2014 is correct.
>
Hi Amir,
Can you test the patch here to see if it fixes your issue [1]?
We thought it didn't make sense to store the thread on every packet when
there is only one active thread for the decoder and one for sample
generation. We also fixed the other issue mentioned above about
cs_etm__copy_insn() not working.
Thanks
James
[1]:
https://lore.kernel.org/linux-perf-users/20260526-james-cs-context-tracking…
On 20/05/2026 11:13 am, Jie Gan wrote:
>
>
> On 5/20/2026 5:27 PM, Mike Leach wrote:
>>
>>
>>> -----Original Message-----
>>> From: James Clark <james.clark(a)linaro.org>
>>> Sent: Wednesday, May 20, 2026 9:38 AM
>>> To: Jie Gan <jie.gan(a)oss.qualcomm.com>
>>> Cc: coresight(a)lists.linaro.org; linux-arm-kernel(a)lists.infradead.org;
>>> linux-
>>> kernel(a)vger.kernel.org; Suzuki Poulose <Suzuki.Poulose(a)arm.com>; Mike
>>> Leach <Mike.Leach(a)arm.com>; Leo Yan <Leo.Yan(a)arm.com>; Alexander
>>> Shishkin <alexander.shishkin(a)linux.intel.com>; Mathieu Poirier
>>> <mathieu.poirier(a)linaro.org>; Tingwei Zhang
>>> <tingwei.zhang(a)oss.qualcomm.com>; Greg Kroah-Hartman
>>> <gregkh(a)linuxfoundation.org>
>>> Subject: Re: [PATCH] coresight: fix resource leaks on path build failure
>>>
>>>
>>>
>>> On 20/05/2026 2:55 am, Jie Gan wrote:
>>>>
>>>>
>>>> On 5/19/2026 9:57 PM, James Clark wrote:
>>>>>
>>>>>
>>>>> On 13/05/2026 2:32 am, Jie Gan wrote:
>>>>>> Two related leaks when _coresight_build_path() encounters an error
>>>>>> after
>>>>>> coresight_grab_device() has already incremented the pm_runtime,
>>> module,
>>>>>> and device references for a node:
>>>>>>
>>>>>> 1. In _coresight_build_path(), if kzalloc_obj() for the path node
>>>>>> fails
>>>>>> after coresight_grab_device() succeeds,
>>>>>> coresight_drop_device() was
>>>>>> never called, permanently leaking all three references.
>>>>>>
>>>>>> 2. In coresight_build_path(), on failure the partial path was
>>>>>> freed with
>>>>>> kfree(path) instead of coresight_release_path(path). kfree()
>>>>>> only
>>>>>> frees the coresight_path struct itself; it does not iterate
>>>>>> path_list
>>>>>> to call coresight_drop_device() and kfree() for each
>>>>>> coresight_node
>>>>>> already added by deeper recursive calls, leaking both the
>>>>>> pm_runtime,
>>>>>> module, and device references and the node memory for every
>>>>>> element
>>>>>> on the partial path.
>>>>>>
>>>>>> Fix both by adding coresight_drop_device() in the OOM unwind of
>>>>>> _coresight_build_path(), and replacing kfree(path) with
>>>>>> coresight_release_path(path) in coresight_build_path().
>>>>>>
>>>>>> Fixes: 32b0707a4182 ("coresight: Add try_get_module() in
>>>>>> coresight_grab_device()")
>>>>>> Fixes: b3e94405941e ("coresight: associating path with session rather
>>>>>> than tracer")
>>>>>> Signed-off-by: Jie Gan <jie.gan(a)oss.qualcomm.com>
>>>>>> ---
>>>>>> drivers/hwtracing/coresight/coresight-core.c | 6 ++++--
>>>>>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/
>>>>>> hwtracing/coresight/coresight-core.c
>>>>>> index 46f247f73cf6..c1354ea8e11d 100644
>>>>>> --- a/drivers/hwtracing/coresight/coresight-core.c
>>>>>> +++ b/drivers/hwtracing/coresight/coresight-core.c
>>>>>> @@ -825,8 +825,10 @@ static int _coresight_build_path(struct
>>>>>> coresight_device *csdev,
>>>>>> return ret;
>>>>>> node = kzalloc_obj(struct coresight_node);
>>>>>> - if (!node)
>>>>>> + if (!node) {
>>>>>> + coresight_drop_device(csdev);
>>>>>> return -ENOMEM;
>>>>>> + }
>>>>>> node->csdev = csdev;
>>>>>> list_add(&node->link, &path->path_list);
>>>>>> @@ -851,7 +853,7 @@ struct coresight_path
>>>>>> *coresight_build_path(struct coresight_device *source,
>>>>>> rc = _coresight_build_path(source, source, sink, path);
>>>>>> if (rc) {
>>>>>> - kfree(path);
>>>>>> + coresight_release_path(path);
>>>>>> return ERR_PTR(rc);
>>>>>> }
>>>>>>
>>>>>> ---
>>>>>> base-commit: e98d21c170b01ddef366f023bbfcf6b31509fa83
>>>>>> change-id: 20260513-fix-memory-leak-issue-034b4a45265e
>>>>>>
>>>>>> Best regards,
>>>>>
>>>>> Looks good to me, but sashiko is complaining: https://sashiko.dev/#/
>>>>> patchset/20260513-fix-memory-leak-issue-
>>>>> v1-1-49822d7bc7d4%40oss.qualcomm.com
>>>>>
>>>>> I'm trying to understand why it's saying that, but I think the
>>>>> scenario is that if there are multiple correct paths to a sink, when
>>>>> one path partially fails and a second path succeeds you could get a
>>>>> path_list with some garbage entries in it.
>>>>
>>>> I think the coresight_release_path is added to address this situation.
>>>> We suffered the path partially failure, and we need release all nodes
>>>> already added to the path.
>>>>
>>>
>>> It wouldn't call coresight_release_path() in this case though. If one
>>> path ends up building to success but a parallel path partially failed
>>> then _coresight_build_path() still returns success. During the search it
>>> would have still added the nodes from the partially failed path to the
>>> path_list. This is only an issue if there are multiple correct paths.
>>>
>
> The point here is there are multiple routes from the same source device
> to the same sink device, am right?
>
> I have no experience on this scenario. So with the scenario, the
> build_path may succeeded in one route and failed in another route, but
> finally, the _coresight_build_path still returns success, is that correct?
>
>>>>>
>>>>> That's kind of a different and existing issue to the one you've fixed,
>>>>> and assumes that multiple paths to one sink are possible, which I'm
>>>>> not sure is supported?
>>>>
>>>> Each path is unique. We only deal with the issue path for balancing the
>>>> reference count.
>>>>
>>>> Thanks,
>>>> Jie
>>>>
>>>
>>> I'm not exactly sure what you mean by unique, but the same source and
>>> sink could potentially be connected through two different sets of links.
>>>
>>
>> Multiple paths between a source and sink are not permitted under the
>> CoreSight spec.
>>
>
> As Mike mentioned, my understanding is that a source device is only
> allowed to be added to one valid path—this is what I mean by “unique.”
>
> Thanks,
> Jie
>
That's ok then we can ignore this for this patch. But it would be good
to enforce that in _coresight_build_path() with some kind of assert. Or
at least add a comment to appease the AI reviewers.
>> If such a system was to be built - then a fix would need to be in the
>> declaration of connections - e.g. miss one path out in the device tree
>> for example. Not up to the Coresight drivers to handle out of
>> specification hardware.
>>
>> Mike
>>
>>
>>>>>
>>>>> It might be as easy as breaking the loop early for any return value
>>>>> other than -ENODEV, but I'll leave it to you to decide whether to do
>>>>> that here or not.
>>>>>
>>>>> Reviewed-by: James Clark <james.clark(a)linaro.org>
>>>>>
>>>>
>>
>
On 20/05/2026 2:55 am, Jie Gan wrote:
>
>
> On 5/19/2026 9:57 PM, James Clark wrote:
>>
>>
>> On 13/05/2026 2:32 am, Jie Gan wrote:
>>> Two related leaks when _coresight_build_path() encounters an error after
>>> coresight_grab_device() has already incremented the pm_runtime, module,
>>> and device references for a node:
>>>
>>> 1. In _coresight_build_path(), if kzalloc_obj() for the path node fails
>>> after coresight_grab_device() succeeds, coresight_drop_device() was
>>> never called, permanently leaking all three references.
>>>
>>> 2. In coresight_build_path(), on failure the partial path was freed with
>>> kfree(path) instead of coresight_release_path(path). kfree() only
>>> frees the coresight_path struct itself; it does not iterate
>>> path_list
>>> to call coresight_drop_device() and kfree() for each coresight_node
>>> already added by deeper recursive calls, leaking both the
>>> pm_runtime,
>>> module, and device references and the node memory for every element
>>> on the partial path.
>>>
>>> Fix both by adding coresight_drop_device() in the OOM unwind of
>>> _coresight_build_path(), and replacing kfree(path) with
>>> coresight_release_path(path) in coresight_build_path().
>>>
>>> Fixes: 32b0707a4182 ("coresight: Add try_get_module() in
>>> coresight_grab_device()")
>>> Fixes: b3e94405941e ("coresight: associating path with session rather
>>> than tracer")
>>> Signed-off-by: Jie Gan <jie.gan(a)oss.qualcomm.com>
>>> ---
>>> drivers/hwtracing/coresight/coresight-core.c | 6 ++++--
>>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/
>>> hwtracing/coresight/coresight-core.c
>>> index 46f247f73cf6..c1354ea8e11d 100644
>>> --- a/drivers/hwtracing/coresight/coresight-core.c
>>> +++ b/drivers/hwtracing/coresight/coresight-core.c
>>> @@ -825,8 +825,10 @@ static int _coresight_build_path(struct
>>> coresight_device *csdev,
>>> return ret;
>>> node = kzalloc_obj(struct coresight_node);
>>> - if (!node)
>>> + if (!node) {
>>> + coresight_drop_device(csdev);
>>> return -ENOMEM;
>>> + }
>>> node->csdev = csdev;
>>> list_add(&node->link, &path->path_list);
>>> @@ -851,7 +853,7 @@ struct coresight_path
>>> *coresight_build_path(struct coresight_device *source,
>>> rc = _coresight_build_path(source, source, sink, path);
>>> if (rc) {
>>> - kfree(path);
>>> + coresight_release_path(path);
>>> return ERR_PTR(rc);
>>> }
>>>
>>> ---
>>> base-commit: e98d21c170b01ddef366f023bbfcf6b31509fa83
>>> change-id: 20260513-fix-memory-leak-issue-034b4a45265e
>>>
>>> Best regards,
>>
>> Looks good to me, but sashiko is complaining: https://sashiko.dev/#/
>> patchset/20260513-fix-memory-leak-issue-
>> v1-1-49822d7bc7d4%40oss.qualcomm.com
>>
>> I'm trying to understand why it's saying that, but I think the
>> scenario is that if there are multiple correct paths to a sink, when
>> one path partially fails and a second path succeeds you could get a
>> path_list with some garbage entries in it.
>
> I think the coresight_release_path is added to address this situation.
> We suffered the path partially failure, and we need release all nodes
> already added to the path.
>
It wouldn't call coresight_release_path() in this case though. If one
path ends up building to success but a parallel path partially failed
then _coresight_build_path() still returns success. During the search it
would have still added the nodes from the partially failed path to the
path_list. This is only an issue if there are multiple correct paths.
>>
>> That's kind of a different and existing issue to the one you've fixed,
>> and assumes that multiple paths to one sink are possible, which I'm
>> not sure is supported?
>
> Each path is unique. We only deal with the issue path for balancing the
> reference count.
>
> Thanks,
> Jie
>
I'm not exactly sure what you mean by unique, but the same source and
sink could potentially be connected through two different sets of links.
>>
>> It might be as easy as breaking the loop early for any return value
>> other than -ENODEV, but I'll leave it to you to decide whether to do
>> that here or not.
>>
>> Reviewed-by: James Clark <james.clark(a)linaro.org>
>>
>
On 13/05/2026 2:32 am, Jie Gan wrote:
> Two related leaks when _coresight_build_path() encounters an error after
> coresight_grab_device() has already incremented the pm_runtime, module,
> and device references for a node:
>
> 1. In _coresight_build_path(), if kzalloc_obj() for the path node fails
> after coresight_grab_device() succeeds, coresight_drop_device() was
> never called, permanently leaking all three references.
>
> 2. In coresight_build_path(), on failure the partial path was freed with
> kfree(path) instead of coresight_release_path(path). kfree() only
> frees the coresight_path struct itself; it does not iterate path_list
> to call coresight_drop_device() and kfree() for each coresight_node
> already added by deeper recursive calls, leaking both the pm_runtime,
> module, and device references and the node memory for every element
> on the partial path.
>
> Fix both by adding coresight_drop_device() in the OOM unwind of
> _coresight_build_path(), and replacing kfree(path) with
> coresight_release_path(path) in coresight_build_path().
>
> Fixes: 32b0707a4182 ("coresight: Add try_get_module() in coresight_grab_device()")
> Fixes: b3e94405941e ("coresight: associating path with session rather than tracer")
> Signed-off-by: Jie Gan <jie.gan(a)oss.qualcomm.com>
> ---
> drivers/hwtracing/coresight/coresight-core.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
> index 46f247f73cf6..c1354ea8e11d 100644
> --- a/drivers/hwtracing/coresight/coresight-core.c
> +++ b/drivers/hwtracing/coresight/coresight-core.c
> @@ -825,8 +825,10 @@ static int _coresight_build_path(struct coresight_device *csdev,
> return ret;
>
> node = kzalloc_obj(struct coresight_node);
> - if (!node)
> + if (!node) {
> + coresight_drop_device(csdev);
> return -ENOMEM;
> + }
>
> node->csdev = csdev;
> list_add(&node->link, &path->path_list);
> @@ -851,7 +853,7 @@ struct coresight_path *coresight_build_path(struct coresight_device *source,
>
> rc = _coresight_build_path(source, source, sink, path);
> if (rc) {
> - kfree(path);
> + coresight_release_path(path);
> return ERR_PTR(rc);
> }
>
>
> ---
> base-commit: e98d21c170b01ddef366f023bbfcf6b31509fa83
> change-id: 20260513-fix-memory-leak-issue-034b4a45265e
>
> Best regards,
Looks good to me, but sashiko is complaining:
https://sashiko.dev/#/patchset/20260513-fix-memory-leak-issue-v1-1-49822d7b…
I'm trying to understand why it's saying that, but I think the scenario
is that if there are multiple correct paths to a sink, when one path
partially fails and a second path succeeds you could get a path_list
with some garbage entries in it.
That's kind of a different and existing issue to the one you've fixed,
and assumes that multiple paths to one sink are possible, which I'm not
sure is supported?
It might be as easy as breaking the loop early for any return value
other than -ENODEV, but I'll leave it to you to decide whether to do
that here or not.
Reviewed-by: James Clark <james.clark(a)linaro.org>