On Mon, Jun 29, 2026 at 10:08:17AM +0800, Jie Gan wrote:
[...]
> Can I fix the issue by adding the "arm,primecell-periphid" property? That
> would be the best temporary solution, as it avoids breaking the original design of
> both the TraceNoC AMBA driver and interconnect TraceNoC platform driver.
Before proceeding with the "arm,primecell-periphid" property, could you
clarify a bit:
- For an interconnect TraceNoC, what would be the consequence of
enabling ATID? Would it simply be a no-op, or are there any side
effects? Or is the concern that the trace IDs could be exhausted?
- How can you guarantee that an interconnect TraceNoC will never
require ATID in the future?
> The TraceNoC device here must be treated as an AMBA device and I am
> continuing to investigate the issue with our hardware team.
> We aim to fix it from a hardware perspective for existing platforms if possible
> and ensure it is fixed in future platforms.
I'm concerned that all of us end up repeatedly fixing similar issues
whenever hardware configurations change or modules are reused in
different topologies.
For example, if future platforms may require ATID support for an
interconnect TraceNoC, then the issue will pop up again.
Thanks,
Leo
Hello,
On 29/06/2026 11:17, Songwei.Chai wrote:
>
> On 6/29/2026 12:22 PM, Greg KH wrote:
>> On Mon, Jun 29, 2026 at 11:03:33AM +0800, Songwei.Chai wrote:
>>> Hi Greg & Alexander,
>>>
>>> Apologies for interrupting again.
>>>
>>> As the TGU hardware plays an important role in Qualcomm tracing
>>> design, I
>>> would greatly appreciate it if you could kindly take some time to review
>>> this at your earliest convenience.
>> The merge window _just_ closed, please give us a chance to catch up.
>>
>> Also, why us? Surely you have other reviewers for this code, right?
>
> Hi Greg,
>
> Understood, thanks for letting us know.
>
> Regarding your question: since this introduces a new drivers/hwtracing/
> qcom directory, there is no existing maintainer for it.
> Given your scope (and Alexander's), we believe you are the most relevant
> reviewers.
>
> The reason for creating the qcom directory is as follows:
>
> /We previously tried to upstream this driver under drivers/hwtracing/
> coresight,/
> /but it was not accepted as it is considered Qualcomm-specific and not
> tightly/
> /coupled with the CoreSight subsystem. Based on this feedback, we are
Some clarification here: this device is not CoreSight, so we declined to
keep it under drivers/hwtracing/coresight/. That was not because it is
Qualcomm specific: we have TPDM, TPDA and TnoC devices under the coresight
subsystem, all of which are Qualcomm specific.
That said, there are other drivers in drivers/hwtracing/ which I usually
merge and push to Greg, after some reviews/acks from the respective
people (e.g., PTT HiSilicon PCIe Tune and Trace).
But your proposal was that there were other maintainers for your new
subtree and that you were going to push this via linux-arm-msm (?), to
which I didn't have any objections.
That said, I am fine with pushing this to Greg via the CoreSight pull
requests (similar to Hisilicon PTT driver), but would need someone to
Maintain/Review the driver (with entries in MAINTAINERS, similar to
PTT).
Thoughts ?
Kind regards
Suzuki
> exploring/
> /a dedicated drivers/hwtracing/qcom directory, similar to intel_th, to
> better/
> /support this and future Qualcomm hwtracing drivers./
>
> More details can be found in “[PATCH v14 0/7] -- Why we are proposing
> this”.
>
> Thanks,
> Songwei
>
>>
>> thanks,
>>
>> greg k-h
This series adds thread-stack and synthesized callchain support for Arm
CoreSight, which comes from older series [1] but heavily rewritten.
CS ETM previously kept last-branch state in a per-trace-queue buffer.
That effectively makes the state per CPU, while the call/return history
belongs to a thread. This series moves branch tracking to the common
thread-stack code.
The series records CoreSight branches with thread_stack__event(), uses
thread_stack__br_sample() for last branch entries, and flushes thread
stacks after decoder resets.
A decoder reset between AUX trace buffers is treated as a global trace
discontinuity, so all thread stacks are flushed to avoid carrying stale
call/return history across the discontinuity.
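As a rough illustration of the idea above (not the perf implementation), the
toy model below keeps one call/return stack per thread ID instead of per CPU
and drops all of them on a discontinuity. The class and method names are
hypothetical and do not correspond to the thread-stack API in tools/perf.

```python
from collections import defaultdict


class ThreadStacks:
    """Toy model: one call/return stack per thread ID, not per CPU."""

    def __init__(self):
        # tid -> list of expected return addresses, innermost last
        self.stacks = defaultdict(list)

    def call(self, tid, ret_addr):
        """Record a call made by thread `tid`."""
        self.stacks[tid].append(ret_addr)

    def ret(self, tid, to_addr):
        """Pop the innermost frame if the return target matches it."""
        stack = self.stacks[tid]
        if stack and stack[-1] == to_addr:
            stack.pop()

    def flush_all(self):
        """Drop all history, as after a decoder reset between AUX buffers."""
        self.stacks.clear()


stacks = ThreadStacks()
stacks.call(tid=100, ret_addr=0x1000)  # thread 100 makes a call
stacks.call(tid=200, ret_addr=0x2000)  # thread 200 runs on the same CPU
stacks.call(tid=100, ret_addr=0x1010)  # thread 100 resumes and calls again
depth_before = {tid: len(s) for tid, s in stacks.stacks.items()}

stacks.flush_all()                     # global trace discontinuity
depth_after = sum(len(s) for s in stacks.stacks.values())
```

Because the history is keyed by thread, interleaving two threads on one CPU
does not mix their call chains, and a flush leaves no stale frames behind.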
One limitation remains for instructions emulated by the kernel. In that
case the exception return address may not match the return address
stored in the thread stack, because the address after the exception
return can be one instruction ahead. The stack can still recover when a
later return matches an upper caller. Emulated instructions are not a
common target for performance callchain analysis, and supporting them
would require extending the common thread-stack path to accept both the
real target address and an adjusted address for stack matching, so this
series leaves that extra complexity out.
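The recovery behaviour described above can be pictured with a small sketch.
This is an assumption-laden model, not the logic in perf's thread-stack code:
handle_return() is a hypothetical helper, and the 4-byte offset assumes a
fixed-size A64 instruction.

```python
def handle_return(stack, to_addr):
    """Pop frames for a return to `to_addr`.

    `stack` holds expected return addresses, outermost first. The search
    starts at the innermost entry; when an entry matches, that frame and
    every frame inside it are dropped. With no match the stack is left
    untouched and False is returned.
    """
    for i in range(len(stack) - 1, -1, -1):
        if stack[i] == to_addr:
            del stack[i:]
            return True
    return False


# Three nested frames; the innermost one was entered by an exception taken
# for an instruction that the kernel emulates.
stack = [0x100, 0x200, 0x300]

# The exception return lands one instruction past the recorded address,
# so nothing matches and the stale frame stays on the stack.
matched_emulated = handle_return(stack, 0x300 + 4)

# A later return matches an upper caller, which also discards the stale frame.
matched_upper = handle_return(stack, 0x200)
```

After the second call only the outermost frame remains, which is the sense in
which the stack recovers without special handling for emulated instructions.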
The series has been tested on an Orion6 board:
perf test 136 -vvv
136: CoreSight synthesized callchain:
--- start ---
test child forked, pid 3539
---- end(0) ----
136: CoreSight synthesized callchain : Ok
perf script --itrace=g16i10il64
callchain_test 17468 [005] 1031003.229943: 10 instructions:
aaaac32507c4 main+0x8 (/home/kernel/leoy/test_cs_callchain/callchain_test)
ffff90bd225c __libc_start_call_main+0x7c (/usr/lib/aarch64-linux-gnu/libc.so.6)
ffff90bd233c call_init+0x9c (inlined)
ffff90bd233c __libc_start_main_impl+0x9c (inlined)
aaaac3250670 _start+0x30 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 17468 [005] 1031003.229943: 10 instructions:
aaaac3250774 do_svc+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac3250798 print+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac32507b0 foo+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac32507c8 main+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
ffff90bd225c __libc_start_call_main+0x7c (/usr/lib/aarch64-linux-gnu/libc.so.6)
ffff90bd233c call_init+0x9c (inlined)
ffff90bd233c __libc_start_main_impl+0x9c (inlined)
aaaac3250670 _start+0x30 (/home/kernel/leoy/test_cs_callchain/callchain_test)
callchain_test 17468 [005] 1031003.229944: 10 instructions:
ffff800080010c20 vectors+0x420 ([kernel.kallsyms])
aaaac3250784 do_svc+0x1c (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac3250798 print+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac32507b0 foo+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
aaaac32507c8 main+0xc (/home/kernel/leoy/test_cs_callchain/callchain_test)
ffff90bd225c __libc_start_call_main+0x7c (/usr/lib/aarch64-linux-gnu/libc.so.6)
ffff90bd233c call_init+0x9c (inlined)
ffff90bd233c __libc_start_main_impl+0x9c (inlined)
aaaac3250670 _start+0x30 (/home/kernel/leoy/test_cs_callchain/callchain_test)
Note: the test fails on the Juno board because of many discontinuity
packets (mainly NO_SYNC elements). This is likely caused by FIFO
overflow on the trace path.
[1] https://lore.kernel.org/linux-arm-kernel/20200220052701.7754-1-leo.yan@lina…
Signed-off-by: Leo Yan <leo.yan(a)arm.com>
---
Changes in v10:
- Change to syscall(SYS_gettid) for build failure on x86 (James).
- Extracted sample thread stack into cs_etm__sample_branch_stack().
- Link to v9: https://lore.kernel.org/r/20260616-b4-arm_cs_callchain_support_v1-v9-0-f8fa…
Changes in v9:
- Added patch 01 to fix a thread leak during trace queue init (sashiko).
- Added check in instruction and branch samples in
cs_etm__add_stack_event() (sashiko).
- Released frontend_thread properly in cs_etm__context() (sashiko).
- Refined cs_etm__flush_all_stack() to use switch (sashiko).
- Gathered James' review tags.
- Rebased on the latest perf-tools-next.
- Link to v8: https://lore.kernel.org/r/20260611-b4-arm_cs_callchain_support_v1-v8-0-7379…
Changes in v8:
- Updated test_arm_coresight_disasm.sh to pass "--itrace=b" and updated
examples in arm-cs-trace-disasm.py (James).
- Removed static annotation in callchain workload and renamed functions
with prefix "callchain_" to reduce naming conflict (James).
- For callchain test pre-condition check, removed the aarch64 check and
added the root permission check (James).
- Resolved the shellcheck errors (James).
- Link to v7: https://lore.kernel.org/r/20260611-b4-arm_cs_callchain_support_v1-v7-0-1ba7…
Changes in v7:
- Rebased on the latest perf-tools-next.
- Used struct_size() for allocating the callchain struct (James).
- Added a helper cs_etm__packet_has_taken_branch() (James).
- Minor improvements for the callchain test (used record-ctl FIFO and
reworked the validation callstack push / pop).
- Link to v6: https://lore.kernel.org/r/20260526-b4-arm_cs_callchain_support_v1-v6-0-f9f4…
Changes in v6:
- Heavily rewrote the patches after restarting the work following a 6-year gap.
- Changed to use the common thread-stack for branch stack and callchain
management.
- Added a callchain test.
- Link to v5: https://lore.kernel.org/linux-arm-kernel/20200220052701.7754-1-leo.yan@lina…
Changes in v5:
- Addressed Mike's suggestion for performance improvement for function
cs_etm__instr_addr() for quick calculation for non T32;
- Removed the patch 'perf cs-etm: Synchronize instruction sample with
the thread stack' (Mike);
- Fixed the issue for exception is taken for branch target address
accessing, for the branch sample and stack thread handling, the
related patches are 01, 02, 07;
- Fixed the stack thread handling for instruction emulation and single
step with patches 08, 09.
- Link to v4: https://lore.kernel.org/linux-arm-kernel/20200203020716.31832-1-leo.yan@lin…
---
Leo Yan (9):
perf cs-etm: Fix thread leaks on trace queue init failure
perf cs-etm: Filter synthesized branch samples
perf cs-etm: Decode ETE exception packets
perf cs-etm: Refactor instruction size handling
perf cs-etm: Use thread-stack for last branch entries
perf cs-etm: Flush thread stacks after decoder reset
perf cs-etm: Support call indentation
perf cs-etm: Synthesize callchains for instruction samples
perf test: Add Arm CoreSight callchain test
tools/perf/Documentation/perf-test.txt | 6 +-
tools/perf/scripts/python/arm-cs-trace-disasm.py | 9 +-
tools/perf/tests/builtin-test.c | 1 +
tools/perf/tests/shell/coresight/callchain.sh | 172 ++++++++++
.../shell/coresight/test_arm_coresight_disasm.sh | 4 +-
tools/perf/tests/tests.h | 1 +
tools/perf/tests/workloads/Build | 2 +
tools/perf/tests/workloads/callchain.c | 33 ++
tools/perf/util/cs-etm.c | 377 +++++++++++++--------
9 files changed, 454 insertions(+), 151 deletions(-)
---
base-commit: 8c214ad8cb8d692c82c6466b8e88973dbfa8e064
change-id: 20260521-b4-arm_cs_callchain_support_v1-2c2a70719bcc
Best regards,
--
Leo Yan <leo.yan(a)arm.com>
The current ETMx configuration via sysfs can lead to the following
inconsistencies:
- If a configuration is modified via sysfs while a perf session is
active, the running configuration may differ between before
a sched-out and after a subsequent sched-in.
- If a perf session and a sysfs session try to enable the device
concurrently, the configuration from configfs could be corrupted (etm4).
- drvdata->config can be corrupted if a perf session tries to enable
the device while etm4_disable_sysfs() is handling
cscfg_csdev_disable_active_config() (etm4).
To resolve these inconsistencies, the configuration should be separated into:
- active_config, which is the configuration applied for the current session
- config, which stores the settings configured via sysfs.
The configuration from configfs is then applied after taking a mode.
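The split can be pictured with a minimal model. This is only a sketch under
simplifying assumptions: EtmModel and its methods are hypothetical, a plain
lock stands in for the driver's locking, and the stored settings are
snapshotted for any mode, which is simpler than what the driver does for perf
sessions.

```python
import copy
import threading


class EtmModel:
    """Toy model of separating stored settings from the applied ones."""

    def __init__(self):
        self.lock = threading.Lock()
        self.config = {"mode": 0, "ctxid": 0}  # settings edited via sysfs
        self.active_config = None              # settings of the running session
        self.mode = None                       # None means disabled

    def sysfs_store(self, key, value):
        """A sysfs write only ever touches the stored settings."""
        with self.lock:
            self.config[key] = value

    def enable(self, mode):
        """Take the mode first, then snapshot the settings to apply."""
        with self.lock:
            if self.mode is not None:
                return False  # another session already owns the device
            self.mode = mode
            self.active_config = copy.deepcopy(self.config)
            return True

    def disable(self):
        with self.lock:
            self.mode = None
            self.active_config = None


etm = EtmModel()
etm.sysfs_store("ctxid", 1)
ok_perf = etm.enable("perf")    # snapshot taken here
etm.sysfs_store("ctxid", 2)     # later sysfs write does not affect the session
ok_sysfs = etm.enable("sysfs")  # refused while the perf session is active
```

In this model a sysfs write during an active session changes only config, so
the settings seen before a sched-out and after a sched-in stay the same.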
Also, this patch set includes some small fixes:
- missing trace id release in etm4x.
- underflow issue for nrseqstate.
- wrong check in etm4x_sspcicrn_present().
- missing call of cscfg_csdev_disable_active_config()
This patch set is based on the coresight tree's next branch.
Patch History
=============
from v7 to v8:
- accept Leo Yan's suggestion to handle error.
- small minor fixes following Suzuki's suggestion.
- https://lore.kernel.org/all/20260519154812.254884-1-yeoreum.yun@arm.com/
from v6 to v7:
- rebase on coresight/next
- add ETM_MAX_SEQ_TRANSITIONS define
- remove redundant patch relevant to cpu-hotplug, as the coresight-pm
patch has been merged.
- https://lore.kernel.org/all/20260422132203.977549-1-yeoreum.yun@arm.com/
from v5 to v6:
- fix missing call of cscfg_csdev_disable_active_config()
- add rb & fixes tags.
- add ss_status field in etm4x_drvdata to expose STATUS and PENDING bits.
- https://lore.kernel.org/all/20260415165528.3369607-1-yeoreum.yun@arm.com/
from v4 to v5:
- add rb-tag.
- fix underflow issue for nrseqstate.
- fix wrong check in etm4_sspcicrn_present().
- remove redundant fields on etmv4_save_state.
- rename caps->ss_status to ss_cmp.
- fix wrong location of etm4_release_trace_id.
- https://lore.kernel.org/all/20260413142003.3549310-1-yeoreum.yun@arm.com/
from v3 to v4:
- change etm_drvdata->spinlock type to raw_spin_lock_t
- remove redundant call etmX_enable_hw() with starting_cpu() callback.
- fix missing trace id release.
- add missing docs.
- https://lore.kernel.org/all/20260412175506.412301-1-yeoreum.yun@arm.com/
from v2 to v3:
- fix build error for etm3x.
- fix checkpatch warning.
- https://lore.kernel.org/all/20260410074310.2693385-1-yeoreum.yun@arm.com/
from v1 to v2
- rebased to v7.0-rc7.
- introduce etmX_caps structure to save etmX's capabilities.
- remove ss_status from etmv4_config.
- modify active_config after taking a mode (perf/sysfs).
- https://lore.kernel.org/all/20260317181705.2456271-1-yeoreum.yun@arm.com/
Yeoreum Yun (13):
coresight: etm4x: fix wrong check of etm4x_sspcicrn_present()
coresight: etm4x: fix underflow for usage of (nrseqstate - 1)
coresight: etm4x: introduce struct etm4_caps
coresight: etm4x: exclude ss_status from drvdata->config
coresight: etm4x: remove s_ex_level from config
coresight: etm4x: remove redundant fields in etmv4_save_state
coresight: etm4x: fix leaked trace id
coresight: etm4x: fix inconsistencies with sysfs configuration
coresight: etm4x: missing cscfg_csdev_disable_active_config() in perf
enable
coresight: etm3x: change drvdata->spinlock type to raw_spin_lock_t
coresight: etm3x: introduce struct etm_caps
coresight: etm3x: fix inconsistencies with sysfs configuration
coresight: etm3x: remove redundant cpu online check on
etm_enable_sysfs()
drivers/hwtracing/coresight/coresight-etm.h | 46 +-
.../coresight/coresight-etm3x-core.c | 96 ++--
.../coresight/coresight-etm3x-sysfs.c | 159 +++---
.../hwtracing/coresight/coresight-etm4x-cfg.c | 5 +-
.../coresight/coresight-etm4x-core.c | 454 ++++++++++--------
.../coresight/coresight-etm4x-sysfs.c | 204 ++++----
drivers/hwtracing/coresight/coresight-etm4x.h | 202 ++++----
7 files changed, 639 insertions(+), 527 deletions(-)
--
LEVI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
Hi Sudeep & Suzuki,
Gentle reminder on this patch. Any feedback would be appreciated.
Thanks,
Yuanfang.
On 5/25/2026 9:17 PM, Maulik Shah (mkshah) wrote:
>
>
> On 12/19/2025 3:51 PM, Sudeep Holla wrote:
>> On Fri, Dec 19, 2025 at 10:13:14AM +0800, yuanfang zhang wrote:
>>>
>>>
>>> On 12/18/2025 7:33 PM, Sudeep Holla wrote:
>>>> On Thu, Dec 18, 2025 at 12:09:40AM -0800, Yuanfang Zhang wrote:
>>>>> This patch series adds support for CoreSight components local to CPU clusters,
>>>>> including funnel, replicator, and TMC, which reside within CPU cluster power
>>>>> domains. These components require special handling due to power domain
>>>>> constraints.
>>>>>
>>>>
>>>> Could you clarify why PSCI-based power domains associated with clusters in
>>>> domain-idle-states cannot address these requirements, given that PSCI CPU-idle
>>>> OSI mode was originally intended to support them? My understanding of this
>>>> patch series is that OSI mode is unable to do so, which, if accurate, appears
>>>> to be a flaw that should be corrected.
>>>
>>> It is due to the particular characteristics of the CPU cluster power
>>> domain. Runtime PM for CPU devices works a little differently: it is mostly
>>> used to manage the hierarchical CPU topology (PSCI OSI mode), talking with the
>>> genpd framework to manage the last-CPU handling in a cluster.
>>
>> That is indeed the intended design. Could you clarify which specific
>> characteristics differentiate it here?
>
> Sorry for coming very late on this.
>
> This series is intended to handle coresight components which reside within a CPU cluster.
> For the cases where the cluster is in its deepest idle low power mode or all CPUs belonging
> to the cluster are hotplugged off, the coresight components cannot be accessed.
>
> The implementation tries to address this in two parts:
> 1. Using cluster power-domain to know which coresight component belongs to which cluster/CPUs
> 2. Schedule the task on intended cluster's CPU to make sure the CPU (and cluster) power is
> ON while coresight component of the cluster is being accessed (using smp_call_function_single()).
>
> The use of power-domains in (1) will limit this to PSCI OS-Initiated mode,
> to have this support on PSCI Platform-Coordinated mode too, probably instead of power-domains,
> cpu-maps (which also defines the clusters) from device tree is a better choice which will give
> the information on which CPU belongs to which cluster.
>
> (2) ensured that scheduling happened on intended CPU and while the access is in progress, CPU (and
> cluster) will not enter power down in between.
>
>>
>>> It doesn’t really send an IPI to wake up the CPU device (it doesn’t have the
>>> .power_on/.power_off callbacks implemented, which would get invoked from the
>>> .runtime_resume callback). This behavior is aligned with the upstream kernel.
>>>
>>
>> I am quite lost here. Why is it necessary to wake up the CPU? If I understand
>> correctly, all of this complexity is meant to ensure that the cluster power
>> domain is enabled before any of the funnel registers are accessed. Is that
>> correct?
>
> Yes, This is to ensure that CPU (and cluster) power is ON while coresight components
> for same cluster are being accessed.
>
>>
>> If so, and if the cluster domains are already defined as the power domains for
>> these funnel devices, then they should be requested to power on automatically
>> before any register access occurs. Is that not the case?
>
> Cluster power-domains will be only available for PSCI OS-initiated mode but also
> will not help for cases where all CPUs in cluster are hotplugged off as hotplugs are
> platform coordinated.
>
> After discussion with our HW team: to automatically request power on for a coresight
> component, GPR [1] can be used, but it seems not to work as intended on the existing
> SoCs and will be available on next-generation SoCs.
>
> [1] https://developer.arm.com/documentation/ddi0480/d/Functional-Overview/Granu…
>
>>
>> What am I missing in this reasoning?
>>
>> The only explanation I can see is that the firmware does not properly honor
>> power-domain requests coming directly from the OS. I believe that may be the
>> case, but I would be glad to be proven wrong.
>>
>
> Please see the comment below for more details. This does not seem to be a firmware issue.
>
>>>>
>>>>> Unlike system-level CoreSight devices, these components share the CPU cluster's
>>>>> power domain. When the cluster enters low-power mode (LPM), their registers
>>>>> become inaccessible. Notably, `pm_runtime_get` alone cannot bring the cluster
>>>>> out of LPM, making standard register access unreliable.
>>>>>
>>>>
>>>> Are these devices the only ones on the system that are uniquely bound to
>>>> cluster-level power domains? If not, what additional devices share this
>>>> dependency so that we can understand how they are managed in comparison?
>>>>
>>>
>>> Yes, devices like ETM and TRBE also share this power domain and access constraint.
>>> Their drivers naturally handle enablement/disablement on the specific CPU they
>>> belong to (e.g., via hotplug callbacks or existing smp_call_function paths).
>>
>> I understand many things are possible to implement, but the key question
>> remains: why doesn’t the existing OSI mechanism - added specifically to cover
>> cases like this - solve the problem today?
>>
>> Especially on platforms with OSI enabled, what concrete limitation forces us
>> into this additional “wake-up” path instead of relying on OSI to manage the
>> dependency/power sequencing?
>
> + Ulf in loop.
>
> On current platforms with OSI enabled, Linux PSCI does not implement the power_on/power_off
> requests. As far as I know, runtime PM was never meant to implement this part, and the
> pm_runtime_get_sync() call (from drivers/cpuidle/cpuidle-psci.c) is only used to convey
> to cluster power domains that a child CPU/sub-domain is on after it has already
> landed in Linux.
>
> A standalone invocation of pm_runtime_get_sync() from another CPU does not really turn on/get
> the CPU (and cluster), as the CPUs use either the CPUidle or CPU hotplug paths to enter/exit
> low power mode.
>
> To put it other way,
> For a hot-plugged off CPU, invoking a pm_runtime_get_sync() won't get the CPU (and make
> its cluster power domain) ON. In order to turn on the CPU, one has to still request
> the online of the CPU, say via sysfs command echo 1 > /sys/devices/system/cpu/cpuX/online
> which would invoke PSCI CPU_ON function and the power domain for CPU gets marked as ON
> only after CPU already landed in Linux via psci_idle_cpuhp_up() invoking pm_runtime_get_sync().
>
> I used specific hotplug example but same applies to idle powered down CPU (or Cluster) too.
>
>>
>>>>> To address this, the series introduces:
>>>>> - Identifying cluster-bound devices via a new `qcom,cpu-bound-components`
>>>>> device tree property.
>>>>
>>>> Really, no please.
>>>>
>>>
>>> Our objective is to determine which CoreSight components are physically located
>>> within the CPU cluster power domain.
>>>
>>> Would it be acceptable to derive this relationship from the existing
>>> power-domains binding?
>>
>> In my opinion, this is not merely a possibility but an explicit expectation.
>>
>>> For example, if a Funnel or Replicator node is linked to a power-domains
>>> entry that specifies a cpumask, the driver could recognize this shared
>>> dependency and automatically apply the appropriate cluster-aware behavior.
>>>
>>
>> Sure, but for the driver to use that information, we need clear explanation
>> for all the questions above. In short, why it is not working with the existing
>> PSCI domain idle support.
>>
>
> Explained above.
>
> Thanks,
> Maulik
On Fri, Jun 26, 2026 at 08:09:58PM +0800, Jie Gan wrote:
[...]
> I have another proposal: what if we allocate the ATID in trace_noc_id() when
> the device does not already have a valid ATID?
>
> Possible scenarios:
>
> If the itnoc device is connected to a TPDM device (which has no ATID),
> trace_noc_id() will be invoked via coresight_path_assign_trace_id(), and a
> valid ATID can be allocated for the path.
>
> If the itnoc device is connected to sources other than TPDM, trace_noc_id()
> will never be invoked, and therefore no ATID will be allocated for the
> device, saving resources.
TBH, I'm not sure I can make a judgement here, as I don't have enough
knowledge of the topology. And I'm not sure whether the listed
connections cover all possible cases.
I also found commit 5799dee92dc2:
| This patch adds platform driver support for the CoreSight Interconnect
| TNOC, Interconnect TNOC is a CoreSight link that forwards trace data
| from a subsystem to the Aggregator TNOC. Compared to Aggregator TNOC,
| it does not have aggregation and ATID functionality.
With your proposal, wouldn't ATID be allocated for the interconnect
TNOC while being skipped for the Aggregator TNOC? That seems to
contradict the commit log.
Thanks,
Leo
Hi Jie,
On Fri, Jun 26, 2026 at 10:03:41AM +0800, Jie Gan wrote:
[...]
> Hi Leo,
>
> To be honest, I would prefer not to modify the interconnect platform driver.
> On some Qualcomm platforms, multiple itnoc devices reside within small
> blocks (one or more per block) and are connected to a dummy
> source. In such cases, two ATIDs are allocated for a path (the dummy source
> and the itnoc), which is inefficient. This is why the itnoc platform driver
> was created: to avoid this waste.
>
> The TraceNoC (called AG TraceNoC) is a generic TraceNoC device which is
> connected to multiple source and link devices, aggregating data from all
> source devices into a single output path.
As I said, it may be fragile to couple a specific device property (ATID)
to the AMBA driver.
You're now facing a case where a device cannot be registered as an AMBA
device, so it cannot use ATID. Likewise, I can imagine a future case where a
device is registered as an AMBA device, but you don't want ATID.
> This device is implemented as an AMBA device but lacks proper hardware
> configuration. As a result, it must be handled in the driver as a
> workaround, which unfortunately breaks the original design intent.
It seems to me unreasonable to pretend to be an AMBA device when the AMBA
ID registers are absent.
How about adding a new DT property ("qcom,tnoc-enable-atid") to force
enabling ATID?
Thanks,
Leo
On Thu, Jun 25, 2026 at 09:01:18AM +0800, Jie Gan wrote:
[...]
> > > However, I believe it is acceptable to allocate an ATID for the itNoC device
> > > and the issue can be fixed this way.
> >
> > I think so.
>
> Hi Suzuki/Leo
>
> Which solution do you prefer to address the issue?
I will leave this to Suzuki.
> The interconnect traceNoC platform driver is intended for the itnoc device,
> implying that no TPDM devices are connected to it. So, if I modify it to
> allocate an ATID, I think it would be better to rename the “itnoc” node
> accordingly? Or it's ok to leave it as-is?
>
> BTW, the traceNoC device definitely is an AMBA device with CID/PID
> registers.
Just to share a few thoughts on the driver's design.
I think it would be better to keep the probe function generic. The AMBA
probe should not be specific to TraceNoC, and the platform probe should
not be only dedicated to the interconnect TraceNoC. The probe function
should simply handle a device that appears on either the AMBA bus or the
platform bus.
So the question is: if we allocate an ATID for all TraceNoC devices, do you
still need to distinguish TraceNoC types? If no, then the code can be
unified.
Thanks,
Leo
On Wed, Jun 24, 2026 at 11:08:32PM +0800, Jie Gan wrote:
[...]
> > Why does it fail ? power management ? hw broken ? Is it really AMBA or
> > do you pretend that to be an AMBA device by faking the CID/PID?
>
> The CID reads as 0 from the register, which I suspect is a hardware design
> issue. I have not yet confirmed this with the hardware team. As a
> workaround, I provided a fake periphid via a DT property to bypass
> amba_read_periphid.
>
>
> Leo commented in other thread:
> >>tnoc.c registers both an AMBA driver and a platform driver. Shouldn't it
> >>be registered as a platform device instead?
>
> The platform driver is intended for the interconnect TraceNoC device and is
> not designed to allocate an ATID. The issue is that the TPDM device borrows
> the ATID from the TraceNoC device, resulting in the ATID always being 0 when
> associated with an interconnect NoC device.
>
> However, I believe it is acceptable to allocate an ATID for the itNoC device
> and the issue can be fixed this way.
I think so.
On Wed, Jun 24, 2026 at 05:49:26PM +0800, Jie Gan wrote:
> The AMBA bus attempts to read the CID/PID of a device before invoking
> its probe function if the arm,primecell-periphid property is absent.
> This causes a deferred probe issue for the TraceNoC device, as the
> CID/PID cannot be read from the periphid register.
> Add the arm,primecell-periphid property to bypass the AMBA bus
> check and resolve the probe issue.
tnoc.c registers both an AMBA driver and a platform driver. Shouldn't it
be registered as a platform device instead?
Thanks,
Leo