CoreSight May 2025

coresight@lists.linaro.org

8 participants
38 discussions

[PATCH v4 0/7] Arm CoreSight: Support AUX pause and resume

by Leo Yan

This series enables AUX pause and resume on Arm CoreSight.

The first patch extracts the trace unit control operations into two
functions, which are then used by AUX pause and resume. Patches 02 and
03 change the ETMv4 driver to prepare callback functions for AUX pause
and resume. Patch 04 changes the ETM perf layer to support AUX pause
and resume in a perf session. Patch 05 re-enables sinks after a buffer
update; building on that, patch 06 updates the buffer when AUX is
paused, which mitigates the trace data loss issue. Patch 07 documents
the usage of AUX pause with Arm CoreSight.

This patch set has been verified on the Hikey960 board.

It is suggested to disable CPUIdle (add the `nohlt` option to the Linux
command line) when verifying this series. Issues were found in the ETM
and funnel drivers during CPU suspend and resume; these will be
addressed separately.

Changes from v3:
- Re-enabled sink in buffer update callbacks (Suzuki).

Changes from v2:
- Rebased on CoreSight next branch.
- Dropped the uAPI 'update_buf_on_pause' and updated the document
  accordingly (Suzuki).
- Renamed ETM callbacks to .pause_perf() and .resume_perf() (Suzuki).
- Minor improvement for error handling in the AUX resume flow.

Changes from v1:
- Added validation of function pointers in pause and resume APIs (Mike).

Leo Yan (7):
  coresight: etm4x: Extract the trace unit controlling
  coresight: Introduce pause and resume APIs for source
  coresight: etm4x: Hook pause and resume callbacks
  coresight: perf: Support AUX trace pause and resume
  coresight: tmc: Re-enable sink after buffer update
  coresight: perf: Update buffer on AUX pause
  Documentation: coresight: Document AUX pause and resume

 Documentation/trace/coresight/coresight-perf.rst   |  31 +++++++++
 drivers/hwtracing/coresight/coresight-core.c       |  22 +++++++
 drivers/hwtracing/coresight/coresight-etm-perf.c   |  84 +++++++++++++++++++++++-
 drivers/hwtracing/coresight/coresight-etm4x-core.c | 143 +++++++++++++++++++++++++++++------------
 drivers/hwtracing/coresight/coresight-etm4x.h      |   2 +
 drivers/hwtracing/coresight/coresight-priv.h       |   2 +
 drivers/hwtracing/coresight/coresight-tmc-etf.c    |   9 +++
 drivers/hwtracing/coresight/coresight-tmc-etr.c    |  10 +++
 include/linux/coresight.h                          |   4 ++
 9 files changed, 265 insertions(+), 42 deletions(-)

--
2.34.1
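
The v1 changelog entry about validating function pointers in the pause and resume APIs can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct trace_source`, `struct source_ops`, `source_pause()` and `source_resume()` are hypothetical names, not the kernel's types or the functions added by this series. Only the callback names `.pause_perf()` and `.resume_perf()` come from the cover letter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct trace_source;

/* Hypothetical callback table; only the member names mirror the cover letter. */
struct source_ops {
	int  (*resume_perf)(struct trace_source *src);
	void (*pause_perf)(struct trace_source *src);
};

/* Hypothetical model of a trace source, not struct coresight_device. */
struct trace_source {
	const struct source_ops *ops;
	bool tracing;	/* true while the modelled trace unit is running */
};

/* Resume tracing; reject sources that do not provide a resume callback. */
static int source_resume(struct trace_source *src)
{
	assert(src != NULL);
	if (!src->ops || !src->ops->resume_perf)
		return -EINVAL;
	return src->ops->resume_perf(src);
}

/* Pause tracing; silently do nothing if no pause callback is provided. */
static void source_pause(struct trace_source *src)
{
	assert(src != NULL);
	if (!src->ops || !src->ops->pause_perf)
		return;
	src->ops->pause_perf(src);
}

/* Example callbacks for an ETM-like source in this model. */
static int etm_resume(struct trace_source *src)
{
	src->tracing = true;
	return 0;
}

static void etm_pause(struct trace_source *src)
{
	src->tracing = false;
}

static const struct source_ops etm_ops = {
	.resume_perf = etm_resume,
	.pause_perf  = etm_pause,
};
```

In this model a source without callbacks makes `source_resume()` return `-EINVAL` instead of jumping through a NULL pointer. The error code the real API returns is not stated in the cover letter.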

7 months, 2 weeks

[PATCH v5 0/2] coresight: Add Coresight Trace Network On Chip driver

by Yuanfang Zhang

The Trace Network On Chip (TNOC) is an integration hierarchy: a hardware
component that integrates the functionality of TPDA and funnels. It
collects trace from subsystems and transfers it to a CoreSight sink.

In addition to the generic TNOC described above, there is a special
type of TNOC called Interconnect TNOC. Unlike the generic TNOC, the
Interconnect TNOC doesn't need an ATID. Its primary function is to
connect the sources of subsystems to the Aggregator TNOC. Its driver
differs from the one in this series and will be described and
upstreamed separately.

Signed-off-by: Yuanfang Zhang <quic_yuanfang(a)quicinc.com>
---
Changes in v5:
- Update cover letter to describe the Interconnect TNOC.
- Link to v4: https://lore.kernel.org/r/20250415-trace-noc-v4-0-979938fedfd8@quicinc.com

Changes in v4:
- Fix dt_binding warning.
- Update mask of trace_noc amba_id.
- Modify driver comments.
- Rename TRACE_NOC_SYN_VAL to TRACE_NOC_SYNC_INTERVAL.
- Link to v3: https://lore.kernel.org/r/20250411-trace-noc-v3-0-1f19ddf7699b@quicinc.com

Changes in v3:
- Remove unnecessary sysfs nodes.
- Update commit messages.
- Use 'writel' instead of 'write_relaxed' when writing to the register
  for the last time.
- Add trace_id ops.
- Link to v2: https://lore.kernel.org/r/20250226-trace-noc-driver-v2-0-8afc6584afc5@quici…

Changes in v2:
- Modified the format of the DT binding file.
- Fix compile warnings.
- Link to v1: https://lore.kernel.org/r/46643089-b88d-49dc-be05-7bf0bb21f847@quicinc.com

---
Yuanfang Zhang (2):
      dt-bindings: arm: Add device Trace Network On Chip definition
      coresight: add coresight Trace Network On Chip driver

 .../bindings/arm/qcom,coresight-tnoc.yaml     | 111 ++++++++++++
 drivers/hwtracing/coresight/Kconfig           |  13 ++
 drivers/hwtracing/coresight/Makefile          |   1 +
 drivers/hwtracing/coresight/coresight-tnoc.c  | 191 +++++++++++++++++++++
 drivers/hwtracing/coresight/coresight-tnoc.h  |  34 ++++
 5 files changed, 350 insertions(+)
---
base-commit: a2cc6ff5ec8f91bc463fd3b0c26b61166a07eb11
change-id: 20250403-trace-noc-f8286b30408e

Best regards,
--
Yuanfang Zhang <quic_yuanfang(a)quicinc.com>

7 months, 2 weeks

Re: [PATCH v1 1/1] coresight: cti: Replace inclusion by struct fwnode_handle forward declaration

by Suzuki K Poulose

On Mon, 31 Mar 2025 10:14:53 +0300, Andy Shevchenko wrote:
> The fwnode.h is not supposed to be used by the drivers as it
> has the definitions for the core parts for different device
> property provider implementations. Drop it.
>
> Since the code wants to use the pointer to the struct fwnode_handle
> the forward declaration is provided.
>
> [...]

Applied, thanks!

[1/1] coresight: cti: Replace inclusion by struct fwnode_handle forward declaration
      https://git.kernel.org/coresight/c/aad548a9

Best regards,
--
Suzuki K Poulose <suzuki.poulose(a)arm.com>
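
The point of the applied patch, that a forward declaration suffices when code only stores or passes a pointer, can be sketched in plain C. `struct model_cti_device` and `model_has_fwnode()` are hypothetical stand-ins for illustration and are not taken from the CTI driver.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Forward declaration only. The full definition is not needed to
 * declare, store or compare pointers to the type.
 */
struct fwnode_handle;

/* Hypothetical stand-in for a driver structure that stores such a pointer. */
struct model_cti_device {
	struct fwnode_handle *fwnode;
	int id;
};

/* Uses the pointer value only and never dereferences it. */
static int model_has_fwnode(const struct model_cti_device *dev)
{
	assert(dev != NULL);
	return dev->fwnode != NULL;
}
```

Any code that needs to look inside the structure would still need the full definition. The sketch compiles precisely because it never does.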

7 months, 2 weeks

Re: [PATCH v4] perf: Allocate non-contiguous AUX pages by default

by Anshuman Khandual

On 5/7/25 23:43, Yabin Cui wrote:
> perf always allocates contiguous AUX pages based on aux_watermark.
> However, this contiguous allocation doesn't benefit all PMUs. For
> instance, ARM SPE and TRBE operate with virtual pages, and Coresight
> ETR allocates a separate buffer. For these PMUs, allocating contiguous
> AUX pages unnecessarily exacerbates memory fragmentation. This
> fragmentation can prevent their use on long-running devices.
>
> This patch modifies the perf driver to be memory-friendly by default,
> by allocating non-contiguous AUX pages. For PMUs requiring contiguous
> pages (Intel BTS and some Intel PT), the existing
> PERF_PMU_CAP_AUX_NO_SG capability can be used. For PMUs that don't
> require but can benefit from contiguous pages (some Intel PT), a new
> capability, PERF_PMU_CAP_AUX_PREFER_LARGE, is added to maintain their
> existing behavior.
>
> Signed-off-by: Yabin Cui <yabinc(a)google.com>
> ---
> Changes since v3:
> Add comments and a local variable to explain max_order value
> changes in rb_alloc_aux().
>
> Changes since v2:
> Let NO_SG imply PREFER_LARGE. So PMUs don't need to set both flags.
> Then the only place needing PREFER_LARGE is intel/pt.c.
>
> Changes since v1:
> In v1, default is preferring contiguous pages, and add a flag to
> allocate non-contiguous pages. In v2, default is allocating
> non-contiguous pages, and add a flag to prefer contiguous pages.
>
> v1 patchset:
> perf,coresight: Reduce fragmentation with non-contiguous AUX pages for
> cs_etm
>
>  arch/x86/events/intel/pt.c  |  2 ++
>  include/linux/perf_event.h  |  1 +
>  kernel/events/ring_buffer.c | 33 ++++++++++++++++++++++++---------
>  3 files changed, 27 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
> index fa37565f6418..25ead919fc48 100644
> --- a/arch/x86/events/intel/pt.c
> +++ b/arch/x86/events/intel/pt.c
> @@ -1863,6 +1863,8 @@ static __init int pt_init(void)
>
>  	if (!intel_pt_validate_hw_cap(PT_CAP_topa_multiple_entries))
>  		pt_pmu.pmu.capabilities = PERF_PMU_CAP_AUX_NO_SG;
> +	else
> +		pt_pmu.pmu.capabilities = PERF_PMU_CAP_AUX_PREFER_LARGE;
>
>  	pt_pmu.pmu.capabilities |= PERF_PMU_CAP_EXCLUSIVE |
>  				   PERF_PMU_CAP_ITRACE |
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 0069ba6866a4..56d77348c511 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -301,6 +301,7 @@ struct perf_event_pmu_context;
>  #define PERF_PMU_CAP_AUX_OUTPUT			0x0080
>  #define PERF_PMU_CAP_EXTENDED_HW_TYPE		0x0100
>  #define PERF_PMU_CAP_AUX_PAUSE			0x0200
> +#define PERF_PMU_CAP_AUX_PREFER_LARGE		0x0400
>
>  /**
>   * pmu::scope
> diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
> index 5130b119d0ae..69c90ea1b79a 100644
> --- a/kernel/events/ring_buffer.c
> +++ b/kernel/events/ring_buffer.c
> @@ -679,7 +679,19 @@ int rb_alloc_aux(struct perf_buffer *rb, struct perf_event *event,
>  {
>  	bool overwrite = !(flags & RING_BUFFER_WRITABLE);
>  	int node = (event->cpu == -1) ? -1 : cpu_to_node(event->cpu);
> -	int ret = -ENOMEM, max_order;
> +	/*
> +	 * True if the PMU needs a contiguous AUX buffer (CAP_AUX_NO_SG) or
> +	 * prefers large contiguous pages (CAP_AUX_PREFER_LARGE).
> +	 */
> +	bool use_contiguous_pages = event->pmu->capabilities & (
> +		PERF_PMU_CAP_AUX_NO_SG | PERF_PMU_CAP_AUX_PREFER_LARGE);
> +	/*
> +	 * Initialize max_order to 0 for page allocation. This allocates single
> +	 * pages to minimize memory fragmentation. This is overriden if

Small nit typo -- s/overriden/overridden
^^^^

> +	 * use_contiguous_pages is true.
> +	 */
> +	int max_order = 0;
> +	int ret = -ENOMEM;
>
>  	if (!has_aux(event))
>  		return -EOPNOTSUPP;
> @@ -689,8 +701,8 @@ int rb_alloc_aux(struct perf_buffer *rb, struct perf_event *event,
>
>  	if (!overwrite) {
>  		/*
> -		 * Watermark defaults to half the buffer, and so does the
> -		 * max_order, to aid PMU drivers in double buffering.
> +		 * Watermark defaults to half the buffer, to aid PMU drivers
> +		 * in double buffering.
>  		 */
>  		if (!watermark)
>  			watermark = min_t(unsigned long,
> @@ -698,16 +710,19 @@ int rb_alloc_aux(struct perf_buffer *rb, struct perf_event *event,
>  					  (unsigned long)nr_pages << (PAGE_SHIFT - 1));
>
>  		/*
> -		 * Use aux_watermark as the basis for chunking to
> -		 * help PMU drivers honor the watermark.
> +		 * If using contiguous pages, use aux_watermark as the basis
> +		 * for chunking to help PMU drivers honor the watermark.
>  		 */
> -		max_order = get_order(watermark);
> +		if (use_contiguous_pages)
> +			max_order = get_order(watermark);
>  	} else {
>  		/*
> -		 * We need to start with the max_order that fits in nr_pages,
> -		 * not the other way around, hence ilog2() and not get_order.
> +		 * If using contiguous pages, we need to start with the
> +		 * max_order that fits in nr_pages, not the other way around,
> +		 * hence ilog2() and not get_order.
>  		 */
> -		max_order = ilog2(nr_pages);
> +		if (use_contiguous_pages)
> +			max_order = ilog2(nr_pages);
>  		watermark = 0;
>  	}
>

Reviewed-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
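
The order selection in the quoted rb_alloc_aux() change can be modelled in user space to see which `max_order` each case yields. This is a rough sketch, not kernel code: the `MODEL_*` constants, `model_ilog2()`, `model_get_order()` and `model_pick_max_order()` are assumptions made for illustration, a 4 KiB page is assumed, and the `U32_MAX` clamp from the original `min_t()` is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative capability bits and page size; not copied from kernel headers. */
#define MODEL_CAP_AUX_NO_SG		0x0004u
#define MODEL_CAP_AUX_PREFER_LARGE	0x0400u
#define MODEL_PAGE_SHIFT		12

/* floor(log2(n)) for n > 0, standing in for the kernel's ilog2(). */
static int model_ilog2(unsigned long n)
{
	int order = -1;

	assert(n > 0);
	while (n) {
		n >>= 1;
		order++;
	}
	return order;
}

/* Smallest order whose block of pages covers size bytes, like get_order(). */
static int model_get_order(unsigned long size)
{
	unsigned long pages =
		(size + (1UL << MODEL_PAGE_SHIFT) - 1) >> MODEL_PAGE_SHIFT;
	int order = 0;

	while ((1UL << order) < pages)
		order++;
	return order;
}

/* Mirrors the patched logic: order 0 unless the PMU asks for contiguity. */
static int model_pick_max_order(unsigned int caps, bool overwrite,
				unsigned long nr_pages, unsigned long watermark)
{
	bool use_contiguous_pages =
		(caps & (MODEL_CAP_AUX_NO_SG | MODEL_CAP_AUX_PREFER_LARGE)) != 0;

	assert(nr_pages > 0);

	if (!use_contiguous_pages)
		return 0;
	if (overwrite)
		return model_ilog2(nr_pages);
	/* Watermark defaults to half the buffer, expressed in bytes. */
	if (!watermark)
		watermark = nr_pages << (MODEL_PAGE_SHIFT - 1);
	return model_get_order(watermark);
}
```

For a 16-page buffer with no explicit watermark, a PMU with neither flag gets order 0 (single pages). A PMU with the prefer-large flag gets order 3 in non-overwrite mode, because half the buffer is 8 pages, and order 4 in overwrite mode.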

7 months, 2 weeks

Re: [PATCH v4] perf: Allocate non-contiguous AUX pages by default

by James Clark

On 07/05/2025 7:13 pm, Yabin Cui wrote:
> perf always allocates contiguous AUX pages based on aux_watermark.
> However, this contiguous allocation doesn't benefit all PMUs. For
> instance, ARM SPE and TRBE operate with virtual pages, and Coresight
> ETR allocates a separate buffer. For these PMUs, allocating contiguous
> AUX pages unnecessarily exacerbates memory fragmentation. This
> fragmentation can prevent their use on long-running devices.
>
> This patch modifies the perf driver to be memory-friendly by default,
> by allocating non-contiguous AUX pages. For PMUs requiring contiguous
> pages (Intel BTS and some Intel PT), the existing
> PERF_PMU_CAP_AUX_NO_SG capability can be used. For PMUs that don't
> require but can benefit from contiguous pages (some Intel PT), a new
> capability, PERF_PMU_CAP_AUX_PREFER_LARGE, is added to maintain their
> existing behavior.
>
> Signed-off-by: Yabin Cui <yabinc(a)google.com>
> ---
> Changes since v3:
> Add comments and a local variable to explain max_order value
> changes in rb_alloc_aux().
>
> Changes since v2:
> Let NO_SG imply PREFER_LARGE. So PMUs don't need to set both flags.
> Then the only place needing PREFER_LARGE is intel/pt.c.
>
> Changes since v1:
> In v1, default is preferring contiguous pages, and add a flag to
> allocate non-contiguous pages. In v2, default is allocating
> non-contiguous pages, and add a flag to prefer contiguous pages.
>
> v1 patchset:
> perf,coresight: Reduce fragmentation with non-contiguous AUX pages for
> cs_etm
>
>  arch/x86/events/intel/pt.c  |  2 ++
>  include/linux/perf_event.h  |  1 +
>  kernel/events/ring_buffer.c | 33 ++++++++++++++++++++++++---------
>  3 files changed, 27 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
> index fa37565f6418..25ead919fc48 100644
> --- a/arch/x86/events/intel/pt.c
> +++ b/arch/x86/events/intel/pt.c
> @@ -1863,6 +1863,8 @@ static __init int pt_init(void)
>
>  	if (!intel_pt_validate_hw_cap(PT_CAP_topa_multiple_entries))
>  		pt_pmu.pmu.capabilities = PERF_PMU_CAP_AUX_NO_SG;
> +	else
> +		pt_pmu.pmu.capabilities = PERF_PMU_CAP_AUX_PREFER_LARGE;
>
>  	pt_pmu.pmu.capabilities |= PERF_PMU_CAP_EXCLUSIVE |
>  				   PERF_PMU_CAP_ITRACE |
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 0069ba6866a4..56d77348c511 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -301,6 +301,7 @@ struct perf_event_pmu_context;
>  #define PERF_PMU_CAP_AUX_OUTPUT			0x0080
>  #define PERF_PMU_CAP_EXTENDED_HW_TYPE		0x0100
>  #define PERF_PMU_CAP_AUX_PAUSE			0x0200
> +#define PERF_PMU_CAP_AUX_PREFER_LARGE		0x0400
>
>  /**
>   * pmu::scope
> diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
> index 5130b119d0ae..69c90ea1b79a 100644
> --- a/kernel/events/ring_buffer.c
> +++ b/kernel/events/ring_buffer.c
> @@ -679,7 +679,19 @@ int rb_alloc_aux(struct perf_buffer *rb, struct perf_event *event,
>  {
>  	bool overwrite = !(flags & RING_BUFFER_WRITABLE);
>  	int node = (event->cpu == -1) ? -1 : cpu_to_node(event->cpu);
> -	int ret = -ENOMEM, max_order;
> +	/*
> +	 * True if the PMU needs a contiguous AUX buffer (CAP_AUX_NO_SG) or
> +	 * prefers large contiguous pages (CAP_AUX_PREFER_LARGE).
> +	 */
> +	bool use_contiguous_pages = event->pmu->capabilities & (
> +		PERF_PMU_CAP_AUX_NO_SG | PERF_PMU_CAP_AUX_PREFER_LARGE);

Reviewed-by: James Clark <james.clark(a)linaro.org>

Minor nit: this comment is a bit verbose IMO, and it's only describing
what rather than why. But the other one is ok.

> +	/*
> +	 * Initialize max_order to 0 for page allocation. This allocates single
> +	 * pages to minimize memory fragmentation. This is overriden if
> +	 * use_contiguous_pages is true.
> +	 */
> +	int max_order = 0;
> +	int ret = -ENOMEM;
>
>  	if (!has_aux(event))
>  		return -EOPNOTSUPP;
> @@ -689,8 +701,8 @@ int rb_alloc_aux(struct perf_buffer *rb, struct perf_event *event,
>
>  	if (!overwrite) {
>  		/*
> -		 * Watermark defaults to half the buffer, and so does the
> -		 * max_order, to aid PMU drivers in double buffering.
> +		 * Watermark defaults to half the buffer, to aid PMU drivers
> +		 * in double buffering.
>  		 */
>  		if (!watermark)
>  			watermark = min_t(unsigned long,
> @@ -698,16 +710,19 @@ int rb_alloc_aux(struct perf_buffer *rb, struct perf_event *event,
>  					  (unsigned long)nr_pages << (PAGE_SHIFT - 1));
>
>  		/*
> -		 * Use aux_watermark as the basis for chunking to
> -		 * help PMU drivers honor the watermark.
> +		 * If using contiguous pages, use aux_watermark as the basis
> +		 * for chunking to help PMU drivers honor the watermark.
>  		 */
> -		max_order = get_order(watermark);
> +		if (use_contiguous_pages)
> +			max_order = get_order(watermark);
>  	} else {
>  		/*
> -		 * We need to start with the max_order that fits in nr_pages,
> -		 * not the other way around, hence ilog2() and not get_order.
> +		 * If using contiguous pages, we need to start with the
> +		 * max_order that fits in nr_pages, not the other way around,
> +		 * hence ilog2() and not get_order.
>  		 */
> -		max_order = ilog2(nr_pages);
> +		if (use_contiguous_pages)
> +			max_order = ilog2(nr_pages);
>  		watermark = 0;
>  	}
>

7 months, 3 weeks

[PATCH v3 0/2] coresight: Add Coresight Trace Network On Chip driver

by Yuanfang Zhang

The Trace Network On Chip (TNOC) is an integration hierarchy: a hardware
component that integrates the functionality of TPDA and funnels. It
collects trace from subsystems and transfers it to a CoreSight sink.

Signed-off-by: Yuanfang Zhang <quic_yuanfang(a)quicinc.com>
---
Changes in v3:
- Remove unnecessary sysfs nodes.
- Update commit messages.
- Use 'writel' instead of 'write_relaxed' when writing to the register
  for the last time.
- Add trace_id ops.
- Link to v2: https://lore.kernel.org/r/20250226-trace-noc-driver-v2-0-8afc6584afc5@quici…

Changes in v2:
- Modified the format of the DT binding file.
- Fix compile warnings.
- Link to v1: https://lore.kernel.org/r/46643089-b88d-49dc-be05-7bf0bb21f847@quicinc.com

---
Yuanfang Zhang (2):
      dt-bindings: arm: Add device Trace Network On Chip definition
      coresight: add coresight Trace Network On Chip driver

 .../bindings/arm/qcom,coresight-tnoc.yaml     | 111 ++++++++++++
 drivers/hwtracing/coresight/Kconfig           |  13 ++
 drivers/hwtracing/coresight/Makefile          |   1 +
 drivers/hwtracing/coresight/coresight-tnoc.c  | 186 +++++++++++++++++++++
 drivers/hwtracing/coresight/coresight-tnoc.h  |  34 ++++
 5 files changed, 345 insertions(+)
---
base-commit: a2cc6ff5ec8f91bc463fd3b0c26b61166a07eb11
change-id: 20250403-trace-noc-f8286b30408e

Best regards,
--
Yuanfang Zhang <quic_yuanfang(a)quicinc.com>

7 months, 3 weeks

[PATCH RESEND] coresight: etr: Use noncontiguous api instead of noncoherent

by Mao Jinlong

From: Shilpa Suresh <quic_c_sbsure(a)quicinc.com>

IOMMU support for noncoherent allocations was removed by commit
81d88ce55092edf1a1f928efb373f289c6b90efd ("dma-mapping: remove the
{alloc,free}_noncoherent methods"). Use dma_alloc_noncontiguous() for
the ETR flat buffer allocation instead.

Signed-off-by: Shilpa Suresh <quic_c_sbsure(a)quicinc.com>
Signed-off-by: Mao Jinlong <quic_jinlmao(a)quicinc.com>
---
 .../hwtracing/coresight/coresight-tmc-etr.c   | 47 ++++++++++++++-----
 1 file changed, 34 insertions(+), 13 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-tmc-etr.c b/drivers/hwtracing/coresight/coresight-tmc-etr.c
index 76a8cb29b68a..3b204f3bd45b 100644
--- a/drivers/hwtracing/coresight/coresight-tmc-etr.c
+++ b/drivers/hwtracing/coresight/coresight-tmc-etr.c
@@ -24,6 +24,7 @@ struct etr_flat_buf {
 	dma_addr_t	daddr;
 	void		*vaddr;
 	size_t		size;
+	struct sg_table *sgt;
 };

 struct etr_buf_hw {
@@ -616,15 +617,24 @@ static int tmc_etr_alloc_flat_buf(struct tmc_drvdata *drvdata,
 	if (!flat_buf)
 		return -ENOMEM;

-	flat_buf->vaddr = dma_alloc_noncoherent(real_dev, etr_buf->size,
-						&flat_buf->daddr,
-						DMA_FROM_DEVICE,
-						GFP_KERNEL | __GFP_NOWARN);
-	if (!flat_buf->vaddr) {
+	flat_buf->sgt = dma_alloc_noncontiguous(real_dev, etr_buf->size,
+						DMA_FROM_DEVICE, GFP_KERNEL, 0);
+	if (!flat_buf->sgt) {
 		kfree(flat_buf);
 		return -ENOMEM;
 	}

+	flat_buf->daddr = sg_dma_address(flat_buf->sgt->sgl);
+	flat_buf->vaddr = dma_vmap_noncontiguous(real_dev, etr_buf->size,
+						 flat_buf->sgt);
+	if (!flat_buf->vaddr) {
+		dma_free_noncontiguous(real_dev, etr_buf->size,
+				       flat_buf->sgt,
+				       DMA_FROM_DEVICE);
+		flat_buf->sgt = NULL;
+		return -ENOMEM;
+	}
+
 	flat_buf->size = etr_buf->size;
 	flat_buf->dev = &drvdata->csdev->dev;
 	etr_buf->hwaddr = flat_buf->daddr;
@@ -640,9 +650,12 @@ static void tmc_etr_free_flat_buf(struct etr_buf *etr_buf)
 	if (flat_buf && flat_buf->daddr) {
 		struct device *real_dev = flat_buf->dev->parent;

-		dma_free_noncoherent(real_dev, etr_buf->size,
-				     flat_buf->vaddr, flat_buf->daddr,
+		dma_vunmap_noncontiguous(real_dev, flat_buf->vaddr);
+		dma_free_noncontiguous(real_dev, etr_buf->size,
+				     flat_buf->sgt,
 				     DMA_FROM_DEVICE);
+		flat_buf->vaddr = NULL;
+		flat_buf->sgt = NULL;
 	}
 	kfree(flat_buf);
 }
@@ -651,6 +664,9 @@ static void tmc_etr_sync_flat_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp)
 {
 	struct etr_flat_buf *flat_buf = etr_buf->private;
 	struct device *real_dev = flat_buf->dev->parent;
+	s64 buf_len;
+	int i;
+	struct scatterlist *sg;

 	/*
 	 * Adjust the buffer to point to the beginning of the trace data
@@ -668,12 +684,17 @@ static void tmc_etr_sync_flat_buf(struct etr_buf *etr_buf, u64 rrp, u64 rwp)
 	 * is full. Sync the entire buffer in one go for this case.
 	 */
 	if (etr_buf->offset + etr_buf->len > etr_buf->size)
-		dma_sync_single_for_cpu(real_dev, flat_buf->daddr,
-					etr_buf->size, DMA_FROM_DEVICE);
-	else
-		dma_sync_single_for_cpu(real_dev,
-					flat_buf->daddr + etr_buf->offset,
-					etr_buf->len, DMA_FROM_DEVICE);
+		dma_sync_sgtable_for_cpu(real_dev, flat_buf->sgt,
+					 DMA_FROM_DEVICE);
+	else {
+		buf_len = etr_buf->len;
+		for_each_sg(flat_buf->sgt->sgl, sg, flat_buf->sgt->orig_nents, i) {
+			dma_sync_sg_for_cpu(real_dev, sg, 1, DMA_FROM_DEVICE);
+			buf_len -= sg->length;
+			if (buf_len <= 0)
+				break;
+		}
+	}
 }

 static ssize_t tmc_etr_get_data_flat_buf(struct etr_buf *etr_buf,
--
2.25.1
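
The partial-sync loop in the `else` branch of tmc_etr_sync_flat_buf() walks the scatterlist from its first entry and stops once the requested length has been covered. A small user-space model of that walk follows, assuming plain segment lengths in an array in place of a real `struct scatterlist`. `model_segments_to_sync()` is a hypothetical helper, and the comment marks where the patch calls `dma_sync_sg_for_cpu()`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns how many leading segments the loop visits before the
 * remaining length drops to zero or below. Like the loop in the
 * patch, a segment is processed before the length check.
 */
static size_t model_segments_to_sync(const size_t *seg_len, size_t nsegs,
				     long long len)
{
	size_t i;

	assert(seg_len != NULL && nsegs > 0);
	for (i = 0; i < nsegs; i++) {
		/* The patch calls dma_sync_sg_for_cpu() on segment i here. */
		len -= (long long)seg_len[i];
		if (len <= 0)
			return i + 1;
	}
	return nsegs;
}
```

With segments of 4096, 4096 and 8192 bytes, a length of 5000 visits two segments, while 20000 exceeds the total and visits all three.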

7 months, 3 weeks

[PATCH v2 0/9] coresight: Fix and improve clock usage

by Leo Yan

This series fixes and improves clock usage in the Arm CoreSight drivers.

Based on the DT binding documents, the trace clock (atclk) is defined in
some CoreSight modules, but support is absent. In most cases, the issue
is hidden because the atclk clock is shared by multiple CoreSight
modules and the clock is enabled anyway by other drivers. The first
three patches address this issue.

The programming clock (pclk) management in CoreSight drivers does not
use the devm_XXX() variant APIs, so the drivers need to manually disable
and release clocks on errors and on normal module exit. However, the
drivers fail to disable clocks during module exit. The atclk may also
not be disabled in CoreSight drivers during module exit. By using devm
APIs, patches 04 and 05 fix the clock disabling issues.

Another issue is that pclk might be enabled twice in the init phase:
once by the AMBA bus driver, and again by CoreSight drivers. This is
fixed in patch 06.

Patches 07 to 09 refactor the clock related code. Patch 07 consolidates
the clock initialization into a central place. Patch 08 makes the clock
enabling sequence consistent. Patch 09 removes redundant condition
checks and adds error handling in runtime PM.

This series is verified on the Arm64 Hikey960 platform.

Changes from v1:
- Moved the coresight_get_enable_clocks() function into the CoreSight
  core layer (James).
- Added comments for clock naming "apb_pclk" and "apb" (James).
- Re-ordered patches for easier understanding (Anshuman).
- Minor improvement for commit log in patch 01 (Anshuman).

Leo Yan (9):
  coresight: tmc: Support atclk
  coresight: catu: Support atclk
  coresight: etm4x: Support atclk
  coresight: Disable programming clock properly
  coresight: Disable trace bus clock properly
  coresight: Avoid enable programming clock duplicately
  coresight: Consolidate clock enabling
  coresight: Make clock sequence consistent
  coresight: Refactor runtime PM

 drivers/hwtracing/coresight/coresight-catu.c       | 53 ++++++++++++++++-----------------
 drivers/hwtracing/coresight/coresight-catu.h       |  1 +
 drivers/hwtracing/coresight/coresight-core.c       | 45 ++++++++++++++++++++++++++++
 drivers/hwtracing/coresight/coresight-cpu-debug.c  | 41 +++++++++-----------------
 drivers/hwtracing/coresight/coresight-ctcu-core.c  | 24 +++++----------
 drivers/hwtracing/coresight/coresight-etb10.c      | 18 ++++--------
 drivers/hwtracing/coresight/coresight-etm3x-core.c | 17 ++++-------
 drivers/hwtracing/coresight/coresight-etm4x-core.c | 32 ++++++++++----------
 drivers/hwtracing/coresight/coresight-etm4x.h      |  4 ++-
 drivers/hwtracing/coresight/coresight-funnel.c     | 66 +++++++++++++++---------------------------
 drivers/hwtracing/coresight/coresight-replicator.c | 63 ++++++++++++++--------------------------
 drivers/hwtracing/coresight/coresight-stm.c        | 34 +++++++++-------------
 drivers/hwtracing/coresight/coresight-tmc-core.c   | 48 +++++++++++++++---------------
 drivers/hwtracing/coresight/coresight-tmc.h        |  2 ++
 drivers/hwtracing/coresight/coresight-tpiu.c       | 36 ++++++++++-------------
 include/linux/coresight.h                          | 30 ++-----------------
 16 files changed, 225 insertions(+), 289 deletions(-)

--
2.34.1
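
Why device-managed (devm) clock handling avoids the missed-disable problem can be shown with a toy model: each enable is recorded against the device, and a single release step undoes all of them in reverse order. Everything here is hypothetical, including `struct model_clk`, `struct model_dev`, `model_devm_clk_enable()` and `model_dev_release()`. The kernel does provide devm clock helpers such as `devm_clk_get_enabled()`, but the cover letter does not say which calls the series uses.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_MANAGED 8

/* Toy clock with a reference count, standing in for struct clk. */
struct model_clk {
	const char *name;
	int enable_count;
};

/* Toy device that remembers which clocks it enabled. */
struct model_dev {
	struct model_clk *managed[MODEL_MAX_MANAGED];
	size_t nr_managed;
};

/* Enable a clock and register it for automatic disable on release. */
static int model_devm_clk_enable(struct model_dev *dev, struct model_clk *clk)
{
	assert(dev != NULL && clk != NULL);
	if (dev->nr_managed == MODEL_MAX_MANAGED)
		return -ENOMEM;
	clk->enable_count++;
	dev->managed[dev->nr_managed++] = clk;
	return 0;
}

/* Called once on probe failure or driver removal: undo in reverse order. */
static void model_dev_release(struct model_dev *dev)
{
	assert(dev != NULL);
	while (dev->nr_managed > 0)
		dev->managed[--dev->nr_managed]->enable_count--;
}
```

The driver's error and exit paths then need no per-clock cleanup code, which is the class of omission the cover letter describes for module exit.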

7 months, 3 weeks

[PATCH v2] coresight: Disable MMIO logging for coresight stm driver

by Mao Jinlong

With MMIO logging enabled, MMIO accesses are traced and can be sent to
an STM device. An MMIO access made by the STM driver itself can
therefore create a circular call chain with MMIO logging. Disable MMIO
logging for the STM driver.

[] stm_source_write[stm_core]+0xc4
[] stm_ftrace_write[stm_ftrace]+0x40
[] trace_event_buffer_commit+0x238
[] trace_event_raw_event_rwmmio_rw_template+0x8c
[] log_post_write_mmio+0xb4
[] writel_relaxed[coresight_stm]+0x80
[] stm_generic_packet[coresight_stm]+0x1a8
[] stm_data_write[stm_core]+0x78
[] stm_source_write[stm_core]+0x7c
[] stm_ftrace_write[stm_ftrace]+0x40
[] trace_event_buffer_commit+0x238
[] trace_event_raw_event_rwmmio_read+0x84
[] log_read_mmio+0xac
[] readl_relaxed[coresight_tmc]+0x50

Signed-off-by: Mao Jinlong <quic_jinlmao(a)quicinc.com>
Reviewed-by: Leo Yan <leo.yan(a)arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
---
Changes in V2:
update the commit message

 drivers/hwtracing/coresight/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/hwtracing/coresight/Makefile b/drivers/hwtracing/coresight/Makefile
index 4ba478211b31..f3158266f75e 100644
--- a/drivers/hwtracing/coresight/Makefile
+++ b/drivers/hwtracing/coresight/Makefile
@@ -22,6 +22,8 @@ condflags := \
 	$(call cc-option, -Wstringop-truncation)
 subdir-ccflags-y += $(condflags)

+CFLAGS_coresight-stm.o := -D__DISABLE_TRACE_MMIO__
+
 obj-$(CONFIG_CORESIGHT) += coresight.o
 coresight-y := coresight-core.o coresight-etm-perf.o coresight-platform.o \
 		coresight-sysfs.o coresight-syscfg.o coresight-config.o \
--
2.25.1

7 months, 3 weeks

[PATCH] coresight: replicator: Fix panic for clearing claim tag

by Leo Yan

On platforms with a static replicator, a kernel panic occurs during boot:

[    4.999406]  replicator_probe+0x1f8/0x360
[    5.003455]  replicator_platform_probe+0x64/0xd8
[    5.008115]  platform_probe+0x70/0xf0
[    5.011812]  really_probe+0xc4/0x2a8
[    5.015417]  __driver_probe_device+0x80/0x140
[    5.019813]  driver_probe_device+0xe4/0x170
[    5.024032]  __driver_attach+0x9c/0x1b0
[    5.027900]  bus_for_each_dev+0x7c/0xe8
[    5.031769]  driver_attach+0x2c/0x40
[    5.035373]  bus_add_driver+0xec/0x218
[    5.039154]  driver_register+0x68/0x138
[    5.043023]  __platform_driver_register+0x2c/0x40
[    5.047771]  coresight_init_driver+0x4c/0xe0
[    5.052079]  replicator_init+0x30/0x48
[    5.055865]  do_one_initcall+0x4c/0x280
[    5.059736]  kernel_init_freeable+0x1ec/0x3c8
[    5.064134]  kernel_init+0x28/0x1f0
[    5.067655]  ret_from_fork+0x10/0x20

A static replicator doesn't have registers, so accessing the claim
register results in a NULL pointer dereference. Fix the issue by
accessing the claim registers only after the I/O resource has been
successfully mapped.

Fixes: 6f4c6f70575f ("coresight: Clear self hosted claim tag on probe")
Signed-off-by: Leo Yan <leo.yan(a)arm.com>
---
 drivers/hwtracing/coresight/coresight-replicator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hwtracing/coresight/coresight-replicator.c b/drivers/hwtracing/coresight/coresight-replicator.c
index f1d2f764e898..06efd2b01a0f 100644
--- a/drivers/hwtracing/coresight/coresight-replicator.c
+++ b/drivers/hwtracing/coresight/coresight-replicator.c
@@ -262,6 +262,7 @@ static int replicator_probe(struct device *dev, struct resource *res)
 		drvdata->base = base;
 		desc.groups = replicator_groups;
 		desc.access = CSDEV_ACCESS_IOMEM(base);
+		coresight_clear_self_claim_tag(&desc.access);
 	}

 	if (fwnode_property_present(dev_fwnode(dev),
@@ -284,7 +285,6 @@ static int replicator_probe(struct device *dev, struct resource *res)
 	desc.pdata = dev->platform_data;
 	desc.dev = dev;

-	coresight_clear_self_claim_tag(&desc.access);
 	drvdata->csdev = coresight_register(&desc);
 	if (IS_ERR(drvdata->csdev)) {
 		ret = PTR_ERR(drvdata->csdev);
--
2.34.1
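
The shape of the fix, touching the claim register only on the path where registers were actually mapped, can be modelled in user space. This is a sketch under assumptions: `struct model_access`, `MODEL_SELF_CLAIM_BIT`, `model_clear_self_claim_tag()` and `model_replicator_probe()` are hypothetical, and a plain variable stands in for the memory-mapped claim register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit position for the self-hosted claim tag in this model. */
#define MODEL_SELF_CLAIM_BIT 0x2u

/* Hypothetical stand-in for the device access descriptor. */
struct model_access {
	volatile uint32_t *claim_reg;	/* NULL when there are no registers */
};

/* Dereferences the register pointer, so it must only run when mapped. */
static void model_clear_self_claim_tag(struct model_access *access)
{
	assert(access != NULL && access->claim_reg != NULL);
	*access->claim_reg &= ~MODEL_SELF_CLAIM_BIT;
}

/*
 * claim_reg is NULL for a static replicator (no registers) and non-NULL
 * for a dynamic one. The clear happens inside the mapped branch only.
 */
static int model_replicator_probe(volatile uint32_t *claim_reg)
{
	struct model_access access = { NULL };

	if (claim_reg) {
		access.claim_reg = claim_reg;
		model_clear_self_claim_tag(&access);
	}
	/* Registration of the device would follow here in a real driver. */
	return 0;
}
```

Before the fix, the equivalent of `model_clear_self_claim_tag()` ran unconditionally after the branch, which in this model would trip the assertion for the register-less case.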

7 months, 3 weeks
