Hi,
On Wed, 9 Feb 2022 at 10:57, Mao Jinlong <quic_jinlmao(a)quicinc.com> wrote:
>
> Except from STM and ETM/ETE, there could be other sources. Each
> source needs a unique trace id. Define a bitmap for the trace ids.
> The position of each bit represents trace id of the source.
>
> Signed-off-by: Mao Jinlong <quic_jinlmao(a)quicinc.com>
> ---
> drivers/hwtracing/coresight/coresight-core.c | 48 ++++++++++++++++++++
> include/linux/coresight-pmu.h | 11 +++++
> 2 files changed, 59 insertions(+)
>
> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
> index a90097f88425..6cb55c3f41d5 100644
> --- a/drivers/hwtracing/coresight/coresight-core.c
> +++ b/drivers/hwtracing/coresight/coresight-core.c
> @@ -16,6 +16,7 @@
> #include <linux/mutex.h>
> #include <linux/clk.h>
> #include <linux/coresight.h>
> +#include <linux/coresight-pmu.h>
see my comment below about using coresigh-priv.h
> #include <linux/of_platform.h>
> #include <linux/delay.h>
> #include <linux/pm_runtime.h>
> @@ -25,8 +26,11 @@
> #include "coresight-syscfg.h"
>
> static DEFINE_MUTEX(coresight_mutex);
> +static DEFINE_MUTEX(coresight_id_mutex);
> static DEFINE_PER_CPU(struct coresight_device *, csdev_sink);
>
> +static DECLARE_BITMAP(coresight_trace_id, CORESIGHT_TRACE_ID_NUM);
> +
> /*
> * Use IDR to map the hash length of the source's device name
> * to the pointer of path for the source
> @@ -51,6 +55,48 @@ struct coresight_node {
> const u32 coresight_barrier_pkt[4] = {0x7fffffff, 0x7fffffff, 0x7fffffff, 0x7fffffff};
> EXPORT_SYMBOL_GPL(coresight_barrier_pkt);
>
> +/* Init the coresight_trace_id bit map. */
> +static void coresight_init_trace_id(void)
> +{
> + int i;
> +
> + /* Trace id 0 is invalid. */
> + set_bit(CORESIGHT_TRACE_ID_0, coresight_trace_id);
> + /* Trace id 1 is fixed for STM. */
> + set_bit(CORESIGHT_TRACE_ID_1, coresight_trace_id);
> + /* Trace id from 112 to 127 are reserved. */
> + for (i = CORESIGHT_TRACE_ID_112; i <= CORESIGHT_TRACE_ID_127; i++)
> + set_bit(i, coresight_trace_id);
> + /* Skip the trace ids of ETM/ETE. */
> + for (i = 0; i <= cpumask_last(cpu_possible_mask); i++)
> + set_bit(coresight_get_trace_id(i), coresight_trace_id);
> +
> +}
> +
> +/*
> + * Return the first zero bit position of bitmap coresight_trace_id
> + * as source's trace id.
> + *
> + */
> +int coresight_get_system_trace_id(void)
> +{
> + int id;
> +
> + mutex_lock(&coresight_id_mutex);
> + id = find_first_zero_bit(coresight_trace_id, CORESIGHT_TRACE_ID_NUM);
> + /* If no zero bit is found, return error value. */
> + if (id == CORESIGHT_TRACE_ID_NUM) {
> + mutex_unlock(&coresight_id_mutex);
> + return -EINVAL;
> + }
> +
> + set_bit(id, coresight_trace_id);
> + mutex_unlock(&coresight_id_mutex);
> +
> + return id;
> +}
> +EXPORT_SYMBOL(coresight_get_system_trace_id);
> +
> static const struct cti_assoc_op *cti_assoc_ops;
>
> void coresight_set_cti_ops(const struct cti_assoc_op *cti_op)
> @@ -1750,6 +1796,8 @@ static int __init coresight_init(void)
> return 0;
>
> etm_perf_exit();
> +
> + coresight_init_trace_id();
> exit_bus_unregister:
> bus_unregister(&coresight_bustype);
> return ret;
> diff --git a/include/linux/coresight-pmu.h b/include/linux/coresight-pmu.h
> index 4ac5c081af93..1e2c5ca4c6e6 100644
> --- a/include/linux/coresight-pmu.h
> +++ b/include/linux/coresight-pmu.h
> @@ -32,6 +32,14 @@
> #define ETM4_CFG_BIT_RETSTK 12
> #define ETM4_CFG_BIT_VMID_OPT 15
>
The following additional defines and function should appear in coresight-priv.h
The coresight-pmu.h file contains data for the interface between the
drivers and perf.
> +/* Coresight component supports 7 bits trace id. */
additional comment here to explain that 0, 0x70- 0x7F IDs are reserved
by the architecture, 1 is default for STM
> +#define CORESIGHT_TRACE_ID_NUM 128
> +
> +#define CORESIGHT_TRACE_ID_0 0
> +#define CORESIGHT_TRACE_ID_1 1
> +#define CORESIGHT_TRACE_ID_112 112
> +#define CORESIGHT_TRACE_ID_127 127
> +
can we have these names a little more descriptive -
e.g.
CORESIGHT_TRACE_ID_0_RES 0
CORESIGHT_TRACE_ID_STM 1
CORESIGHT_TRACE_ID_RANGE_LO_RES 0x70
CORESIGHT_TRACE_ID_RANGE_HI_RES 0x7F
Additionally - now we are declaring a #define for the STM ID - it
would be better to use that in coresight-stm.c stm_init_default_data()
to set the trace id for the STM.
Regards
Mike
> static inline int coresight_get_trace_id(int cpu)
> {
> /*
> @@ -43,4 +51,7 @@ static inline int coresight_get_trace_id(int cpu)
> return (CORESIGHT_ETM_PMU_SEED + (cpu * 2));
> }
>
> +/* Get the trace id for the sources except from STM, ETM/ETE. */
> +extern int coresight_get_system_trace_id(void);
> +
> #endif
> --
> 2.17.1
>
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
Hi,
On Fri, 11 Feb 2022 at 04:17, Jinlong Mao <quic_jinlmao(a)quicinc.com> wrote:
>
>
> On 2/10/2022 6:30 PM, Mike Leach wrote:
> > Hello Mao,
> >
> > I have looked through this set and have a few general questions.
> >
> > My understanding based on the information in the code is that the TPDM
> > devices will always send data to the TPDA device, the TPDM is not
> > capable of directly driving the ATB itself?
> > The TPDA device will then packetize the inputs and output these to
> > the ATB over the normal CoreSight architecture to a standard ETR / ETF
> > for collection.
> >
> > Looking at the TPDM driver - it is assigned a trace ID but never
> > actually uses it in the hardware. My assumption here is that this is
> > used purely to satisfy the requirement that the CoreSight core has
> > that all sources have a unique trace id?
> >
> > For the TPDA driver you assign an ATID as an attribute in device tree,
> > and then program this into the devices control register.
> >
> > The trace IDs in ETM / ETE / STM, are programmed into the hardware and
> > these values drive the ATID value on the trace bus. So assigning an
> > ATID value to the TPDA driver through device tree will lead to clashes
> > with the assignment of trace IDs in the other driver software.
> >
> > The topology here appears to me that you have multiple "data source"
> > devices TPDM, supplying a TPDA - which is the real CoreSight "trace
> > source" from the viewpoint of the trace bus and CoreSight
> > infrastructure.
> > To get this to work in the current CoreSight driver stack, you have
> > assigned the TPDM as a source type, and the TPDA as a link to ensure
> > that when a TPDM is started, all the components on the path to the
> > sink are activated.
> > This is fine.
> >
> > If my assumptions above are all accurate I suggest the following improvements
> >
> > For TPDA drop the device tree assignment of ATID and instead use the
> > coresight_get_system_trace_id() function you introduce in the 2nd
> > patch in this set.
> >
> > For TPDM you have assigned a unique source sub-type
> > CORESIGHT_DEV_SUBTYPE_SOURCE_SYS.- this could become
> > CORESIGHT_DEV_SUBTYPE_SOURCE_DATA_ONLY. If the trace ID assigned to
> > this device is only to satisfy the unique ID requirement and is not
> > used elsewhere, then the sub type could become
> > CORESIGHT_DEV_SUBTYPE_SOURCE_DATA_ONLY. We can agree that this sub
> > type does not need a unique ID and acts as none ATB a source for
> > another component, The core code can be altered to drop the
> > requirement for this sub-type and trace ID can be dropped for this
> > component.
> >
> > You should be aware that we are in the process of re-designed how
> > trace IDs are allocated. The current mechanism does not scale for
> > large multi-core systems (currently broken for any system > 46 cores),
> > and as you have discovered there is a need for additional allocation
> > of IDs. Also the ETE / TRBE combination does not need a trace ID. A
> > dynamic allocation system is being proposed.
> >
> > Regards
> >
> > Mike
>
>
> Hi Mike,
>
> Your assumptions above are all correct.
> TPDMs connect to the same TPDA will share the atid of the TPDA.
> We have a PC tool to parse the TPDM trace data. It needs the fixed atid
> for each TPDA to identify the data.
> So we configure the atid for TPDA in device tree with fixed ids.
> I will discuss with internal tool team to see if TPDA's id can become
> dynamic when parse the data.
>
I understand that it is essential to know the ID when extracting the
trace from a sink with a coresight frame fomatter.
However, tools such as perf, achieve this by saving the metadata for
the session. If you are programming the trace ID, then it should be
possible to read back the same register to get the required
information for your trace session.
If this is genuinely not possible for your system, then treat as a
fixed ID as I describe below.
> Apart from the TPDA's atid, we also have some other sources with fixed
> id in HW on our internal device.
> Do you have any suggestion to how to allocate the IDs for the source
> with fixed id in HW ?
>
>
Fixed IDs in hardware - assuming that these devices do write directly
onto the trace bus using the ATID, should be accommodated by reserving
the fixed ID in any scheme used to allocate IDs,
In your case this would be to set a bit at the correct area in the bitfield.
As mentioned we are working on an updated system to dynamically
allocate IDs for CoreSight sources - a reservation scheme for fixed ID
devices will be provided
.
Regards
Mike
>< Thanks
> Jinlong Mao
>
>
> >
> >
> > On Wed, 9 Feb 2022 at 10:57, Mao Jinlong <quic_jinlmao(a)quicinc.com> wrote:
> >> This series adds support for the trace performance monitoring and
> >> diagnostics hardware (TPDM and TPDA). It is composed of two major
> >> elements.
> >> a) Changes for original coresight framework to support for TPDM and TPDA.
> >> b) Add driver code for TPDM and TPDA.
> >>
> >> Introduction of changes for original coresight framework
> >> Support TPDM as new coresight source.
> >> Since only STM and ETM are supported as coresight source originally.
> >> TPDM is a newly added coresight source. We need to change
> >> the original way of saving coresight path to support more types source
> >> for coresight driver.
> >> The following patch is to add support more coresight sources.
> >> Use IDR to maintain all the enabled sources' paths.
> >> coresight: Use bitmap to assign trace id to the sources
> >>
> >> Introduction of TPDM and TPDA
> >> TPDM - The trace performance monitoring and diagnostics monitor or TPDM in
> >> short serves as data collection component for various dataset types
> >> specified in the QPMDA(Qualcomm performance monitoring and diagnostics
> >> architecture) spec. The primary use case of the TPDM is to collect data
> >> from different data sources and send it to a TPDA for packetization,
> >> timestamping and funneling.
> >> Coresight: Add coresight TPDM source driver
> >> dt-bindings: arm: Adds CoreSight TPDM hardware definitions
> >> coresight-tpdm: Add DSB dataset support
> >> coresight-tpdm: Add integration test support
> >> docs: sysfs: coresight: Add sysfs ABI documentation for TPDM
> >>
> >> TPDA - The trace performance monitoring and diagnostics aggregator or
> >> TPDA in short serves as an arbitration and packetization engine for the
> >> performance monitoring and diagnostics network as specified in the QPMDA
> >> (Qualcomm performance monitoring and diagnostics architecture)
> >> specification. The primary use case of the TPDA is to provide
> >> packetization, funneling and timestamping of Monitor data as specified
> >> in the QPMDA specification.
> >> The following patch is to add driver for TPDA.
> >> Coresight: Add TPDA link driver
> >> dt-bindings: arm: Adds CoreSight TPDA hardware definitions
> >>
> >> The last patch of this series is a device tree modification, which add
> >> the TPDM and TPDA configuration to device tree for validating.
> >> ARM: dts: msm: Add coresight components for SM8250
> >>
> >> Once this series patches are applied properly, the tpdm and tpda nodes
> >> should be observed at the coresight path /sys/bus/coresight/devices
> >> e.g.
> >> /sys/bus/coresight/devices # ls -l | grep tpd
> >> tpda0 -> ../../../devices/platform/soc(a)0/6004000.tpda/tpda0
> >> tpdm0 -> ../../../devices/platform/soc(a)0/6c08000.mm.tpdm/tpdm0
> >>
> >> We can use the commands are similar to the below to validate TPDMs.
> >> Enable coresight sink first.
> >>
> >> echo 1 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
> >> echo 1 > /sys/bus/coresight/devices/tpdm0/enable_source
> >> echo 1 > /sys/bus/coresight/devices/tpdm0/integration_test
> >> echo 2 > /sys/bus/coresight/devices/tpdm0/integration_test
> >> The test data will be collected in the coresight sink which is enabled.
> >> If rwp register of the sink is keeping updating when do
> >> integration_test (by cat tmc_etf0/mgmt/rwp), it means there is data
> >> generated from TPDM to sink.
> >>
> >> Changes from V2:
> >> 1. Use bitmap to assign the trace id. (Mathieu Poirier)
> >>
> >> Mao Jinlong (10):
> >> Use IDR to maintain all the enabled sources' paths.
> >> coresight: Use bitmap to assign trace id to the sources
> >> Coresight: Add coresight TPDM source driver
> >> dt-bindings: arm: Adds CoreSight TPDM hardware definitions
> >> coresight-tpdm: Add DSB dataset support
> >> coresight-tpdm: Add integration test support
> >> docs: sysfs: coresight: Add sysfs ABI documentation for TPDM
> >> Coresight: Add TPDA link driver
> >> dt-bindings: arm: Adds CoreSight TPDA hardware definitions
> >> ARM: dts: msm: Add coresight components for SM8250
> >>
> >> .../testing/sysfs-bus-coresight-devices-tpdm | 6 +
> >> .../bindings/arm/coresight-tpda.yaml | 129 ++++
> >> .../bindings/arm/coresight-tpdm.yaml | 81 ++
> >> .../devicetree/bindings/arm/coresight.txt | 7 +
> >> MAINTAINERS | 1 +
> >> .../arm64/boot/dts/qcom/sm8250-coresight.dtsi | 690 ++++++++++++++++++
> >> arch/arm64/boot/dts/qcom/sm8250.dtsi | 2 +
> >> drivers/hwtracing/coresight/Kconfig | 33 +
> >> drivers/hwtracing/coresight/Makefile | 2 +
> >> drivers/hwtracing/coresight/coresight-core.c | 127 ++--
> >> drivers/hwtracing/coresight/coresight-tpda.c | 193 +++++
> >> drivers/hwtracing/coresight/coresight-tpda.h | 32 +
> >> drivers/hwtracing/coresight/coresight-tpdm.c | 270 +++++++
> >> drivers/hwtracing/coresight/coresight-tpdm.h | 57 ++
> >> include/linux/coresight-pmu.h | 11 +
> >> include/linux/coresight.h | 1 +
> >> 16 files changed, 1592 insertions(+), 50 deletions(-)
> >> create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm
> >> create mode 100644 Documentation/devicetree/bindings/arm/coresight-tpda.yaml
> >> create mode 100644 Documentation/devicetree/bindings/arm/coresight-tpdm.yaml
> >> create mode 100644 arch/arm64/boot/dts/qcom/sm8250-coresight.dtsi
> >> create mode 100644 drivers/hwtracing/coresight/coresight-tpda.c
> >> create mode 100644 drivers/hwtracing/coresight/coresight-tpda.h
> >> create mode 100644 drivers/hwtracing/coresight/coresight-tpdm.c
> >> create mode 100644 drivers/hwtracing/coresight/coresight-tpdm.h
> >>
> >> --
> >> 2.17.1
> >>
> >
> > --
> > Mike Leach
> > Principal Engineer, ARM Ltd.
> > Manchester Design Centre. UK
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
Hello Mao,
I have looked through this set and have a few general questions.
My understanding based on the information in the code is that the TPDM
devices will always send data to the TPDA device, the TPDM is not
capable of directly driving the ATB itself?
The TPDA device will then packetize the inputs and output these to
the ATB over the normal CoreSight architecture to a standard ETR / ETF
for collection.
Looking at the TPDM driver - it is assigned a trace ID but never
actually uses it in the hardware. My assumption here is that this is
used purely to satisfy the requirement that the CoreSight core has
that all sources have a unique trace id?
For the TPDA driver you assign an ATID as an attribute in device tree,
and then program this into the devices control register.
The trace IDs in ETM / ETE / STM, are programmed into the hardware and
these values drive the ATID value on the trace bus. So assigning an
ATID value to the TPDA driver through device tree will lead to clashes
with the assignment of trace IDs in the other driver software.
The topology here appears to me that you have multiple "data source"
devices TPDM, supplying a TPDA - which is the real CoreSight "trace
source" from the viewpoint of the trace bus and CoreSight
infrastructure.
To get this to work in the current CoreSight driver stack, you have
assigned the TPDM as a source type, and the TPDA as a link to ensure
that when a TPDM is started, all the components on the path to the
sink are activated.
This is fine.
If my assumptions above are all accurate I suggest the following improvements
For TPDA drop the device tree assignment of ATID and instead use the
coresight_get_system_trace_id() function you introduce in the 2nd
patch in this set.
For TPDM you have assigned a unique source sub-type
CORESIGHT_DEV_SUBTYPE_SOURCE_SYS.- this could become
CORESIGHT_DEV_SUBTYPE_SOURCE_DATA_ONLY. If the trace ID assigned to
this device is only to satisfy the unique ID requirement and is not
used elsewhere, then the sub type could become
CORESIGHT_DEV_SUBTYPE_SOURCE_DATA_ONLY. We can agree that this sub
type does not need a unique ID and acts as none ATB a source for
another component, The core code can be altered to drop the
requirement for this sub-type and trace ID can be dropped for this
component.
You should be aware that we are in the process of re-designed how
trace IDs are allocated. The current mechanism does not scale for
large multi-core systems (currently broken for any system > 46 cores),
and as you have discovered there is a need for additional allocation
of IDs. Also the ETE / TRBE combination does not need a trace ID. A
dynamic allocation system is being proposed.
Regards
Mike
On Wed, 9 Feb 2022 at 10:57, Mao Jinlong <quic_jinlmao(a)quicinc.com> wrote:
>
> This series adds support for the trace performance monitoring and
> diagnostics hardware (TPDM and TPDA). It is composed of two major
> elements.
> a) Changes for original coresight framework to support for TPDM and TPDA.
> b) Add driver code for TPDM and TPDA.
>
> Introduction of changes for original coresight framework
> Support TPDM as new coresight source.
> Since only STM and ETM are supported as coresight source originally.
> TPDM is a newly added coresight source. We need to change
> the original way of saving coresight path to support more types source
> for coresight driver.
> The following patch is to add support more coresight sources.
> Use IDR to maintain all the enabled sources' paths.
> coresight: Use bitmap to assign trace id to the sources
>
> Introduction of TPDM and TPDA
> TPDM - The trace performance monitoring and diagnostics monitor or TPDM in
> short serves as data collection component for various dataset types
> specified in the QPMDA(Qualcomm performance monitoring and diagnostics
> architecture) spec. The primary use case of the TPDM is to collect data
> from different data sources and send it to a TPDA for packetization,
> timestamping and funneling.
> Coresight: Add coresight TPDM source driver
> dt-bindings: arm: Adds CoreSight TPDM hardware definitions
> coresight-tpdm: Add DSB dataset support
> coresight-tpdm: Add integration test support
> docs: sysfs: coresight: Add sysfs ABI documentation for TPDM
>
> TPDA - The trace performance monitoring and diagnostics aggregator or
> TPDA in short serves as an arbitration and packetization engine for the
> performance monitoring and diagnostics network as specified in the QPMDA
> (Qualcomm performance monitoring and diagnostics architecture)
> specification. The primary use case of the TPDA is to provide
> packetization, funneling and timestamping of Monitor data as specified
> in the QPMDA specification.
> The following patch is to add driver for TPDA.
> Coresight: Add TPDA link driver
> dt-bindings: arm: Adds CoreSight TPDA hardware definitions
>
> The last patch of this series is a device tree modification, which add
> the TPDM and TPDA configuration to device tree for validating.
> ARM: dts: msm: Add coresight components for SM8250
>
> Once this series patches are applied properly, the tpdm and tpda nodes
> should be observed at the coresight path /sys/bus/coresight/devices
> e.g.
> /sys/bus/coresight/devices # ls -l | grep tpd
> tpda0 -> ../../../devices/platform/soc(a)0/6004000.tpda/tpda0
> tpdm0 -> ../../../devices/platform/soc(a)0/6c08000.mm.tpdm/tpdm0
>
> We can use the commands are similar to the below to validate TPDMs.
> Enable coresight sink first.
>
> echo 1 > /sys/bus/coresight/devices/tmc_etf0/enable_sink
> echo 1 > /sys/bus/coresight/devices/tpdm0/enable_source
> echo 1 > /sys/bus/coresight/devices/tpdm0/integration_test
> echo 2 > /sys/bus/coresight/devices/tpdm0/integration_test
> The test data will be collected in the coresight sink which is enabled.
> If rwp register of the sink is keeping updating when do
> integration_test (by cat tmc_etf0/mgmt/rwp), it means there is data
> generated from TPDM to sink.
>
> Changes from V2:
> 1. Use bitmap to assign the trace id. (Mathieu Poirier)
>
> Mao Jinlong (10):
> Use IDR to maintain all the enabled sources' paths.
> coresight: Use bitmap to assign trace id to the sources
> Coresight: Add coresight TPDM source driver
> dt-bindings: arm: Adds CoreSight TPDM hardware definitions
> coresight-tpdm: Add DSB dataset support
> coresight-tpdm: Add integration test support
> docs: sysfs: coresight: Add sysfs ABI documentation for TPDM
> Coresight: Add TPDA link driver
> dt-bindings: arm: Adds CoreSight TPDA hardware definitions
> ARM: dts: msm: Add coresight components for SM8250
>
> .../testing/sysfs-bus-coresight-devices-tpdm | 6 +
> .../bindings/arm/coresight-tpda.yaml | 129 ++++
> .../bindings/arm/coresight-tpdm.yaml | 81 ++
> .../devicetree/bindings/arm/coresight.txt | 7 +
> MAINTAINERS | 1 +
> .../arm64/boot/dts/qcom/sm8250-coresight.dtsi | 690 ++++++++++++++++++
> arch/arm64/boot/dts/qcom/sm8250.dtsi | 2 +
> drivers/hwtracing/coresight/Kconfig | 33 +
> drivers/hwtracing/coresight/Makefile | 2 +
> drivers/hwtracing/coresight/coresight-core.c | 127 ++--
> drivers/hwtracing/coresight/coresight-tpda.c | 193 +++++
> drivers/hwtracing/coresight/coresight-tpda.h | 32 +
> drivers/hwtracing/coresight/coresight-tpdm.c | 270 +++++++
> drivers/hwtracing/coresight/coresight-tpdm.h | 57 ++
> include/linux/coresight-pmu.h | 11 +
> include/linux/coresight.h | 1 +
> 16 files changed, 1592 insertions(+), 50 deletions(-)
> create mode 100644 Documentation/ABI/testing/sysfs-bus-coresight-devices-tpdm
> create mode 100644 Documentation/devicetree/bindings/arm/coresight-tpda.yaml
> create mode 100644 Documentation/devicetree/bindings/arm/coresight-tpdm.yaml
> create mode 100644 arch/arm64/boot/dts/qcom/sm8250-coresight.dtsi
> create mode 100644 drivers/hwtracing/coresight/coresight-tpda.c
> create mode 100644 drivers/hwtracing/coresight/coresight-tpda.h
> create mode 100644 drivers/hwtracing/coresight/coresight-tpdm.c
> create mode 100644 drivers/hwtracing/coresight/coresight-tpdm.h
>
> --
> 2.17.1
>
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
Hi,
On Wed, 9 Feb 2022 at 10:57, Mao Jinlong <quic_jinlmao(a)quicinc.com> wrote:
>
> Use hash length of the source's device name to map to the pointer
> of the enabled path. Using IDR will be more efficient than using
> the list. And there could be other sources except STM and CPU etms
> in the new HWs. It is better to maintain all the paths together.
>
> Signed-off-by: Mao Jinlong <quic_jinlmao(a)quicinc.com>
> ---
> drivers/hwtracing/coresight/coresight-core.c | 76 +++++++-------------
> 1 file changed, 27 insertions(+), 49 deletions(-)
>
> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
> index 88653d1c06a4..a90097f88425 100644
> --- a/drivers/hwtracing/coresight/coresight-core.c
> +++ b/drivers/hwtracing/coresight/coresight-core.c
> @@ -7,6 +7,7 @@
> #include <linux/init.h>
> #include <linux/types.h>
> #include <linux/device.h>
> +#include <linux/idr.h>
> #include <linux/io.h>
> #include <linux/err.h>
> #include <linux/export.h>
> @@ -26,6 +27,12 @@
> static DEFINE_MUTEX(coresight_mutex);
> static DEFINE_PER_CPU(struct coresight_device *, csdev_sink);
>
> +/*
> + * Use IDR to map the hash length of the source's device name
> + * to the pointer of path for the source
> + */
> +static DEFINE_IDR(path_idr);
> +
> /**
> * struct coresight_node - elements of a path, from source to sink
> * @csdev: Address of an element.
> @@ -36,20 +43,6 @@ struct coresight_node {
> struct list_head link;
> };
>
> -/*
> - * When operating Coresight drivers from the sysFS interface, only a single
> - * path can exist from a tracer (associated to a CPU) to a sink.
> - */
> -static DEFINE_PER_CPU(struct list_head *, tracer_path);
> -
> -/*
> - * As of this writing only a single STM can be found in CS topologies. Since
> - * there is no way to know if we'll ever see more and what kind of
> - * configuration they will enact, for the time being only define a single path
> - * for STM.
> - */
> -static struct list_head *stm_path;
> -
> /*
> * When losing synchronisation a new barrier packet needs to be inserted at the
> * beginning of the data collected in a buffer. That way the decoder knows that
> @@ -1088,10 +1081,11 @@ static int coresight_validate_source(struct coresight_device *csdev,
>
> int coresight_enable(struct coresight_device *csdev)
> {
> - int cpu, ret = 0;
> + int ret = 0;
> struct coresight_device *sink;
> struct list_head *path;
> enum coresight_dev_subtype_source subtype;
> + u32 hash;
>
> subtype = csdev->subtype.source_subtype;
>
> @@ -1133,26 +1127,14 @@ int coresight_enable(struct coresight_device *csdev)
> if (ret)
> goto err_source;
>
> - switch (subtype) {
> - case CORESIGHT_DEV_SUBTYPE_SOURCE_PROC:
> - /*
> - * When working from sysFS it is important to keep track
> - * of the paths that were created so that they can be
> - * undone in 'coresight_disable()'. Since there can only
> - * be a single session per tracer (when working from sysFS)
> - * a per-cpu variable will do just fine.
> - */
> - cpu = source_ops(csdev)->cpu_id(csdev);
> - per_cpu(tracer_path, cpu) = path;
> - break;
> - case CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE:
> - stm_path = path;
> - break;
> - default:
> - /* We can't be here */
> - break;
> - }
> -
> + /*
> + * Use the hash length of source's device name as ID
Slightly confusing comment:
hashlen_string creates a hash and a length for the string. We are
using the hash here so comment should be:
"Use the hash of source's device name as ID"
> + * and map the ID to the pointer of the path.
> + */
> + hash = hashlen_hash(hashlen_string(NULL, dev_name(&csdev->dev)));
> + ret = idr_alloc_u32(&path_idr, path, &hash, hash, GFP_KERNEL);
> + if (ret)
> + goto err_source;
> out:
> mutex_unlock(&coresight_mutex);
> return ret;
> @@ -1168,8 +1150,9 @@ EXPORT_SYMBOL_GPL(coresight_enable);
>
> void coresight_disable(struct coresight_device *csdev)
> {
> - int cpu, ret;
> + int ret;
> struct list_head *path = NULL;
> + u32 hash;
>
> mutex_lock(&coresight_mutex);
>
> @@ -1180,21 +1163,16 @@ void coresight_disable(struct coresight_device *csdev)
> if (!csdev->enable || !coresight_disable_source(csdev))
> goto out;
>
> - switch (csdev->subtype.source_subtype) {
> - case CORESIGHT_DEV_SUBTYPE_SOURCE_PROC:
> - cpu = source_ops(csdev)->cpu_id(csdev);
> - path = per_cpu(tracer_path, cpu);
> - per_cpu(tracer_path, cpu) = NULL;
> - break;
> - case CORESIGHT_DEV_SUBTYPE_SOURCE_SOFTWARE:
> - path = stm_path;
> - stm_path = NULL;
> - break;
> - default:
> - /* We can't be here */
> - break;
> + hash = hashlen_hash(hashlen_string(NULL, dev_name(&csdev->dev)));
> + /* Find the path by the hash length. */
same comment as above re hash / hash length
> + path = idr_find(&path_idr, hash);
> + if (path == NULL) {
> + dev_info(&csdev->dev, "Path is not found for %s\n",
> + dev_name(&csdev->dev));
pr_err() is used in other parts of this file to print error messages.
> + return;
this skips the mutex_unlock(). use goto out;
> }
>
> + idr_remove(&path_idr, hash);
> coresight_disable_path(path);
> coresight_release_path(path);
>
> --
> 2.17.1
>
Regards
Mike
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
Changes since v1:
* Change base to "[PATCH 1/1] coresight: no-op refactor to make INSTP0 check more idiomatic"
* Change 'name' to 'field' in REG_VAL() macro
* Add comment for REG_VAL() macro
While working on the branch broadcast change I found it difficult to search
for usages of registers and fields because of the magic numbers. I also
found it difficult to decide which style to make new code in because of the
varying ones used.
There was also a code review comment from Suzuki about replacing a magic
number so I'm proposing to refactor as many as possible into the style used
in sysreg.h which seems to be the new and most consistently used method.
For example it was used in the SPE and TRBE drivers.
This isn't an exhaustive refactor, but it does get all the basic accesses.
There are a couple of odd other cases remaining, mainly in the ETM3x code.
These can be found by searching for BMVAL.
There is one compromise to ensure this is a complete no-op and has
binary equivalence with the old version. I needed to keep two register
accesses here, where something like etmidr0 & TRCIDR0_INSTP0_LOAD_STORE
would be better:
- if (BMVAL(etmidr0, 1, 1) && BMVAL(etmidr0, 2, 2))
- drvdata->instrp0 = true;
- else
- drvdata->instrp0 = false;
-
+ drvdata->instrp0 = !!((REG_VAL(etmidr0, TRCIDR0_INSTP0) & 0b01) &&
+ (REG_VAL(etmidr0, TRCIDR0_INSTP0) & 0b10));
I think this change fixes quite a few issues like:
* Some registers aren't referred to by name but a different variable name.
For example eventctrl1 in mode_store() where TRCEVENTCTL1R isn't
mentioned in that function.
* Some bits aren't referred to by the name in the manual, even in the
comments. For example TRCCONFIGR.RS only occurs as /* bit[12], Return
stack enable bit */.
* Some bits in the same register are referred to either as magic numbers
or the publicly exported config macros, neiher of which are consistent
with any other register accesses. For example
config->cfg |= BIT(11);
config->cfg |= BIT(ETM4_CFG_BIT_CTXTID);
* Some fields are already partially referred to in the sysfs.h style way:
TRCVICTLR_EXLEVEL_... etc. But other fields in the same register are not
* Some fields are magic numbers that are repeated many times in different
functions. For example vinst_ctrl |= BIT(9)
* Some fields were referred to by magic numbers, even when there were
already existing #defines. For example ETMTECR1_INC_EXC
* Another benefit is that the #defines could be automatically checked
against the reference manual because the style is uniform.
Testing
=======
To test this I used gcc-11 which doesn't have a quirk about changing
register widths in some cases (as in w -> x). I also used elf_diff which
showed me exactly where I'd made a mistake when I did
(https://github.com/noseglasses/elf_diff). But now that there is no
difference, objdump and diff also work for validation.
make CC=gcc-11 modules
diff <(objdump -d drivers/hwtracing/coresight/coresight-etm4x.ko) <(objdump -d ../linux2/drivers/hwtracing/coresight/coresight-etm4x.ko)
diff <(objdump -d drivers/hwtracing/coresight/coresight.ko) <(objdump -d ../linux2/drivers/hwtracing/coresight/coresight.ko)
And for ETM3x (doesn't need gcc 11):
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- modules
diff <(objdump -d drivers/hwtracing/coresight/coresight-etm3x.ko) <(objdump -d ../linux2/drivers/hwtracing/coresight/coresight-etm3x.ko)
When there are no differences, the diff output looks like this with only
the filename listed:
< drivers/hwtracing/coresight/coresight-etm4x.ko: file format elf64-littleaarch64
---
> ../linux2/drivers/hwtracing/coresight/coresight-etm4x.ko: file format elf64-littleaarch64
Applies to "[PATCH 1/1] coresight: no-op refactor to make INSTP0 check more idiomatic"
James Clark (15):
coresight: Make ETM4x TRCIDR0 register accesses consistent with
sysreg.h
coresight: Make ETM4x TRCIDR2 register accesses consistent with
sysreg.h
coresight: Make ETM4x TRCIDR3 register accesses consistent with
sysreg.h
coresight: Make ETM4x TRCIDR4 register accesses consistent with
sysreg.h
coresight: Make ETM4x TRCIDR5 register accesses consistent with
sysreg.h
coresight: Make ETM4x TRCCONFIGR register accesses consistent with
sysreg.h
coresight: Make ETM4x TRCEVENTCTL1R register accesses consistent with
sysreg.h
coresight: Make ETM4x TRCSTALLCTLR register accesses consistent with
sysreg.h
coresight: Make ETM4x TRCVICTLR register accesses consistent with
sysreg.h
coresight: Make ETM3x ETMTECR1 register accesses consistent with
sysreg.h
coresight: Make ETM4x TRCACATRn register accesses consistent with
sysreg.h
coresight: Make ETM4x TRCSSCCRn and TRCSSCSRn register accesses
consistent with sysreg.h
coresight: Make ETM4x TRCSSPCICRn register accesses consistent with
sysreg.h
coresight: Make ETM4x TRCBBCTLR register accesses consistent with
sysreg.h
coresight: Make ETM4x TRCRSCTLRn register accesses consistent with
sysreg.h
.../coresight/coresight-etm3x-core.c | 2 +-
.../coresight/coresight-etm3x-sysfs.c | 2 +-
.../coresight/coresight-etm4x-core.c | 134 +++++--------
.../coresight/coresight-etm4x-sysfs.c | 178 +++++++++---------
drivers/hwtracing/coresight/coresight-etm4x.h | 159 ++++++++++++++--
drivers/hwtracing/coresight/coresight-priv.h | 5 +
6 files changed, 286 insertions(+), 194 deletions(-)
--
2.28.0
On Mon, Feb 07, 2022 at 11:14:50AM +0530, Anshuman Khandual wrote:
> Hi James,
>
> These are all ETM4X specific changes. Something like this might be cleaner
> and also more compact. Also would suggest to follow the same for subsequent
> patches as well.
>
> coresight: etm4x: Cleanup TRCIDR0 register accesses
>
> Consistency with sysreg.h could be mentioned in the commit message itself.
>
I agree with the above two comments.
> On 2/3/22 5:35 PM, James Clark wrote:
> > This is a no-op change for style and consistency and has no effect on the
> > binary produced by gcc-11.
>
> This patch adds register definitions, helper macros as well. Please expand
> the commit message to add more details. This is too short, for the change
> it creates. BTW why is it necessary to mention GCC version number here.
>
Here too - I'm not sure adding gcc's version number helps in any way.
> >
> > Signed-off-by: James Clark <james.clark(a)arm.com>
> > ---
> > .../coresight/coresight-etm4x-core.c | 36 +++++--------------
> > drivers/hwtracing/coresight/coresight-etm4x.h | 17 +++++++++
> > drivers/hwtracing/coresight/coresight-priv.h | 5 +++
> > 3 files changed, 30 insertions(+), 28 deletions(-)
> >
> > diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > index e2eebd865241..107e81948f76 100644
> > --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > @@ -1091,41 +1091,21 @@ static void etm4_init_arch_data(void *info)
> > etmidr0 = etm4x_relaxed_read32(csa, TRCIDR0);
> >
> > /* INSTP0, bits[2:1] P0 tracing support field */
> > - if (BMVAL(etmidr0, 1, 2) == 0b11)
> > - drvdata->instrp0 = true;
> > - else
> > - drvdata->instrp0 = false;
> > -
> > + drvdata->instrp0 = !!(REG_VAL(etmidr0, TRCIDR0_INSTP0) == 0b11);
> > /* TRCBB, bit[5] Branch broadcast tracing support bit */
> > - if (BMVAL(etmidr0, 5, 5))
> > - drvdata->trcbb = true;
> > - else
> > - drvdata->trcbb = false;
> > -
> > + drvdata->trcbb = !!(etmidr0 & TRCIDR0_TRCBB);
> > /* TRCCOND, bit[6] Conditional instruction tracing support bit */
> > - if (BMVAL(etmidr0, 6, 6))
> > - drvdata->trccond = true;
> > - else
> > - drvdata->trccond = false;
> > -
> > + drvdata->trccond = !!(etmidr0 & TRCIDR0_TRCCOND);
> > /* TRCCCI, bit[7] Cycle counting instruction bit */
> > - if (BMVAL(etmidr0, 7, 7))
> > - drvdata->trccci = true;
> > - else
> > - drvdata->trccci = false;
> > -
> > + drvdata->trccci = !!(etmidr0 & TRCIDR0_TRCCCI);
> > /* RETSTACK, bit[9] Return stack bit */
> > - if (BMVAL(etmidr0, 9, 9))
> > - drvdata->retstack = true;
> > - else
> > - drvdata->retstack = false;
> > -
> > + drvdata->retstack = !!(etmidr0 & TRCIDR0_RETSTACK);
> > /* NUMEVENT, bits[11:10] Number of events field */
> > - drvdata->nr_event = BMVAL(etmidr0, 10, 11);
> > + drvdata->nr_event = REG_VAL(etmidr0, TRCIDR0_NUMEVENT);
> > /* QSUPP, bits[16:15] Q element support field */
> > - drvdata->q_support = BMVAL(etmidr0, 15, 16);
> > + drvdata->q_support = REG_VAL(etmidr0, TRCIDR0_QSUPP);
> > /* TSSIZE, bits[28:24] Global timestamp size field */
> > - drvdata->ts_size = BMVAL(etmidr0, 24, 28);
> > + drvdata->ts_size = REG_VAL(etmidr0, TRCIDR0_TSSIZE);
> >
> > /* maximum size of resources */
> > etmidr2 = etm4x_relaxed_read32(csa, TRCIDR2);
> > diff --git a/drivers/hwtracing/coresight/coresight-etm4x.h b/drivers/hwtracing/coresight/coresight-etm4x.h
> > index 3c4d69b096ca..2bd8ad953b8e 100644
> > --- a/drivers/hwtracing/coresight/coresight-etm4x.h
> > +++ b/drivers/hwtracing/coresight/coresight-etm4x.h
> > @@ -130,6 +130,23 @@
> >
> > #define TRCRSR_TA BIT(12)
> >
> > +/*
> > + * Bit positions of registers that are defined above, in the sysreg.h style
> > + * of _MASK, _SHIFT and BIT().
> > + */
>
> ^^^ not really necessary. Instead the format requirement for below mentioned
> CORESIGHT_REG_VAL() macro might be relevant and should be mentioned.
I'm not sure if you're asking to move the comment further below or remove it
altogether. In any case more comments is always better than less.
>
> > +#define TRCIDR0_INSTP0_SHIFT 1
> > +#define TRCIDR0_INSTP0_MASK GENMASK(1, 0)
> > +#define TRCIDR0_TRCBB BIT(5)
> > +#define TRCIDR0_TRCCOND BIT(6)
> > +#define TRCIDR0_TRCCCI BIT(7)
> > +#define TRCIDR0_RETSTACK BIT(9)
> > +#define TRCIDR0_NUMEVENT_SHIFT 10
> > +#define TRCIDR0_NUMEVENT_MASK GENMASK(1, 0)
> > +#define TRCIDR0_QSUPP_SHIFT 15
> > +#define TRCIDR0_QSUPP_MASK GENMASK(1, 0)
> > +#define TRCIDR0_TSSIZE_SHIFT 24
> > +#define TRCIDR0_TSSIZE_MASK GENMASK(4, 0)
> > +
> > /*
> > * System instructions to access ETM registers.
> > * See ETMv4.4 spec ARM IHI0064F section 4.3.6 System instructions
> > diff --git a/drivers/hwtracing/coresight/coresight-priv.h b/drivers/hwtracing/coresight/coresight-priv.h
> > index ff1dd2092ac5..1452c6038421 100644
> > --- a/drivers/hwtracing/coresight/coresight-priv.h
> > +++ b/drivers/hwtracing/coresight/coresight-priv.h
> > @@ -36,6 +36,11 @@
> >
> > #define TIMEOUT_US 100
> > #define BMVAL(val, lsb, msb) ((val & GENMASK(msb, lsb)) >> lsb)
> > +/*
> > + * Extract a field from a register where field is #defined in the form
> > + * <register_name>_<field_name>_MASK and <register_name>_<field_name>_SHIFT
> > + */
>
> Looking at the usage, <register_name> is already embedded in <filed_name>. So
> it requires <field_name>_SHIFT and <field_name>_MASK instead. Unless register
> name should be passed as separate argument (which actually might be better).
>
> REG_VAL(etmidr0, TRCIDR0_TSSIZE) ----> REG_VAL(etmidr0, TRCIDR0, TSSIZE)
I don't have strong preference, I'm good either way.
>
> with some restructuring in the comment ..
>
> /*
> * Extract a field from a coresight register
> *
> * Required fields are defined as macros like the following
> *
> * <register_name>_<field_name>_MASK and <register_name>_<field_name>_SHIFT
> */
>
> > +#define REG_VAL(val, field) ((val & (field##_MASK << field##_SHIFT)) >> field##_SHIFT)
>
> This is too generic to be in a coresight header or it should just be
> named CORESIGHT_REG_VAL() instead, making it more specific for here.
>
> The build should fail in case any required macro definition is absent.
> I guess no more fortification is required in case macros are missing.
>
> However CORESIGHT_REG_VAL() is better placed in <coresight-etm4x.h>
> just before all the dependent SHIFT/MASK register field definition
> starts.
>
Not sure about this... Someone might want to do the same for etmv3 and in that
case we'll end up moving the macro again.
Thanks,
Mathieu
> >
> > #define ETM_MODE_EXCL_KERN BIT(30)
> > #define ETM_MODE_EXCL_USER BIT(31)
> >
>
> - Anshuman
On 07/02/2022 05:51, Anshuman Khandual wrote:
> Hi James,
>
> On 2/3/22 5:35 PM, James Clark wrote:
>> James Clark (15):
>> coresight: Make ETM4x TRCIDR0 register accesses consistent with
>> sysreg.h
>> coresight: Make ETM4x TRCIDR2 register accesses consistent with
>> sysreg.h
>> coresight: Make ETM4x TRCIDR3 register accesses consistent with
>> sysreg.h
>> coresight: Make ETM4x TRCIDR4 register accesses consistent with
>> sysreg.h
>> coresight: Make ETM4x TRCIDR5 register accesses consistent with
>> sysreg.h
>> coresight: Make ETM4x TRCCONFIGR register accesses consistent with
>> sysreg.h
>> coresight: Make ETM4x TRCEVENTCTL1R register accesses consistent with
>> sysreg.h
>> coresight: Make ETM4x TRCSTALLCTLR register accesses consistent with
>> sysreg.h
>> coresight: Make ETM4x TRCVICTLR register accesses consistent with
>> sysreg.h
>> coresight: Make ETM3x ETMTECR1 register accesses consistent with
>> sysreg.h
>> coresight: Make ETM4x TRCACATRn register accesses consistent with
>> sysreg.h
>> coresight: Make ETM4x TRCSSCCRn and TRCSSCSRn register accesses
>> consistent with sysreg.h
>> coresight: Make ETM4x TRCSSPCICRn register accesses consistent with
>> sysreg.h
>> coresight: Make ETM4x TRCBBCTLR register accesses consistent with
>> sysreg.h
>> coresight: Make ETM4x TRCRSCTLRn register accesses consistent with
>> sysreg.h
>
> The changes here are very similar to each other. But they are split
> into different patches according to register names just for better
> review process ? OR is there any other rationale ?
Yes just for the review process. I didn't see a way of reviewing them all
in one change because it's so big, and the only logical way to split it was
by register so I did it that way.
>
> - Anshuman