Hi Leo,
I ran your script on a Zynq Ultrascale+ devboard with 'perf test ...' and it succeeded. After examining the source code, I came up with the following questions:
1) What about other sinks? For now I see it is hardcoded to look for sysfs devices with *.etr only. For example, Zynq US+ has one TMC-ETR and two TMC-ETFs - shouldn't ETFs be tested as well?
2) Is there any sink naming convention? AFAIR sysfs file names depend on what's in DTS. I can imagine a situation where I have two ETRs with node names etr1@88f00f00, etr2@88f00f80 - the script will not discover them. I took a brief look at the bindings document and I don't see any obvious remarks on how we should name DTS nodes. Sorry if I confused something here, but perhaps this is the right moment to establish such a convention?
Best regards, Wojciech
Hi Wojciech,
On Wed, Apr 24, 2019 at 12:25:21PM +0000, Wojciech Żmuda wrote:
Hi Leo,
I ran your script on a Zynq Ultrascale+ devboard with 'perf test ...' and it succeeded.
Thanks a lot for testing!
After examining the source code, I came up with the following questions:
- What about other sinks? For now I see it is hardcoded to look for sysfs devices with *.etr only.
For example, Zynq US+ has one TMC-ETR and two TMC-ETFs - shouldn't ETFs be tested as well?
Yes, actually I have considered this question when I wrote the script.
From my understanding, the ETR is located after the ETF in the trace data path and is more complex than the ETF. So if we can test the ETR successfully, that usually means the ETF in the middle of the trace path has been exercised as well (it works as a LINKSINK type device).
Another thing I noted is that several platforms (Juno, Hikey/Hikey960, DB410c) each have only a single ETR component, which means this testing is valid on those platforms. ETF usage, however, is quite diverse across platforms; as a special case, on the Juno board one ETF cannot create any trace path at all [1], so it will always fail if we use perf with it.
- Is there any sink naming convention? AFAIR sysfs file names depend on what's in DTS. I can imagine
a situation where I have two ETRs with node names etr1@88f00f00, etr2@88f00f80 - the script will not discover them. I took a brief look at the bindings document and I don't see any obvious remarks on how we should name DTS nodes.
Good point. I will change the code as below:
arm_cs_etr_test() {
-        for i in /sys/bus/coresight/devices/*.etr; do
+        for i in /sys/bus/coresight/devices/*.etr*; do
Just a reminder (though maybe this is off topic): ePAPR [2] suggests using node names that are as general as possible, so the ETR should use 'etr@xxxxxxxx' as its device node name in the DTS, and in theory we should always use that unified format for ETR nodes. Anyway, I will change the code to use a more flexible match.
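To be concrete, the updated loop would look roughly like below (only a sketch and not tested yet; the device names and the perf command line are just examples of the idea, not the final code):

    arm_cs_etr_test() {
        # Match any TMC device whose name contains ".etr", e.g.
        # "20070000.etr" or "88f00f00.etr1".
        for dev in /sys/bus/coresight/devices/*.etr*; do
            [ -e "$dev" ] || continue    # glob matched nothing
            sink=$(basename "$dev")
            echo "Testing CoreSight sink: $sink"
            # Trace a trivial user space workload through this sink.
            perf record -o /tmp/perf-cs.data -e cs_etm/@"$sink"/u \
                -- ls > /dev/null 2>&1 || return 1
        done
        return 0
    }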
Sorry if I confused something here, but perhaps this is the right moment to establish such convention?
Not at all. Thanks a lot for the review and the suggestions!
Thanks, Leo Yan
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch... [2] https://elinux.org/images/c/cf/Power_ePAPR_APPROVED_v1.1.pdf
Hi Leo,
-----Original Message-----
From: Leo Yan leo.yan@linaro.org
Sent: Wednesday, April 24, 2019 4:15 PM
To: Wojciech Żmuda wzmuda@n7space.com
Cc: coresight@lists.linaro.org
Subject: Re: [RFC PATCH v1] perf test: Introduce script for Arm CoreSight testing
Hi Wojciech,
On Wed, Apr 24, 2019 at 12:25:21PM +0000, Wojciech Żmuda wrote:
Hi Leo,
I ran your script on a Zynq Ultrascale+ devboard with 'perf test ...'
and it succeeded.
Thanks a lot for testing!
After examining the source code, I came up with the following questions:
- What about other sinks? For now I see it is hardcoded to look for
sysfs devices with *.etr only.
For example, Zynq US+ has one TMC-ETR and two TMC-ETFs - shouldn't ETFs
be tested as well?
Yes, actually I have considered this question when I wrote the script.
From my understanding, the ETR is located after the ETF in the trace data path and is more complex than the ETF. So if we can test the ETR successfully, that usually means the ETF in the middle of the trace path has been exercised as well (it works as a LINKSINK type device).
This is actually how CoreSight in Zynq US+ is wired. I asked because I wasn't sure whether this is a common design, i.e. that an ETR is always preceded by an ETF. If this is what the CoreSight topology looks like in every SoC, then I agree that testing only ETRs sounds reasonable.
Another thing I noted is that several platforms (Juno, Hikey/Hikey960, DB410c) each have only a single ETR component, which means this testing is valid on those platforms. ETF usage, however, is quite diverse across platforms; as a special case, on the Juno board one ETF cannot create any trace path at all [1], so it will always fail if we use perf with it.
That's interesting. What is the purpose of such an ETF? LINKSINK mode only?
Thanks, Wojciech
Hi,
On Wed, 24 Apr 2019 at 15:43, Wojciech Żmuda wzmuda@n7space.com wrote:
Hi Leo,
-----Original Message-----
From: Leo Yan leo.yan@linaro.org
Sent: Wednesday, April 24, 2019 4:15 PM
To: Wojciech Żmuda wzmuda@n7space.com
Cc: coresight@lists.linaro.org
Subject: Re: [RFC PATCH v1] perf test: Introduce script for Arm CoreSight testing
From my understanding, the ETR is located after the ETF in the trace data path and is more complex than the ETF. So if we can test the ETR successfully, that usually means the ETF in the middle of the trace path has been exercised as well (it works as a LINKSINK type device).
This is actually how CoreSight in Zynq US+ is wired. I asked because I wasn't sure whether this is a common design, i.e. that an ETR is always preceded by an ETF. If this is what the CoreSight topology looks like in every SoC, then I agree that testing only ETRs sounds reasonable.
This is a frequent topology for CoreSight - the intervening ETF smooths out incoming trace which can be quite bursty depending on the processes being run on the cores. We do predict that future topologies will move towards a 1:1 relationship between ETM and ETR - with no ETF in between.
Another thing I noted is that several platforms (Juno, Hikey/Hikey960, DB410c) each have only a single ETR component, which means this testing is valid on those platforms. ETF usage, however, is quite diverse across platforms; as a special case, on the Juno board one ETF cannot create any trace path at all [1], so it will always fail if we use perf with it.
That's interesting. What is the purpose of such an ETF? LINKSINK mode only?
This ETF is ETF1 on Juno.
On Juno, ETF0 collects data from the cores, and ETF1 attaches to the STM, SCP and system profiler. While these devices can generate trace, they are not part of the application processor trace path, so they are not directly controlled by perf. Thus trace from the Cortex processors will never have a path that passes through ETF1. The outputs from ETF0 and ETF1 both funnel into the system ETR.
Regards
Mike
This is a frequent topology for CoreSight - the intervening ETF smooths out incoming trace which can be quite bursty depending on the processes being run on the cores.
Some ETF/ETBs have big enough buffers that they can capture significant amounts of trace in circular-buffer mode (with readout via the RRD register). This is useful in early stages of bringup when the ETR's path to main memory via the system interconnect may not have been established, but may also be useful to capture trace without any probe-effect on the system bus.
So it would be useful if perf test could try all possibilities on the path to the ETR - i.e. if there are ETBs on the path, put those into circular buffer mode and test those too.
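As far as I know perf already lets you name the sink explicitly on the command line, so exercising an intermediate ETB/ETF mostly comes down to iterating over the candidate device names - the device name below is only a placeholder:

    # "20010000.etf" is only a placeholder - use whatever ETB/ETF name
    # shows up under /sys/bus/coresight/devices/ on the target.
    sink=20010000.etf
    perf record -o perf-etf.data -e cs_etm/@"$sink"/u -- uname
    perf report -i perf-etf.data --stdio | head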
One system I'm using right now has one CPU with CPU->ETB->ETB->ETR. Each ETB has an 8K buffer. So in fact it is 1:1 but with three options for where trace can be captured.
Off-list we have been discussing generic ways to specify the sink in which trace is collected, e.g. by a standard way of enumerating the possible sinks as seen from a CPU.
For example we could number the sinks seen by each CPU according to a walk of its outgoing trace path, with ETB numbered before ETR if they occur at the same level (or the other way round if you prefer).
This would cope with multiple topologies, in a generic way. It's quite similar to the way caches are represented in sysfs. If you want to test all possibilities you simply iterate through the sinks until perf tells you there are no more. For physically partitioned trace fabrics (including multi-socket) this is much simpler than specifying trace sinks individually, since an individual sink is only valid for a subset of cores.
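For comparison, this is roughly what the cache representation gives you today - an indexed per-CPU list that you simply walk until it runs out:

    # Existing per-CPU cache enumeration in sysfs, as a model:
    for d in /sys/devices/system/cpu/cpu0/cache/index*; do
        printf '%s: L%s %s\n' "${d##*/}" "$(cat "$d/level")" "$(cat "$d/type")"
    done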
We do predict that future topologies will move towards a 1:1 relationship between ETM and ETR - with no ETF in between.
Meanwhile we need to support current silicon.
Al
On Wed, Apr 24, 2019 at 06:42:18PM +0000, Al Grant wrote:
This is a frequent topology for CoreSight - the intervening ETF smooths out incoming trace which can be quite bursty depending on the processes being run on the cores.
Some ETF/ETBs have big enough buffers that they can capture significant amounts of trace in circular-buffer mode (with readout via the RRD register). This is useful in early stages of bringup when the ETR's path to main memory via the system interconnect may not have been established, but may also be useful to capture trace without any probe-effect on the system bus.
So it would be useful if perf test could try all possibilities on the path to the ETR - i.e. if there are ETBs on the path, put those into circular buffer mode and test those too.
One system I'm using right now has one CPU with CPU->ETB->ETB->ETR. Each ETB has an 8K buffer. So in fact it is 1:1 but with three options for where trace can be captured.
Off-list we have been discussing generic ways to specify the sink in which trace is collected, e.g. by a standard way of enumerating the possible sinks as seen from a CPU.
For example we could number the sinks seen by each CPU according to a walk of its outgoing trace path, with ETB numbered before ETR if they occur at the same level (or the other way round if you prefer).
This would cope with multiple topologies, in a generic way. It's quite similar to the way caches are represented in sysfs. If you want to test all possibilities you simply iterate through the sinks until perf tells you there are no more. For physically partitioned trace fabrics (including multi-socket) this is much simpler than specifying trace sinks individually, since an individual sink is only valid for a subset of cores.
Thanks a lot for the suggestions, Al.
Combining Al's suggestions in this email with Mathieu's suggestions in another email, I'd like to summarize two different testing methodologies:
- Source-oriented testing (Al's suggestion)
Source-oriented testing starts from every source (for now a source only refers to a CPU) and tests all of its possible sinks.
Pros: the testing is very thorough and can cover all possibilities from sources to sinks.
Cons: the difficult part is finding a general (and simple) method to traverse the paths from sources to sinks.
Suzuki's patch '36/36: [RFC] coresight: Expose device connections via sysfs' will be helpful for analyzing the path from a CPU to its sinks.
Another option is to create sysfs nodes (as Al has mentioned) like '/sys/devices/system/cpu/cpuX/coresight/sinkX', which would expose the CoreSight topology from the CPU's point of view. But I think we should give this low priority (we can try this method only if there are blocking issues with Suzuki's patch 36/36).
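If we did create such per-CPU nodes, the walk could be as simple as below (a purely hypothetical sketch - this sysfs layout does not exist yet; assume every 'sinkX' node simply contains the sink's device name):

    cpu=0
    idx=0
    # Walk the (hypothetical) ordered sink list exposed for this CPU.
    while [ -e "/sys/devices/system/cpu/cpu$cpu/coresight/sink$idx" ]; do
        sink=$(cat "/sys/devices/system/cpu/cpu$cpu/coresight/sink$idx")
        echo "CPU$cpu, sink$idx: $sink"
        # Pin the workload to this CPU and trace it into this sink.
        perf record -o /tmp/perf-cs.data -e cs_etm/@"$sink"/u \
            -- taskset -c "$cpu" ls > /dev/null 2>&1
        idx=$((idx + 1))
    done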
- Sink-oriented testing (Mathieu's suggestion; this is also the method used in my patch)
Sink-oriented testing iterates over every sink and generates a trace data stream into it.
Pros: the testing usually takes fewer iterations than source-oriented testing (at least this is true for the hardware I have at hand).
The sink cannot receive any trace data if the test task doesn't run on the 'right' CPUs, so we need to know which CPUs are connected to a specific sink. One possible solution is to create a cpumask node for every sink, like below:
/sys/bus/coresight/devices/XXXXXXXX.[etr|etf|etb]/cpumask
Cons: the testing might not be very complete; it might not cover some corner cases.
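With such a cpumask node in place (also hypothetical for now), the per-sink loop could look roughly like this (assuming the node holds a CPU list that taskset can consume):

    for dev in /sys/bus/coresight/devices/*.etr* /sys/bus/coresight/devices/*.etf*; do
        [ -e "$dev" ] || continue
        sink=$(basename "$dev")
        # Hypothetical node: a CPU list (e.g. "0-3") of CPUs that can
        # reach this sink.
        cpus=$(cat "$dev/cpumask" 2>/dev/null) || continue
        echo "Testing sink $sink on CPUs $cpus"
        # Run the workload only on CPUs connected to this sink.
        perf record -o /tmp/perf-cs.data -e cs_etm/@"$sink"/u \
            -- taskset -c "$cpus" ls > /dev/null 2>&1
    done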
So now I am biased towards using source-oriented testing as Al suggested; it would be general enough to cover current and future topologies. If you have any further comments, please let me know.
We do predict that future topologies will move towards a 1:1 relationship between ETM and ETR - with no ETF in between.
Meanwhile we need to support current silicon.
Totally agree.
Thanks! Leo Yan
On Wed, 24 Apr 2019 at 06:25, Wojciech Żmuda wzmuda@n7space.com wrote:
Hi Leo,
I ran your script on a Zynq Ultrascale+ devboard with 'perf test ...' and it succeeded. After examining the source code, I came up with the following questions:
- What about other sinks? For now I see it is hardcoded to look for sysfs devices with *.etr only.
For example, Zynq US+ has one TMC-ETR and two TMC-ETFs - shouldn't ETFs be tested as well?
One problem with designing a test suite for CoreSight is that all implementations are different. Leo's test program works well on platforms such as HiKey, QC410c, Juno and the Ultrascale+ where an ETR is available to collect traces from all processors. But on platforms where an ETR is not available or where there is one ETR per cluster, the test won't run properly. It will also fail on future platforms enacting a 1:1 source/sink topology. One option is to script all sorts of sorcery to determine the best sink to use, but we'd always find an exception to break the heuristic.
We've been planning to enhance the framework with the capability to detect the best sink to use [1], which would remove the need to identify a sink on the perf command line. That would essentially fix the above-mentioned problems. But for now, since we don't have that feature yet, I suggest the test script be modified to look for more than an ETR. For example, if an ETR is found on the system, things proceed as they currently do. Otherwise look for an ETF and proceed if found. Lastly, look for an ETB. Note that this won't solve the problem where a system has one ETR per cluster - that will require more thinking.
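Something along the lines of the sketch below (untested; if I remember correctly, exit code 2 is what perf's shell tests use to flag a skipped test):

    # Pick the first available sink, preferring an ETR, then an ETF,
    # then an ETB.
    find_cs_sink() {
        for pattern in etr etf etb; do
            for dev in /sys/bus/coresight/devices/*"$pattern"*; do
                if [ -e "$dev" ]; then
                    basename "$dev"
                    return 0
                fi
            done
        done
        return 1
    }

    sink=$(find_cs_sink) || exit 2    # no sink found: skip the test
    perf record -o /tmp/perf-cs.data -e cs_etm/@"$sink"/u -- ls > /dev/null 2>&1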
Suzuki is also working on a way to discover system topology from sysfs [2], something that could be useful for testing. I encourage people to have a look and comment on the approach.
Thanks, Mathieu
[1]. That feature is in need of willing hands if someone is interested. [2]. https://lkml.org/lkml/2019/4/15/644 (patch 36/36)