Hi Mike,
This is precisely the sort of thing that CoreSight System Configuration Management is designed to handle (the "official" name for what we have been calling "CoreSight complex config").
It will (once the patchsets are upstream) be straightforward to create a configuration with the features you require, including adjustable parameters if you wish, load it, then enable it from the perf command line. No driver updates required. This avoids the cycles of tweaking various drivers / perf sources to add yet more bits into the config attributes, etc. At some point the space to do this will run out.
Using the complex configs is a possibility, although Intel PT has no trouble fitting these parameters into the event using the Linux standard mechanism. Intel PT has an equivalent to what I'm suggesting:
perf record -e intel_pt/cyc=1,cyc_thresh=4/ ...
We could go further and name cs_etm's parameters the same as PT's, since they are effectively serving the same purpose.
That still leaves the questions of
- should we change the default of 256 downwards?
- should an attempt to set a parameter lower than CCITMIN be accepted, or capped at CCITMIN?
Those questions are independent of the mechanism for changing the config.
Al
Regards
Mike
On Wed, 18 Nov 2020 at 20:47, Mathieu Poirier <mathieu.poirier@linaro.org> wrote:
Good day,
On Tue, Nov 17, 2020 at 08:11:49PM +0000, Al Grant wrote:
Hi,
When using the /cycacc=1/ option the cycle-counting threshold (setting the minimum gap between cycle packets) is set to a default of 256, which is very high and gives poor cycle resolution. Cores can manage far better than this (single digits, if not 1).
The threshold can be set lower via sysfs, but it's capped by an ID register which has a (CPU-specific) minimum possible value. Unfortunately on some widely used cores this ID register incorrectly reports 0x100, i.e. 256. The intended value for these cores is 0b100, i.e. 4.
The default threshold is rather high and means we don't get as much value out of this feature as we could. In fact, we typically get timestamp packets more often than we get cycle count packets (timestamp packets also have cycle counts, but take up more space in the trace, so it's not the best way to get high resolution cycle counting).
The effect of writing a threshold value below what the ID register says is architecturally unpredictable, so it's not something we can do in general.
What I would suggest as the way forward is to:
- set the default lower
- cap it against the value in the ID register (i.e. the default might be 10, but if the ID register on a given core says 20, program the ETM to 20)
- add an event parameter to set the threshold: -e cs_etm/cycacc=1,cycthreshold=10/
- add a 1-bit event parameter to override the checking of the ID register: -e cs_etm/cycacc=1,cycthreshold=10,cycoverride=1/
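As a rough illustration of the policy in the list above (not code from the driver), the threshold selection could be sketched in C as below. The function name pick_cyc_threshold, the two constants, and the convention that a requested value of 0 means "not specified" are all hypothetical; the default of 8 and the 10-bit parameter width are the values suggested elsewhere in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants: default and parameter width as suggested in this thread. */
#define CYC_THRESHOLD_DEFAULT 8u
#define CYC_THRESHOLD_PARAM_MAX 0x3ffu /* largest value a 10-bit event parameter can hold */

static_assert(CYC_THRESHOLD_DEFAULT <= CYC_THRESHOLD_PARAM_MAX,
              "default must fit in the event parameter");

/*
 * Pick the cycle-count threshold to program into the ETM.
 *
 * requested: value of the (hypothetical) cycthreshold parameter, 0 = not given
 * ccitmin:   minimum threshold reported by the ID register
 * override:  the (hypothetical) cycoverride bit; skips the ID register check
 */
static uint32_t pick_cyc_threshold(uint32_t requested, uint32_t ccitmin,
                                   bool override)
{
    uint32_t thresh = requested ? requested : CYC_THRESHOLD_DEFAULT;

    /* Keep the request within the width of the event parameter. */
    if (thresh > CYC_THRESHOLD_PARAM_MAX)
        thresh = CYC_THRESHOLD_PARAM_MAX;

    /* Unless overridden, never go below what the ID register claims. */
    if (!override && thresh < ccitmin)
        thresh = ccitmin;

    return thresh;
}
```

With this sketch, a core whose ID register reports 256 still gets 256 unless the override bit is set, so existing users see no change in behaviour.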
This looks quite complex just to deal with broken hardware. I would simply remove the check in cyc_threshold_store() and add a comment that TRCIDR3.CCITMIN can't be trusted. I expect people who do this kind of thing to know what they are doing and to be aware of the platform's limitations.
We could use the CPU errata mechanism to work around the incorrect ID register, but it seems overkill in this case.
I'd suggest a default of 8 for the threshold, and 10 bits for the cycthreshold event parameter. If anyone has other ideas, please say.
I have no problem lowering the default value, but it has to be something that is supported on _all_ platforms, or at least on as many platforms as supported 256. The end goal is to avoid breaking users for whom a value of 256 worked.
Thanks, Mathieu
Al
_______________________________________________
CoreSight mailing list
CoreSight@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/coresight
-- Mike Leach Principal Engineer, ARM Ltd. Manchester Design Centre. UK