Hi,
When using the /cycacc=1/ option, the cycle-counting threshold (which sets the minimum gap between cycle packets) defaults to 256, which is very high and gives poor cycle resolution. Cores can manage far better than this (single digits, if not 1).
The threshold can be set lower via sysfs, but it's capped by an ID register which holds a (CPU-specific) minimum possible value. Unfortunately, on some widely used cores this ID register incorrectly reports 0x100, i.e. 256. The intended value for these cores is 0b100, i.e. 4.
With the threshold this high, we don't get as much value out of this feature as we could. In fact, we typically get timestamp packets more often than we get cycle count packets (timestamp packets also carry cycle counts, but they take up more space in the trace, so they're not the best way to get high resolution cycle counting).
The effect of writing a threshold value below what the ID register says is architecturally unpredictable, so it's not something we can do in general. What I would suggest as the way forward is to:
- set the default lower
- cap it against the value in the ID register (i.e. the default might be 10, but if the ID register on a given core says 20, program the ETM to 20)
- add an event parameter to set the threshold:
    -e cs_etm/cycacc=1,cycthreshold=10/
- add a 1-bit event parameter to override the checking of the ID register:
    -e cs_etm/cycacc=1,cycthreshold=10,cycoverride=1/
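
As a rough sketch of the selection logic in the list above (Python only for brevity; pick_cyc_threshold, the constant names and the rejection of out-of-range values are made up for illustration and are not existing driver code; the default of 8 and the 10-bit width are just the values suggested in this mail):

```python
DEFAULT_CYC_THRESHOLD = 8   # assumed default, as suggested in this mail
CYC_THRESHOLD_BITS = 10     # assumed width of the cycthreshold parameter


def pick_cyc_threshold(id_min, requested=None, override=False):
    """Return the threshold that would be programmed into the ETM.

    id_min:    minimum threshold reported by the ID register
    requested: cycthreshold from the event, or None if not given
    override:  cycoverride bit; skips the ID register check when set
    """
    thresh = DEFAULT_CYC_THRESHOLD if requested is None else requested
    # Assumption: only 1..1023 is accepted for a 10-bit parameter.
    if not 0 < thresh < (1 << CYC_THRESHOLD_BITS):
        raise ValueError("cycthreshold out of range")
    if not override and thresh < id_min:
        thresh = id_min  # cap against the ID register
    return thresh


# Core reporting 20: the default of 8 is raised to 20.
print(pick_cyc_threshold(id_min=20))
# Core wrongly reporting 0x100: a request of 10 is raised to 256 ...
print(pick_cyc_threshold(id_min=0x100, requested=10))
# ... unless the override bit is set, in which case 10 is used.
print(pick_cyc_threshold(id_min=0x100, requested=10, override=True))
```

The override only bypasses the check; whether the core then behaves sensibly is still up to the user, given the unpredictable behaviour mentioned above.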
We could use the CPU errata mechanism to work around the incorrect ID register, but it seems overkill in this case.
I'd suggest a default of 8 for the threshold, and 10 bits for the cycthreshold event parameter. If anyone has other ideas, please say.
Al