Thanks Al. Sorry, I was not quite clear here. What I wanted to understand is how exactly the timestamp generator increments its counter. I guess it is F times per second, where F is the frequency of TSGEN's clock, but I may be wrong.
Generally yes, though this will be a fixed-frequency system clock which may be independent of the CPU clock. Register 0x020 in TSGEN holds the frequency, but I believe this is simply a place for software to make a note of the value in case other software wants to know. I.e. it's zero unless something writes it, and writing it has no effect on the actual frequency.
Whoever first enables the timestamp, if it finds register 0x020 is zero, ought to measure the frequency and make a note of it.
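That "measure and make a note of it" step could look something like the sketch below. This is only an illustration with hypothetical helper names; on real hardware `read_counter` would be a memory-mapped read of the TSGEN counter, which is platform-specific.

```python
# Hypothetical sketch: estimate the timestamp generator's frequency by
# sampling its counter over a known wall-clock interval, for the case
# where register 0x020 reads as zero. read_counter stands in for a real
# memory-mapped read of the TSGEN counter value.
import time

def measure_tsgen_frequency(read_counter, interval_s: float = 0.1) -> int:
    """Return an estimate of the counter frequency in Hz."""
    t0 = time.monotonic()
    start = read_counter()
    time.sleep(interval_s)
    end = read_counter()
    elapsed = time.monotonic() - t0  # use the actual elapsed time, not the requested one
    return round((end - start) / elapsed)
```

Software that first enables the timestamp could then write the result into register 0x020 for later consumers, per the convention described above.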
Thanks for this great write-up. So, combining this with Al's suggestions, I think what needs to be done is the following:
1. When a timestamp packet is encountered, get the packet queue and iterate backwards until the previous timestamp packet or a discontinuity occurs.
2. For each encountered range/exception/exception_ret packet:
   - if this packet happened immediately before the timestamp packet, mark it with the value of this timestamp, decremented by the number of instructions executed in this range
I don't think you need to decrement. The TS packet applies to the most recent branch or exception. Also, decrementing by number of instructions is definitely wrong as there is little relation between instruction counts and timing of any sort.
   - if this is a subsequent non-TS packet, mark it with the timestamp value of the previous non-TS packet, decremented further in the same fashion.
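Putting Al's correction together with the plan above, the backward walk might look like the sketch below. The packet kinds and classes are hypothetical stand-ins for the real decoder's types; following the correction, there is no per-instruction decrement: the timestamp applies to the most recent branch or exception, and the earlier packets in the window simply inherit the same value.

```python
# Hypothetical sketch: when a TS packet arrives, walk the queued packets
# backwards, marking range/exception/exception_ret packets with the TS
# value, and stop at the previous TS packet or a discontinuity.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Packet:
    kind: str                        # e.g. "range", "exception", "exception_ret", "ts", "discont"
    timestamp: Optional[int] = None  # filled in when a TS packet is seen

def apply_timestamp(queue: list, ts_value: int) -> None:
    for pkt in reversed(queue):
        if pkt.kind in ("ts", "discont"):
            break  # window ends at the previous TS packet or a discontinuity
        if pkt.kind in ("range", "exception", "exception_ret"):
            pkt.timestamp = ts_value  # no decrementing, per the correction above
```

For example, with a queue of [range, ts, range, exception], a call to apply_timestamp(queue, 500) marks only the last two packets; the first range sits before the earlier TS packet and is left alone.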
BTW, is a timestamp unit a clock cycle? It seems so, since we decrement it by instruction count on the assumption that one instruction lasts one clock cycle. If so, what would be the difference between cycacc packets and timestamp packets?
A timestamp unit is a system clock cycle of some kind.
Any system bigger than a small microcontroller would typically have several clocks. There will be the system interconnect clock, generally fixed frequency. CPUs may run at a variable ratio to the interconnect clock, and it may be possible to vary them independently of each other. In addition to this, there are the system generic timer/counter (which Linux uses for timing) and the CoreSight timestamp generator, each of which runs at yet another fixed frequency. So you might have
- system interconnect at 1GHz
- one CPU varying between 1.5GHz and 3GHz
- another CPU varying between 500MHz and 2GHz
- system generic timer/counter at 40MHz
- CoreSight timestamp generator at 100MHz
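To make the unit conversion concrete, here is a tiny worked example using the 100MHz timestamp generator frequency assumed in the list above; real code should take the frequency from the 0x020 register mentioned earlier, or measure it.

```python
# Worked example: converting a timestamp-counter delta into wall-clock time,
# assuming the 100MHz TSGEN frequency from the example above.
TSGEN_HZ = 100_000_000  # assumed, not universal

def ticks_to_ns(ticks: int, freq_hz: int = TSGEN_HZ) -> float:
    """At 100MHz, one timestamp tick is 10ns."""
    return ticks * 1e9 / freq_hz

# e.g. a delta of 250 ticks at 100MHz corresponds to 2500ns (2.5us)
```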
You are likely to see some convergence between the system generic timer, and the CoreSight timestamp generator. It's always been a bit of a bugbear that across our various logs and traces, we have these two different timebases. On the other hand, there's something to be said for having a debug-related timestamp that's protected against the sort of adjustment Linux might make to the system timer.
On top of all this lot, there is very little relation between instruction count and cycle count. At best the core will execute 3 or 4 instructions per cycle, at worst it will take tens or even hundreds of cycles to do one instruction. The actual time at which an instruction does its work may be quite loosely related to where you see it in the trace - you can see this if you use the ETM Event packets to instrument performance events, as you may get an event relating to an instruction when you haven't even had the branch atom that leads up to the instruction. Cycle counts in traces can help us understand how long instructions are taking.
Al
Thank you and best regards, Wojciech