Greetings,
I'm experiencing what appears to be a minimum clock resolution issue when using clock_gettime() on a PandaBoard ES running Ubuntu.

*> uname -r*
3.1.1-8-linaro-lt-omap

*> cat /proc/version*
Linux version 3.1.1-8-linaro-lt-omap (buildd@diphda) (gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3) ) #8~lt~ci~20120118001257+025756-Ubuntu SMP PREEMPT Thu Jan 19 09:
I'm using clock_gettime() (and have tried gettimeofday()) to compute the elapsed time around roughly 15ms of computation (image processing). While the computed time is stable on my x86_64 machine, it is not on my PandaBoard ES. I have tried various clocks (e.g. CLOCK_REALTIME), but the issue remains. No error codes are returned by clock_gettime().
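For reference, here is a minimal sketch of the kind of measurement described above. It is not the attached timetest.c; the helper names `elapsed_ns` and `reported_resolution_ns` are made up for illustration. It subtracts two `struct timespec` readings and also asks the kernel, via clock_getres(), what resolution it reports for a given clock. The reported resolution may not match the granularity actually observed, so it is only a first thing to check.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime()/clock_getres() under strict -std= modes */
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: t1 - t0 in nanoseconds. Assumes t1 was read after t0. */
static int64_t elapsed_ns(struct timespec t0, struct timespec t1)
{
    return (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
         + (int64_t)(t1.tv_nsec - t0.tv_nsec);
}

/* Hypothetical helper: resolution the kernel reports for clk, in ns, or -1 on error. */
static int64_t reported_resolution_ns(clockid_t clk)
{
    struct timespec res;
    if (clock_getres(clk, &res) != 0)
        return -1;
    return (int64_t)res.tv_sec * 1000000000LL + (int64_t)res.tv_nsec;
}
```

Typical use would be to read `t0` and `t1` with clock_gettime(CLOCK_MONOTONIC, ...) around the work, pass them to `elapsed_ns`, and print `reported_resolution_ns(CLOCK_MONOTONIC)` once at startup. On older glibc versions, linking may require `-lrt`.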
The result on my x86_64 machine looks like this:
*elapsed (s) elapsed (ns) elapsed (us) time (after) time (before)*
0s 532260ns *532us* (t1: 73741s 92573265ns) (t0: 73741s 92041005ns)
0s 544413ns *544us* (t1: 73741s 109390136ns) (t0: 73741s 108845723ns)
0s 529328ns *529us* (t1: 73741s 126024860ns) (t0: 73741s 125495532ns)
A: 1.7s in total. *0.536ms* on average.
On my PandaBoard ES, however, some iterations report an elapsed time of 0us:

*elapsed (s) elapsed (ns) elapsed (us) time (after) time (before)*
0s 0ns *0us* (t1: 269529s 192626951ns) (t0: 269529s 192626951ns)
0s 0ns *0us* (t1: 269529s 215606688ns) (t0: 269529s 215606688ns)
0s 2655030ns *2655us* (t1: 269529s 252349852ns) (t0: 269529s 249694822ns)
0s 2593994ns *2593us* (t1: 269529s 286163328ns) (t0: 269529s 283569334ns)
0s 30518ns *30us* (t1: 269529s 317657469ns) (t0: 269529s 317626951ns)
If I crank up the amount of work done between the time calls (timetest.c:18: inneriters = 1e7;) such that the timed loop takes around 72ms, the timing results seem accurate and none of the intermediate calculations result in a 0us elapsed time. If I reduce it to around 10-25ms (inneriters=1e6), I get occasional 0us elapsed times. Around 2ms (inneriters=1e5), most results measure an elapsed time of 0us.
I'm trying to optimize image processing functions, each of which takes on the order of 2-15ms to run. Am I stuck with this timing resolution? Currently, the best option seems to be processing the same image repeatedly and averaging the results, but I'm wary of that: repeated runs over the same data could hide effects like cache performance that I want the timing to capture.
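To make the averaging idea concrete, here is a rough sketch, assuming a hypothetical `work_fn` callback in place of the real image processing function. `mean_ns_per_call` and `spin` are illustrative names, not part of the attached code. It takes one pair of clock_gettime() readings around `reps` calls, so the clock granularity is spread over the whole batch. As noted, every call after the first runs on warm caches, so the mean may understate the cost of a single cold call.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose clock_gettime() under strict -std= modes */
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical callback type standing in for one image processing call. */
typedef void (*work_fn)(void *arg);

/* Mean nanoseconds per call of fn over reps back-to-back calls, or -1.0 on error. */
static double mean_ns_per_call(work_fn fn, void *arg, int reps)
{
    struct timespec t0, t1;
    int i;

    if (reps <= 0)
        return -1.0;
    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return -1.0;
    for (i = 0; i < reps; i++)
        fn(arg);
    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
        return -1.0;

    /* Total elapsed time for the whole batch, in nanoseconds. */
    int64_t total = (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
                  + (int64_t)(t1.tv_nsec - t0.tv_nsec);
    return (double)total / (double)reps;
}

/* Placeholder workload (an assumption, not real image processing):
 * accumulates into a volatile sink so the loop is not optimized away. */
static void spin(void *arg)
{
    volatile unsigned long *sink = arg;
    unsigned long i;
    for (i = 0; i < 100000UL; i++)
        *sink += i;
}
```

Calling `mean_ns_per_call(spin, &sink, 50)` with an `unsigned long sink` gives the per-call mean for the placeholder loop. Choosing `reps` so the batch runs for several tens of milliseconds keeps it in the range where the measurements above looked accurate.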
Source code and makefile are attached, along with the contents of /proc/timer_list.
Is this a property of the hardware, or might it be a bug?
Thanks, Andrew