Greetings,

I'm experiencing what appears to be a minimum clock resolution issue when using clock_gettime() on a PandaBoard ES running Ubuntu.
> uname -r
3.1.1-8-linaro-lt-omap

> cat /proc/version
Linux version 3.1.1-8-linaro-lt-omap (buildd@diphda) (gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3) ) #8~lt~ci~20120118001257+025756-Ubuntu SMP PREEMPT Thu Jan 19 09:
I'm using clock_gettime() (and have tried gettimeofday()) to compute the elapsed time around roughly 15ms of computation (image processing). While the computed time is stable on my x86_64 machine, it is not on my PandaBoard ES. I have tried various clocks (e.g. CLOCK_REALTIME), but the issue remains. No error codes are returned by clock_gettime().

The result on my x86_64 machine looks like this:
elapsed (s)   elapsed (ns)   elapsed (us)     time (after)                     time (before)
         0s       532260ns          532us     (t1:  73741s     92573265ns)     (t0:  73741s     92041005ns)
         0s       544413ns          544us     (t1:  73741s    109390136ns)     (t0:  73741s    108845723ns)
         0s       529328ns          529us     (t1:  73741s    126024860ns)     (t0:  73741s    125495532ns)

Total: 1.7s; 0.536ms per iteration on average.


On my PandaBoard ES, however, some iterations compute an elapsed time of 0us:
elapsed (s)   elapsed (ns)   elapsed (us)     time (after)                     time (before)
         0s            0ns            0us     (t1: 269529s    192626951ns)     (t0: 269529s    192626951ns)
         0s            0ns            0us     (t1: 269529s    215606688ns)     (t0: 269529s    215606688ns)
         0s      2655030ns         2655us     (t1: 269529s    252349852ns)     (t0: 269529s    249694822ns)
         0s      2593994ns         2593us     (t1: 269529s    286163328ns)     (t0: 269529s    283569334ns)
         0s        30518ns           30us     (t1: 269529s    317657469ns)     (t0: 269529s    317626951ns)

If I crank up the amount of work done between the time calls (timetest.c:18: inneriters = 1e7;) such that the timed loop takes around 72ms, the timing results seem accurate and none of the intermediate calculations result in a 0us elapsed time. If I reduce it to around 10-25ms (inneriters=1e6), I get occasional 0us elapsed times. Around 2ms (inneriters=1e5), most results measure an elapsed time of 0us.

I'm trying to optimize image processing functions, which take on the order of 2-15ms each. Am I stuck with this timing resolution? I also want to avoid masking effects such as cache behavior while timing, which could happen if I repeatedly process the same image and average the results. Even so, averaging currently seems like the best option.

Source code and makefile are attached, along with my /proc/timer_list.

Is this a property of the hardware, or might it be a bug?

Thanks,
Andrew