Hi all,

Thanks for all of your analysis.

The requested information is as follows:
> * What is your source code?
It's the "void CCParticleSystem::update(float dt)" method of the attached CCParticleSystem.cpp file.
> * How did you compile your source code?
I compiled it with android-ndk-r8d, using the attached build_native.sh script.
I also set it to build for armeabi-v7a with NEON enabled, but NEON should have no effect because none of the source uses NEON features.
> * What compiler did you use?
I use the default compiler of android-ndk-r8d, which should be arm-linux-androideabi-4.6.
> * What platform are you testing on?
I am testing on an SP8810 device. Here is the content of /proc/cpuinfo:
root@android:/ # cat /proc/cpuinfo                                             
Processor : ARMv7 Processor rev 1 (v7l)
BogoMIPS : 1024.00
Features : swp half thumb fastmult vfp edsp thumbee neon vfpv3 
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xc05
CPU revision : 1

Hardware : SP8810
Revision : 0000
Serial : 0000000000000000
root@android:/ # 
> * Is there anyway you can generate a smaller test case?
Sorry, I am not able to do that right now.

As for "-fprefetch-loop-arrays",
I enabled it by appending the options "-O3 -fprefetch-loop-arrays", but there seems to be no improvement.
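
For reference, the manual prefetching that Will suggested might look roughly like the sketch below. This is not the code from CCParticleSystem.cpp: the Particle struct, the updateParticles function and the prefetch distance of 8 are all hypothetical, chosen only to illustrate the idea. The only real API used is GCC's __builtin_prefetch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical particle layout; the real struct in CCParticleSystem.cpp differs.
struct Particle {
    float posX, posY;
    float dirX, dirY;
    float timeToLive;
};

// Advance every particle by dt and return how many are still alive.
// While element i is processed, a later element is prefetched so that
// its load is less likely to stall the arithmetic that depends on it.
std::size_t updateParticles(std::vector<Particle>& particles, float dt) {
    // Assumed distance; the right value has to be found by measuring on the device.
    const std::size_t kPrefetchDistance = 8;
    const std::size_t n = particles.size();
    std::size_t alive = 0;

    for (std::size_t i = 0; i < n; ++i) {
        if (i + kPrefetchDistance < n) {
            // GCC builtin: second argument 1 = prefetch for write,
            // third argument 1 = low temporal locality.
            __builtin_prefetch(&particles[i + kPrefetchDistance], 1, 1);
        }

        Particle& p = particles[i];
        p.timeToLive -= dt;
        if (p.timeToLive > 0.0f) {
            p.posX += p.dirX * dt;
            p.posY += p.dirY * dt;
            ++alive;
        }
    }
    return alive;
}
```

Whether this helps at all depends on the memory access pattern and on the core; for a simple linear walk like this one the hardware may already keep up, so it would need to be profiled rather than assumed.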

Thanks,
Yongqin Liu

On 14 May 2013 22:03, Renato Golin <renato.golin@linaro.org> wrote:
On 14 May 2013 13:23, Will Newton <will.newton@linaro.org> wrote:
It looks like there is a data dependency on the preceding load, it
might be worth looking into prefetching the data, either manually or
maybe try -fprefetch-loop-arrays?

I agree with Matt on needing more info, but I also agree with Will that a pre-fetch could speed things up.

The beginning of the block is a few instructions up, and the address for the VLDR is computed by almost all of the instructions in the block, in a chain. I'm assuming (without evidence) that it's the VLDR itself that is taking all that time to release S15 for the VSUB.

Furthermore, the VLDR was hit 100x less than the VSUB, hinting that it is not stalled for long waiting on anything, so the instructions before it that calculate the offset are pretty much streamlined. That is another hint that it's the VLDR itself that is taking that long.

cheers,
--renato
