2011/8/1 Loïc Minier loic.minier@linaro.org:
On Sun, Jul 31, 2011, Paulo César Pereira de Andrade wrote:
If I understand correctly, NEON will have better support for SIMD instructions, right?
NEON is effectively a set of SIMD instructions, but not all modern SoCs have NEON, e.g. NVidia Tegra2 doesn't have NEON. It's becoming commonplace in recent SoCs though.
Either way, I used two simple benchmarks to try to sell myself on the idea of breaking compatibility with armv5 or older binaries, and I am still not convinced. But, as I said, we should use whatever "The Industry" chooses :-)
Depends which industry though. Yes, ARMv5 will be around for a while, but high-end devices, phones, tablets etc. are designed around ARMv7. Depends what your distro targets too; for instance Debian armel targets ARMv4T+ while Ubuntu armel is based on the same sources but targets ARMv7+ and uses Thumb2.
I used for benchmark http://www.tux.org/~mayer/linux/bmark.html and http://www.linuxfordevices.com/c/a/Linux-For-Devices-Articles/Why-ARMs-EABI-... and also compared with my home computer (quad)core i5 x86_64, and attached results...
You're likely not going to see much difference switching float ABI alone with common benchmarks, because GCC is clever enough to use the best possible ABI for non-public functions. It's mostly visible when floating-point values cross library call boundaries.
Yes. That is what I noticed. Trying a simple test case of what should be the worst case for softfp, e.g. this "dumb" program:
-%<-
#include <stdio.h>

__attribute__((noinline)) double
d_d(double a, double b, double c, double d,
    double e, double f, double g, double h)
{
    return a + b + c + d + e + f + g + h;
}

int
main(int argc, char *argv[])
{
    int i;
    double d;

    for (d = 0.0, i = 0; i < 100000000; i++)
        d += d_d(d, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0);
    printf("%f\n", d);

    return 0;
}
-%<-
and compiling with -O0 (otherwise gcc is smart enough to figure out the constant additions), or other variants of the above, I see softfp being 20-25% slower on the PandaBoard.
Besides, I have yet to be convinced that it is worth breaking compatibility with armv5 binaries for special corner cases like linking to libraries with functions that take several "float by value arguments" (mostly because gcc will optimize most of it for "internal functions")... There is also the issue of the hard-float ABI still using the softfp convention for varargs, e.g. printf...
You should however definitely see a difference between ARM mode and Thumb-2 mode (which is ARMv6+ only IIRC), as the code is denser and fits more easily in CPU cache.
There were many discussions around this on the debian-arm list last year; Konstantinos Margaritis collected benchmarks for hard-float which should be linked from http://wiki.debian.org/ArmHardFloatPort
I tried to follow all links etc from there, but the best I found was an interesting example of wrapping a single assembly instruction in a function call; somewhat like my example above (where it is required to use -O0 to see the difference), and I am afraid some tests, following links from there, were even using different compilers...
Either way, so far I have only tested on a remote PandaBoard, and I have yet to see a "physical" armv7, so blame me for not testing on other hardware (due to lack of access). But from more "readings", it would be a shame if the switch to an incompatible ABI was mainly to satisfy binary blobs like nvidia binary video drivers... (you can bash me now for saying that).
HTH
Loïc Minier
Paulo