About arm choice of toolchain options
Paulo César Pereira de Andrade
paulo.cesar.pereira.de.andrade at gmail.com
Wed Aug 3 02:21:02 UTC 2011
2011/8/1 Loïc Minier <loic.minier at linaro.org>:
> On Sun, Jul 31, 2011, Paulo César Pereira de Andrade wrote:
>> If I understand correctly, neon will have better support for
>> simd instructions right?
>
> NEON is effectively SIMD instructions, but not all modern SoCs have
> NEON, e.g. NVidia Tegra2 doesn't have NEON. It's becoming commonplace
> in recent SoCs though.
>
>> Either way, I used two simple benchmarks to try to sell
>> myself the idea of breaking compatibility with armv5 or
>> older binaries, but still not convinced, but, as I said, we
>> should use whatever "The Industry" chooses :-)
>
> Depends which industry though. Yes, ARMv5 will be around for a while,
> but high-end devices, phones, tablets etc. are designed around ARMv7.
> Depends what your distro targets too; for instance Debian armel targets
> ARMv4T+ while Ubuntu armel is based on the same sources but targets
> ARMv7+ and uses Thumb2.
>
>> I used for benchmark http://www.tux.org/~mayer/linux/bmark.html
>> and http://www.linuxfordevices.com/c/a/Linux-For-Devices-Articles/Why-ARMs-EABI-matters/
>> and also compared with my home computer (quad)core i5 x86_64,
>> and attached results...
>
> You're likely not going to see much difference switching float ABI
> alone with common benchmarks, because GCC is clever enough to use the
> best possible ABI for non-public functions. It's mostly visible when
> you're crossing library calls with floating points.
Yes, that is what I noticed. I tried a simple test case of what should
be the worst case for softfp, this "dumb" program:
-%<-
#include <stdio.h>

/* noinline keeps gcc from inlining the call away */
__attribute__((noinline))
double d_d(double a, double b, double c, double d,
           double e, double f, double g, double h)
{
    return a + b + c + d + e + f + g + h;
}

int
main(int argc, char *argv[])
{
    int i;
    double d;

    /* 100 million calls, each feeding the running total back in */
    for (d = 0.0, i = 0; i < 100000000; i++)
        d += d_d(d, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0);
    printf("%f\n", d);
    return 0;
}
-%<-
Compiled with -O0 (otherwise gcc is smart enough to figure out the
constant additions), this program and other variants of it run
20-25% slower with softfp on a panda board.
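A minimal sketch of how such a comparison could be timed from inside
the program instead of with an external timer. The names sum8,
now_seconds and time_calls are hypothetical and not part of the test
above; the sketch assumes a POSIX system that provides clock_gettime
(older glibc versions may need -lrt at link time).

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Same shape as d_d() above: eight doubles passed by value. */
__attribute__((noinline))
double sum8(double a, double b, double c, double d,
            double e, double f, double g, double h)
{
    return a + b + c + d + e + f + g + h;
}

/* Hypothetical helper: monotonic clock reading in seconds. */
static double now_seconds(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc; /* silence the unused warning when NDEBUG is defined */
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Hypothetical helper: run the call loop `iterations` times, store the
 * accumulated value in *result (so the loop cannot be discarded) and
 * return the elapsed time in seconds.
 */
double time_calls(long iterations, double *result)
{
    double acc = 0.0;
    double start = now_seconds();
    long i;

    for (i = 0; i < iterations; i++)
        acc += sum8(acc, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0);
    *result = acc;
    return now_seconds() - start;
}
```

Calling time_calls(100000000, &d) from main and printing the returned
value, once in a build with -O0 -mfloat-abi=softfp and once with
-O0 -mfloat-abi=hard (each needs a toolchain and libc built for that
abi), should give numbers comparable to the ones above.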
Still, I am yet to be convinced that breaking compatibility with armv5
binaries is justified by special corner cases like linking to
libraries whose functions take several "float by value arguments"
(mostly because gcc will optimize most of it for "internal
functions")... There is also the issue of the hard-float abi still
using the softfp convention for varargs, e.g. printf...
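To illustrate the varargs point, here is a small sketch of a variadic
counterpart of d_d(). The name sum_va is hypothetical. As far as I
understand the ARM procedure call standard, variadic functions follow
the base convention even in a hard-float build, so the doubles below
would travel in core registers and on the stack, the same way they do
for printf.

```c
#include <assert.h>
#include <stdarg.h>

/*
 * Hypothetical variadic sum: `count` says how many double arguments
 * follow. Being variadic, it is not expected to receive its arguments
 * in VFP registers, whatever float abi the rest of the program uses.
 */
double sum_va(int count, ...)
{
    va_list ap;
    double total = 0.0;
    int i;

    assert(count >= 0);
    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, double);
    va_end(ap);
    return total;
}
```

Replacing the d_d() call in the loop with something like
sum_va(8, d, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0) would be one way to
check whether softfp and hard builds differ at all for this kind of
call.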
> You should however definitely see a difference between ARM mode and
> Thumb-2 mode (which is ARMv6+ only IIRC), as the code is denser and
> fits more easily in CPU cache.
>
> There were many discussions around this on the debian-arm list last
> year; Konstantinos Margaritis collected benchmarks for hard-float which
> should be linked from
> http://wiki.debian.org/ArmHardFloatPort
I tried to follow all the links from there, but the best I found was
an interesting example of wrapping a single assembly instruction in a
function call, somewhat like my example above (where -O0 is required
to see the difference). I am afraid some of the tests linked from
there were even using different compilers...
Either way, so far I have only tested on a remote panda board,
and I have yet to see a "physical" armv7, so blame me for not
testing on other hardware (due to lack of access). But from further
reading, it would be a shame if the switch to an incompatible
abi was mainly to satisfy binary blobs like nvidia binary video
drivers... (you can bash me now for saying that).
> HTH
> --
> Loïc Minier
Paulo
More information about the cross-distro
mailing list