Re: [Linaro-dev] Discussion on new ARM hard-float port on debian-arm@

9 Jul 2010


Hi there,
On Thu, Jul 8, 2010 at 7:44 PM, Hector Oron hector.oron@gmail.com wrote:
...
Hello,
2010/7/8, JD Zheng entropy.zjd@gmail.com:
...
I don't quite get why A9 won't gain much by using hard float.
Because A9 benefits from `softfp`, which is compatible with `soft`.
In theory, hard floating point (incompatible with soft*) should not be
much of a win over softfp on A9 cores, which have a much better
structured pipeline.
Regarding this discussion, I strongly advocate getting some benchmarks
--- we should be careful about drawing conclusions like "won't be much
of a win on A9" without some quantification.
For all v7 processors (A8, A9, etc.) the hardfp ABI will increase the
register bandwidth for function calls.  In some cases of floating-point
intensive code, the increase will be substantial.  For VFPv2 or
VFPv3-D16:
  * Up to 8 double-precision arguments, or 16 single-precision
arguments can be passed in fp registers, in addition to the usual
limit of up to 4 integer or pointer arguments in the integer regs.
This can eliminate many instructions at call sites and can reduce
stack frame size and cache footprint, particularly in and around leaf
functions.  For C++ the benefit increases again due to the presence of
'this' as an implicit first argument in member functions: a C++ member
function with a single explicit double argument will use r0 for the
'this' pointer and r1 will be wasted because double arguments must be
padded to an even-numbered register in the register bank.  So hardfp
could allow up to three extra integer/pointer arguments to be moved
from the stack into registers in such cases.
  * A floating-point result can be returned in an fp register and used
directly by the caller
  * Moving values between the floating-point and integer pipelines can
be reduced.  This is a benefit on all processors, particularly for
floating->integer moves, but as discussed previously the benefit is
significantly greater on A8 than it is on A9.
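To make the register-allocation argument above concrete, here is a
rough sketch in C that counts how many bytes of arguments spill to the
stack under the two conventions.  It is a simplified model, not the
full procedure call standard: it only knows about word-sized
integer/pointer arguments, floats and doubles, it assumes 16
single-precision (8 double-precision) argument registers as on VFPv2 or
VFPv3-D16, and the names arg_kind, stack_bytes_softfp and
stack_bytes_hardfp are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical argument kinds, for this sketch only. */
typedef enum { ARG_INT, ARG_FLOAT, ARG_DOUBLE } arg_kind;

/* Simplified soft/softfp model: every argument goes through r0-r3. */
static unsigned stack_bytes_softfp(const arg_kind *args, size_t n)
{
    unsigned ncrn = 0;   /* next free core register, 0..4 (4 = none left) */
    unsigned stack = 0;  /* bytes of arguments placed on the stack */

    assert(args != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        unsigned words = (args[i] == ARG_DOUBLE) ? 2 : 1;
        if (words == 2)
            ncrn = (ncrn + 1) & ~1u;        /* doubles start at an even register */
        if (ncrn + words <= 4) {
            ncrn += words;
        } else {
            ncrn = 4;                       /* once one argument is stacked, the rest follow */
            if (words == 2)
                stack = (stack + 7) & ~7u;  /* doubles are 8-byte aligned on the stack */
            stack += 4 * words;
        }
    }
    return stack;
}

/* Simplified hard-float model: floats/doubles use s0-s15 (d0-d7),
 * integers and pointers still use r0-r3. */
static unsigned stack_bytes_hardfp(const arg_kind *args, size_t n)
{
    unsigned ncrn = 0;       /* next free core register */
    unsigned s_used = 0;     /* bitmap of allocated s0..s15 */
    int vfp_closed = 0;      /* set once any fp argument has been stacked */
    unsigned stack = 0;

    assert(args != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (args[i] == ARG_INT) {
            if (ncrn < 4)
                ncrn++;
            else
                stack += 4;
            continue;
        }
        /* A double needs an even-aligned pair of s registers; a float
         * takes the lowest free single register (back-filling gaps). */
        unsigned step = (args[i] == ARG_DOUBLE) ? 2 : 1;
        unsigned mask = (args[i] == ARG_DOUBLE) ? 3u : 1u;
        int placed = 0;
        for (unsigned r = 0; !vfp_closed && r < 16; r += step) {
            if ((s_used & (mask << r)) == 0) {
                s_used |= mask << r;
                placed = 1;
                break;
            }
        }
        if (!placed) {
            vfp_closed = 1;
            if (args[i] == ARG_DOUBLE)
                stack = (stack + 7) & ~7u;
            stack += 4 * step;
        }
    }
    return stack;
}
```

For the C++ member function case described above ('this', one double,
then three integer arguments), this model puts 12 bytes on the stack
for softfp and none for hardfp.  For eight double arguments it gives 48
bytes versus none.  How much that saves in real code still needs
measuring.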
One particular issue we have is that the toolchain cannot easily
handle intermixing of multiple ABIs, so it isn't straightforward to
use a different ABI (hard) internally to a library or shared object
compared with the ABI (softfp) used at the public interface.  This
means that some libs which may get significant benefit from the hard
fp ABI to accelerate internal function calls (such as libm, as well as
any computational library) cannot be built using the hard fp ABI
internally without doing significant work, unless the whole system is
built with hard fp.  It's certainly not something we can achieve by
simply using different build options for targeted libraries, as can
be done for NEON optimisations for example.
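As a small illustration of the float ABI being a property of the whole
build rather than of individual functions, the sketch below reports
which ABI a translation unit was compiled for.  It relies on my
understanding of GCC's predefined macros on 32-bit ARM (__ARM_PCS_VFP
for the hard-float calling convention, __SOFTFP__ for pure software
floating point); treat that mapping as an assumption and check it
against your compiler.  The function name float_abi_name is made up
for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Returns a label for the float ABI this file was compiled with.
 * Assumption: macro meanings as predefined by GCC for 32-bit ARM. */
static const char *float_abi_name(void)
{
#if defined(__ARM_PCS_VFP)
    return "hard";            /* fp arguments passed in VFP registers */
#elif defined(__arm__) && defined(__SOFTFP__)
    return "soft";            /* no fp instructions, integer-register ABI */
#elif defined(__arm__)
    return "softfp";          /* fp instructions, integer-register ABI */
#else
    return "not 32-bit ARM";  /* e.g. when built on a desktop host */
#endif
}
```

Objects that report "hard" and objects that report "soft" or "softfp"
disagree about where floating-point arguments live, which is why the
choice has to be made consistently across everything that is linked
together.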
Judging how much these changes will improve the performance of
real-world code, and how the improvement compares on A9 versus A8, is
difficult without doing some benchmarking though.
Cheers
---Dave
