Dave Martin wrote:
To benefit, a program must have a call to a small function that takes floating-point arguments or returns a floating-point value in an inner loop. (It must be a small function, since otherwise the parameter-passing costs will be dwarfed by the function itself.) This is a relatively rare situation, but OpenGL or the like are probably examples of where this could be important. In many cases, making the small function "inline" may be a better solution than the hardfp ABI.
This is a fair point, although a good number of projects follow the "many tiny source files" approach, so the compiler doesn't get many static functions to optimise and doesn't get much opportunity to inline. I believe libm is an example of this, but I'm prepared to be overridden...
libm (if we're talking about the version in GLIBC) is many tiny source files, but many of them do not call one another.
From the user's point of view, it's not just code of this type that will benefit, but anything that calls it --- in practice that's going to be a larger set of software.
Yes, but it's still the case that those calls must be in an inner loop. For example, if your application calls "cos" in an inner loop, then this optimization might be important. (That depends on how many cycles "cos" takes to execute, but assuming "cos" takes only 100 cycles or so, then this is going to be important.)
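To make the inner-loop case concrete, here is a minimal sketch in C. The names scale_offset and apply_gain are hypothetical and only for illustration. With the soft-float-compatible calling convention, the three float arguments travel in core integer registers on every out-of-line call; with the VFP variant they stay in floating-point registers. Marking the helper "static inline" in the same translation unit lets the compiler drop the call entirely, which is the alternative mentioned above.

```c
#include <stddef.h>

/* Hypothetical small helper: cheap enough that argument passing is a
 * noticeable share of its cost when it is called out of line. */
static inline float scale_offset(float x, float gain, float bias)
{
    return x * gain + bias;
}

/* Inner loop that calls the helper once per element. Because the helper
 * is visible and inline here, the compiler is free to fold it into the
 * loop body, so no floating-point arguments cross a call boundary. */
void apply_gain(float *samples, size_t n, float gain, float bias)
{
    for (size_t i = 0; i < n; i++)
        samples[i] = scale_offset(samples[i], gain, bias);
}
```

If scale_offset lived in a separate source file instead, the compiler could not inline it without link-time optimisation, and the calling convention would matter on every iteration.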
- Use ABI tagging (high effort, involving modifications to affected
projects - permits hardvfp ABI for explicitly selected functions)
- Build a fully hard-float world (high effort - requires packaging and
distro work to define and build a new armelfp port of the archive)
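As a rough illustration of what per-function ABI tagging could look like, GCC on ARM provides a "pcs" function attribute that selects the procedure-call standard for an individual function. The sketch below is only an assumption about how a project might use it: the macro HARDFP_CALL and the function lerp are hypothetical names, and the guard on __ARM_FP assumes a toolchain that defines that macro when hardware floating point is enabled.

```c
/* Hypothetical tagging macro. On ARM builds with GCC and an FPU enabled,
 * request the VFP variant of the procedure-call standard for the tagged
 * function; elsewhere expand to nothing so the code still builds. */
#if defined(__GNUC__) && defined(__arm__) && defined(__ARM_FP)
#define HARDFP_CALL __attribute__((pcs("aapcs-vfp")))
#else
#define HARDFP_CALL
#endif

/* The declaration carries the tag so that callers and the definition
 * agree on where the float arguments and the result are passed. In a
 * real project this line would live in a shared header. */
HARDFP_CALL float lerp(float a, float b, float t);

/* Linear interpolation between a and b; small enough that the cost of
 * moving its arguments between register files would be noticeable. */
HARDFP_CALL float lerp(float a, float b, float t)
{
    return a + t * (b - a);
}
```

The rest of the program keeps the default ABI; only the tagged functions change convention, which is why every affected project has to be modified to carry the tag on its declarations.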
Since (2) and (3) are both high-effort, perhaps it would be better to choose one or the other approach for now, rather than attempting to do both initially.
I agree. And, for what it's worth, I would try to avoid (3) at almost all costs. One of the advantages of the ARM ABI and one of the objectives of Linaro is to provide a standard platform. Life as a Linux ISV is complex enough (multiple distributions, kernel versions, etc.) without also having to worry about the ABI. I think it would be better to do quite a bit of tools work than to fall back to the approach of a completely parallel distribution.