Hi Matthew,

On Tue, Oct 9, 2012 at 11:37 PM, Matthew Gretton-Dann <matthew.gretton-dann@linaro.org> wrote:
On 9 October 2012 14:44, Jubi Taneja <jubitaneja@gmail.com> wrote:
>
>
> On Tue, Oct 9, 2012 at 5:21 PM, Matthew Gretton-Dann
> <matthew.gretton-dann@linaro.org> wrote:
>>
>> >> /* arm-none-linux-gnueabi-gcc -mcpu=cortex-a15 -mfpu=vfpv4 -S -o-
>> >> /tmp/fma.c -mfloat-abi=hard -O2 */
>> >> float f(float a, float b, float c)
>> >> {
>> >>   return a * b + c;
>> >> }
>> >> /* end of fma.c */
>> >>
>> >> (Note that -mfloat-abi=softfp will also work in this example.  Which
>> >> one you want to use depends on whether you have configured your system
>> >> for hard or soft-float ABIs).
>> >>
>> > I checked with both -mfpu=vfpv3 and -mfpu=vfpv4, and they generate the
>> > same assembly code: a VMLA insn is emitted in both cases. I was
>> > wondering if I can get a test case that lets me observe the difference
>> > in the two objdumps.
>>
>> Which compiler are you using?  VFMA support is only in trunk FSF GCC.
>> Linaro has not yet backported support to 4.7.
>
>
> I am using FSF GCC only.

What version of GCC? (What does arm-none-linux-gneabi-gcc -v report?)
# arm-none-linux-gneabi-gcc -v
Using built-in specs.
COLLECT_GCC=arm-none-linux-gneabi-gcc
COLLECT_LTO_WRAPPER=/opt/toolchains/arm/bin/../libexec/gcc/arm-none-linux-gneabi/4.6.3/lto-wrapper
Target: arm-none-linux-gneabi
Configured with: /home/user/arm-src/build/sources/gcc_1/configure --build=i686-pc-linux-gnu --host=i686-pc-linux-gnu --target=arm-none-linux-gneabi --prefix=/opt/arm --with-sysroot=/opt/arm/arm-none-linux-gneabi/sys-root --disable-libmudflap --disable-libssp --disable-libgomp --disable-nls --disable-libstdcxx-pch --with-interwork --with-mode=arm --with-fpu=vfp3 --with-cpu=cortex-a9 --with-tune=cortex-a9 --with-float=softfp --enable-extra-vd-multilibs --enable-poison-system-directories --enable-long-long --enable-threads --enable-languages=c,c++ --enable-shared --enable-lto --enable-symvers=gnu --enable-__cxa_atexit --with-pkgversion=arm-toolchain.v1  --with-gnu-as --with-gnu-ld --with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm' --with-build-time-tools=/opt/arm/bin --with-gmp=/opt/arm --with-mpfr=/opt/arm --with-ppl=/opt/arm --with-cloog=/opt/arm --with-libelf=/opt/arm
Thread model: posix
gcc version 4.6.3 (arm-toolchain.v1) 

When I compile the test case above with a recent (within last month or
so) trunk GCC I get the following output which uses vfma:

$ /work/builds/gcc-fsf-arm-none-linux-gnueabi/tools/bin/arm-none-linux-gnueabi-gcc
-mcpu=cortex-a15 -mfpu=vfpv4 -S -o- /tmp/fma.c -mfloat-abi=hard -O2
        .cpu cortex-a15
        .eabi_attribute 27, 3
        .eabi_attribute 28, 1
        .fpu vfpv4
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .eabi_attribute 26, 2
        .eabi_attribute 30, 2
        .eabi_attribute 34, 1
        .eabi_attribute 18, 4
        .file   "fma.c"
        .text
        .align  2
        .global f
        .type   f, %function
f:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        vfma.f32        s2, s0, s1
        fcpys   s0, s2
        bx      lr
        .size   f, .-f
        .ident  "GCC: (GNU) 4.8.0 20120913 (experimental)"
        .section        .note.GNU-stack,"",%progbits

--


$ arm-none-linux-gnueabi-gcc -mcpu=cortex-a15 -mfpu=vfpv4 -S -o- prog.c -O2
    .cpu cortex-a15
    .eabi_attribute 27, 3
    .fpu vfpv4
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 2
    .eabi_attribute 34, 0
    .eabi_attribute 18, 4
    .file    "prog.c"
    .section    .text.f,"ax",%progbits
    .align    2
    .global    f
    .type    f, %function
f:
    .fnstart
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    fmsr    s14, r0
    fmsr    s13, r2
    fmsr    s15, r1
    fmacs    s13, s14, s15
    fmrs    r0, s13
    bx    lr
    .fnend
    .size    f, .-f
    .section    .text.startup.main,"ax",%progbits
    .align    2
    .global    main
    .type    main, %function
main:
    .fnstart
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    bx    lr
    .fnend
    .size    main, .-main
    .ident    "GCC: (VDLinux.GA1.2012-10-03) 4.6.4"
    .section    .note.GNU-stack,"",%progbits

I could not work out what the difference between the two results means, or what the overall conclusion is for my query. Could you please guide me on how to dig deeper into it?

Jubi
Matthew Gretton-Dann
Linaro Toolchain Working Group
matthew.gretton-dann@linaro.org