O2 optimization with vectorize

Singh, Ravi Kumar (Ravi) Ravi.Singh at lsi.com
Wed Apr 11 14:34:34 UTC 2012


All,

In the code below, I tried a few compiler options and made the following observations:


1)      arm-linux-gnueabi-gcc -O2 -mcpu=cortex-a15 -mfpu=neon -ftree-vectorizer-verbose=6  -ftree-vectorize



The compiler emits the following notes:



foo.c:16: note: not vectorized: unsupported use in stmt.
foo.c:16: note: not vectorized: unsupported use in stmt.
foo.c:18: note: not vectorized: unsupported use in stmt.
foo.c:18: note: not vectorized: unsupported use in stmt.





2)      -O2 -mcpu=cortex-a15 -mfpu=neon


In neither case does the generated code contain any NEON instructions. The code generated in case 1 takes 3000 cycles, while the code generated in case 2 takes 2500 cycles.

Even if vectorization fails in case 1, the compiler should not generate less efficient code than in case 2. I expected the executables from both cases to take the same number of cycles: anything done during an unsuccessful vectorization attempt should be reverted.

###################################################################
#include <stdio.h>

#define SIZE1 20
#define SIZE2 26

unsigned int array[SIZE1][SIZE2];

void  foo()
{
  unsigned int i,j;
  unsigned int max = 0;
  unsigned int index = 0;  /* column of the current maximum */

  for(i = 0; i < SIZE1; i++)
  {
    for(j = 0; j < SIZE2; j++)
    {
      if (array[i][j] > max)
      {
        max = array[i][j];
        index = j;
      }
    }
  }

  printf("Max value: %u Index: %u\n", max, index);
}
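
For comparison, here is a rough sketch of a two-pass variant of the same search. It assumes, without having verified it on this toolchain, that the conditional update of the index inside the loop is what the vectorizer rejects. The first pass is a plain max reduction with no other loop-carried state, and the second pass looks up the column of the first element equal to that maximum. The names find_max, find_index and foo_two_pass are made up for this illustration. Whether GCC actually vectorizes the first pass still has to be checked with -ftree-vectorizer-verbose.

```c
#include <stdio.h>

#define SIZE1 20
#define SIZE2 26

unsigned int array[SIZE1][SIZE2];

/* Pass 1 (hypothetical helper): max reduction only, no index tracking. */
unsigned int find_max(void)
{
  unsigned int i, j;
  unsigned int max = 0;

  for (i = 0; i < SIZE1; i++)
    for (j = 0; j < SIZE2; j++)
      if (array[i][j] > max)
        max = array[i][j];

  return max;
}

/* Pass 2 (hypothetical helper): column of the first element equal to max,
   scanning in the same row-major order as the original loop. */
unsigned int find_index(unsigned int max)
{
  unsigned int i, j;

  for (i = 0; i < SIZE1; i++)
    for (j = 0; j < SIZE2; j++)
      if (array[i][j] == max)
        return j;

  return 0;  /* not reached when max came from find_max() */
}

void foo_two_pass(void)
{
  unsigned int max = find_max();
  unsigned int index = find_index(max);

  printf("Max value: %u Index: %u\n", max, index);
}
```

Because the original loop only updates on a strictly greater value, it also reports the column of the first occurrence of the maximum, so both versions should print the same result. The second pass costs an extra scan of the array, so this is only worthwhile if the first pass really does get vectorized.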



Regards
RKS


More information about the linaro-toolchain mailing list