Vectorised copy

Michael Hope michael.hope at linaro.org
Wed Sep 7 00:39:19 UTC 2011


On Wed, Sep 7, 2011 at 2:14 AM, Richard Sandiford
<richard.sandiford at linaro.org> wrote:
> Michael Hope <michael.hope at linaro.org> writes:
>> While out benchmarking today, I ran across code similar to this:
>>
>> int *a;
>> int *b;
>> int *c;
>>
>> const int ad[320];
>> const int bd[320];
>> const int cd[320];
>>
>> void fill()
>> {
>>   for (int i = 0; i < 320; i++)
>>     {
>>       a[i] = ad[i];
>>       b[i] = bd[i];
>>       c[i] = cd[i];
>>     }
>> }
>>
>> I was surprised and happy to see the vectoriser kick in for the copy.
>> The inner loop looks like:
>>
>>       add     r5, r3, ip
>>       adds    r4, r3, r7
>>       vldmia  r2!, {d16-d17}
>>       vldmia  r1!, {d18-d19}
>>       adds    r0, r3, r6
>>       vst1.32 {q9}, [r5]
>>       vst1.32 {q8}, [r4]
>>       vldmia  r3, {d16-d17}
>>       adds    r3, r3, #16
>>       cmp     r3, r8
>>       vst1.32 {q8}, [r0]
>>       bne     .L3
>>
>> so r3 is the loop variable and {ip,r7,r6} are the offsets from r3 to the
>> destination pointers.  Adding a __restrict doesn't change the code.
>
> FWIW, this comes from ivopts.  I raised the "problem" on gcc@
> a few months back, but it seems to be intentional behaviour:
>
>    http://gcc.gnu.org/ml/gcc/2011-07/msg00050.html
>
> That is, all things being equal, the current code tends to prefer
> cases where it can hoist the difference between potential ivs
> rather than creating separate ivs.
>
> As far as the end of today's meeting goes: ivopts is one of those
> things on my unwritten list of areas that it would be nice to look at.
> I posted some benchmarks comparing -fivopts with -fno-ivopts to the
> benchmark list in July.  As expected, ivopts does help in a lot of cases,
> but there were also a fair number of cases where turning it off
> significantly improved performance.

Spawned into:
 https://blueprints.launchpad.net/gcc-linaro/+spec/investigate-ivopts
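
For reference, here is a rough sketch of the two loop shapes Richard
describes, reduced to one source and one destination.  The function
names are made up for illustration, and this is hand-written C rather
than anything ivopts emits.  It assumes both pointers point into the
same buffer so that the pointer subtraction is valid C.  The compiler
is of course free to turn either form into the other.

```c
#include <stddef.h>

enum { N = 320 };

/* Shape 1: a single induction variable (s) plus a loop-invariant
   difference hoisted out of the loop.  The destination address is
   recomputed from s on every iteration, much like the
   "add r5, r3, ip" in the assembly quoted above.
   Assumes dst and src point into the same array object. */
void copy_shared_iv(int *dst, int *src)
{
  ptrdiff_t delta = dst - src;   /* hoisted invariant */
  int *end = src + N;

  for (int *s = src; s != end; s++)
    s[delta] = s[0];
}

/* Shape 2: a separate induction variable per pointer.  There is no
   per-iteration address add, but one more register is stepped in
   the loop. */
void copy_separate_ivs(int *dst, const int *src)
{
  const int *end = src + N;

  for (; src != end; src++, dst++)
    *dst = *src;
}
```

Both functions copy the same N elements; they differ only in how the
destination address is formed on each iteration.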

-- Michael


