Thumb2 code size improvements

Richard Earnshaw rearnsha at arm.com
Fri Sep 10 18:16:17 BST 2010


On Tue, 2010-09-07 at 12:24 +0100, Julian Brown wrote:
> On Tue, 7 Sep 2010 12:55:59 +0200
> Loïc Minier <loic.minier at linaro.org> wrote:
> 
> > On Tue, Sep 07, 2010, Julian Brown wrote:
> > > Do you still have the code fragment handy (I don't remember exactly how
> > > it went)?
> > 
> >  You can extract it from the wiki history with the "Info" action on
> > the page and then diffing revisions:
> 
> Oh right, I should have realised that :-).
> 
> > 1. stmdb/ldmia registers that are not used
> >  * Observations
> > {{{
> > Dump of assembler code for function history_expand_line_internal:
> >    0x00001c1c <+0>: stmdb sp!, {r4, r5, r6, r7, r8, lr}
> 
> This could be:
> 
>   push {r3, r4, r5, r6, r7, lr}
> 
> >    0x00001c20 <+4>: movs r1, #0
> > >    0x00001c22 <+6>: ldr r5, [pc, #52] ; (0x1c58 <history_expand_line_internal+60>)
> > >    0x00001c24 <+8>: mov r2, r1
> >    0x00001c26 <+10>: mov r6, r0
> >    0x00001c28 <+12>: ldr r7, [r5, #0]
> >    0x00001c2a <+14>: str r1, [r5, #0]
> >    0x00001c2c <+16>: bl 0x1c2c <history_expand_line_internal+16>
> >    0x00001c30 <+20>: str r7, [r5, #0]
> >    0x00001c32 <+22>: cmp r0, r6
> >    0x00001c34 <+24>: mov r4, r0
> >    0x00001c36 <+26>: bne.n 0x1c52 <history_expand_line_internal+54>
> >    0x00001c38 <+28>: bl 0x1c38 <history_expand_line_internal+28>
> > >    0x00001c3c <+32>: ldr r1, [pc, #28] ; (0x1c5c <history_expand_line_internal+64>)
> > >    0x00001c3e <+34>: movw r2, #1850 ; 0x73a
> > >    0x00001c42 <+38>: adds r0, #1
> >    0x00001c44 <+40>: bl 0x1c44 <history_expand_line_internal+40>
> >    0x00001c48 <+44>: mov r1, r4
> >    0x00001c4a <+46>: ldmia.w sp!, {r4, r5, r6, r7, r8, lr}
> 
> This must remain a wide instruction...
> 
>   ldmia.w sp!, {r3, r4, r5, r6, r7, lr}
> 
> >    0x00001c4e <+50>: b.w 0x1c4e <history_expand_line_internal+50>
> >    0x00001c52 <+54>: ldmia.w sp!, {r4, r5, r6, r7, r8, pc}
> 
> But this could be:
> 
>   pop {r3, r4, r5, r6, r7, pc}
> 
> >    0x00001c56 <+58>: nop
> >    0x00001c58 <+60>: andeq r0, r0, r0
> >    0x00001c5c <+64>: andeq r0, r0, r0
> > }}}
> > Register r8 is not used in this function, so no need to save/restore
> > r8.
> >  * Possible improvements
> 
> So yeah, I think there is indeed a possible improvement here (and we
> don't even need to break the EABI, I don't think). Unless I've
> overlooked something, anyway...
> 

GCC 4.5 should already do this:

2009-06-02  Richard Earnshaw  <rearnsha at arm.com>

        * arm.c (arm_get_frame_offsets): Prefer using r3 for padding a
        push/pop multiple to 8-byte alignment.
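
As a rough illustration of why the choice of padding register matters (a
sketch only, not the actual arm.c code): the 16-bit Thumb PUSH can encode
only r0-r7 plus lr, and the 16-bit POP only r0-r7 plus pc, so padding with
r8 forces the 32-bit stmdb/ldmia.w forms while padding with r3 does not.
The helper names below are hypothetical, and the sketch simply assumes r3
holds nothing live at the points where it is pushed and popped, which a
real compiler has to check.

```python
# Illustrative sketch only; this is NOT GCC's arm_get_frame_offsets.
# All function names here are hypothetical.

ORDER = [f"r{i}" for i in range(13)] + ["sp", "lr", "pc"]
LOW_REGS = {f"r{i}" for i in range(8)}  # r0-r7


def is_narrow_push(regs):
    """True if the list fits the 16-bit Thumb PUSH (r0-r7, optionally lr)."""
    return set(regs) <= LOW_REGS | {"lr"}


def is_narrow_pop(regs):
    """True if the list fits the 16-bit Thumb POP (r0-r7, optionally pc)."""
    return set(regs) <= LOW_REGS | {"pc"}


def pad_to_8_bytes(saved, pad="r3"):
    """Add one padding register if the save area is not a multiple of 8 bytes.

    Assumption: `pad` carries no live value where it is saved/restored.
    """
    saved = list(saved)
    if (4 * len(saved)) % 8 != 0 and pad not in saved:
        saved.append(pad)
    return sorted(saved, key=ORDER.index)


# The function in the dump really needs r4-r7 and lr: 5 words, 20 bytes.
needed = ["r4", "r5", "r6", "r7", "lr"]
padded = pad_to_8_bytes(needed)               # pads with r3
old = ["r4", "r5", "r6", "r7", "r8", "lr"]    # what the dump shows (r8 padding)
```

With the r3 padding, the prologue list and the returning epilogue list
(lr replaced by pc) both fit the 16-bit encodings; the epilogue before the
tail call restores lr, so it stays wide either way, as Julian noted.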

R.




More information about the linaro-toolchain mailing list