Hi,
I have an interesting observation that I thought might be interesting to the tool-chain team.
I was trying to build u-boot in Thumb2 for OMAP4. Everything was fine until I added some patches recently. One of these patches introduced an API (let's say foo()) that has a weakly linked alias(let's say __foo()) and a strongly linked implementation(the real foo()) in an assembly file.
Although I give -mthumb and -mthumb-interwork for all the files, apparently GCC generates ARM code for assembly files. In the final image foobar() calls foo() using a BL. Since foobar() is in Thumb and foo() in ARM, it ends up crashing. Looks like foobar() assumed foo() to be Thumb because __foo() is Thumb.
Also I see that 'objdump -S' aborts when it tries to parse foo().
I could workaround this problem by having foo() also in a C file that in turn calls into the assembly file.
I tried Linaro GCC 4.5.2 and Codesourcery Lite GCC 4.4.1. Both seem to have the issue.
Isn't this an issue with GCC or am I missing something?
-Aneesh
Aneesh V aneesh@ti.com wrote:
I was trying to build u-boot in Thumb2 for OMAP4. Everything was fine until I added some patches recently. One of these patches introduced an API (let's say foo()) that has a weakly linked alias(let's say __foo()) and a strongly linked implementation(the real foo()) in an assembly file.
Although I give -mthumb and -mthumb-interwork for all the files, apparently GCC generates ARM code for assembly files. In the final image foobar() calls foo() using a BL. Since foobar() is in Thumb and foo() in ARM, it ends up crashing. Looks like foobar() assumed foo() to be Thumb because __foo() is Thumb.
I'm unable to reproduce this. Do you have a complete test case?
I've tried with the following small example:
foo1.c:
extern void foo (void) __attribute__ ((weak, alias ("__foo")));
void __foo (void) { }
int main (void) { foo (); }
foo2.S: .text .align 2 .global foo .type foo, %function foo: push {r7} add r7, sp, #0 mov sp, r7 pop {r7} bx lr .size foo, .-foo
When building just "gcc foo1.c", I get:
0000835c <__foo>: 835c: b480 push {r7} 835e: af00 add r7, sp, #0 8360: 46bd mov sp, r7 8362: bc80 pop {r7} 8364: 4770 bx lr 8366: bf00 nop
00008368 <main>: 8368: b580 push {r7, lr} 836a: af00 add r7, sp, #0 836c: f7ff fff6 bl 835c <__foo> 8370: 4618 mov r0, r3 8372: bd80 pop {r7, pc}
When building both files "gcc foo1.c foo2.S", I get instead:
00008368 <main>: 8368: b580 push {r7, lr} 836a: af00 add r7, sp, #0 836c: f000 e802 blx 8374 <foo> 8370: 4618 mov r0, r3 8372: bd80 pop {r7, pc}
00008374 <foo>: 8374: e92d0080 push {r7} 8378: e28d7000 add r7, sp, #0 837c: e1a0d007 mov sp, r7 8380: e8bd0080 pop {r7} 8384: e12fff1e bx lr
So it seems to me the linker is handling this correctly ...
(This is on Ubuntu Natty using system gcc and binutils.)
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
-- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294
Hi Ulrich,
On Tuesday 15 March 2011 09:00 PM, Ulrich Weigand wrote:
Aneesh Vaneesh@ti.com wrote:
I was trying to build u-boot in Thumb2 for OMAP4. Everything was fine until I added some patches recently. One of these patches introduced an API (let's say foo()) that has a weakly linked alias(let's say __foo()) and a strongly linked implementation(the real foo()) in an assembly file.
Although I give -mthumb and -mthumb-interwork for all the files, apparently GCC generates ARM code for assembly files. In the final image foobar() calls foo() using a BL. Since foobar() is in Thumb and foo() in ARM, it ends up crashing. Looks like foobar() assumed foo() to be Thumb because __foo() is Thumb.
I'm unable to reproduce this. Do you have a complete test case?
Thank you for looking into this problem.
I've tried with the following small example:
foo1.c:
extern void foo (void) __attribute__ ((weak, alias ("__foo")));
void __foo (void) { }
int main (void) { foo (); }
foo2.S: .text .align 2 .global foo .type foo, %function foo: push {r7} add r7, sp, #0 mov sp, r7 pop {r7} bx lr .size foo, .-foo
Can you try this sequence:
arm-linux-gnueabi-gcc -march=armv7-a -mthumb -mthumb-interwork -o foo1.o foo1.c -c arm-linux-gnueabi-gcc -march=armv7-a -mthumb -mthumb-interwork -o foo2.o foo2.S -c arm-linux-gnueabi-ld -r foo1.o foo2.o arm-linux-gnueabi-objdump -S a.out
With this sequence I get the following output:
00000000 <__foo>: 0: b480 push {r7} 2: af00 add r7, sp, #0 4: 46bd mov sp, r7 6: bc80 pop {r7} 8: 4770 bx lr a: bf00 nop
0000000c <main>: c: b580 push {r7, lr} e: af00 add r7, sp, #0 10: f7ff fffe bl 18 <foo> 14: 4618 mov r0, r3 16: bd80 pop {r7, pc}
00000018 <foo>: 18: e92d0080 push {r7} 1c: e28d7000 add r7, sp, #0 20: e1a0d007 mov sp, r7 24: e8bd0080 pop {r7} 28: e12fff1e bx lr
When building just "gcc foo1.c", I get:
0000835c<__foo>: 835c: b480 push {r7} 835e: af00 add r7, sp, #0 8360: 46bd mov sp, r7 8362: bc80 pop {r7} 8364: 4770 bx lr 8366: bf00 nop
00008368<main>: 8368: b580 push {r7, lr} 836a: af00 add r7, sp, #0 836c: f7ff fff6 bl 835c<__foo> 8370: 4618 mov r0, r3 8372: bd80 pop {r7, pc}
When building both files "gcc foo1.c foo2.S", I get instead:
00008368<main>: 8368: b580 push {r7, lr} 836a: af00 add r7, sp, #0 836c: f000 e802 blx 8374<foo> 8370: 4618 mov r0, r3 8372: bd80 pop {r7, pc}
00008374<foo>: 8374: e92d0080 push {r7} 8378: e28d7000 add r7, sp, #0 837c: e1a0d007 mov sp, r7 8380: e8bd0080 pop {r7} 8384: e12fff1e bx lr
So it seems to me the linker is handling this correctly ...
(This is on Ubuntu Natty using system gcc and binutils.)
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
-- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research& Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294
Aneesh V aneesh@ti.com wrote on 03/16/2011 10:32:50 AM:
Can you try this sequence:
arm-linux-gnueabi-gcc -march=armv7-a -mthumb -mthumb-interwork -o foo1.o foo1.c -c arm-linux-gnueabi-gcc -march=armv7-a -mthumb -mthumb-interwork -o foo2.o foo2.S -c arm-linux-gnueabi-ld -r foo1.o foo2.o arm-linux-gnueabi-objdump -S a.out
With this sequence I get the following output:
0000000c <main>: c: b580 push {r7, lr} e: af00 add r7, sp, #0 10: f7ff fffe bl 18 <foo> 14: 4618 mov r0, r3 16: bd80 pop {r7, pc}
00000018 <foo>: 18: e92d0080 push {r7} 1c: e28d7000 add r7, sp, #0 20: e1a0d007 mov sp, r7 24: e8bd0080 pop {r7} 28: e12fff1e bx lr
Well, sure, but the result of "ld -r" is not a final executable, but just another relocatable object file. In particular, it will still carry relocation records. In fact, if you add --reloc to the objdump line, you should see:
0000000c <main>: c: b580 push {r7, lr} e: af00 add r7, sp, #0 10: f7ff fffe bl 18 <foo> 10: R_ARM_THM_CALL foo 14: 4618 mov r0, r3 16: bd80 pop {r7, pc}
The R_ARM_THM_CALL marks the branch instruction as still to be processed by the final link step. And once you actually *perform* the final link, e.g. via "gcc -o final a.out", the branch gets indeed converted to a blx by the linker for me:
0000836c <main>: 836c: b580 push {r7, lr} 836e: af00 add r7, sp, #0 8370: f000 e802 blx 8378 <foo> 8374: 4618 mov r0, r3 8376: bd80 pop {r7, pc}
00008378 <foo>: 8378: e92d0080 push {r7} 837c: e28d7000 add r7, sp, #0 8380: e1a0d007 mov sp, r7 8384: e8bd0080 pop {r7} 8388: e12fff1e bx lr
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
-- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294
Sorry fo the delay in responding. I couldn't find the time to do the experiments in the last couple of days.
On Wednesday 16 March 2011 07:14 PM, Ulrich Weigand wrote:
Aneesh Vaneesh@ti.com wrote on 03/16/2011 10:32:50 AM:
Can you try this sequence:
arm-linux-gnueabi-gcc -march=armv7-a -mthumb -mthumb-interwork -o foo1.o foo1.c -c arm-linux-gnueabi-gcc -march=armv7-a -mthumb -mthumb-interwork -o foo2.o foo2.S -c arm-linux-gnueabi-ld -r foo1.o foo2.o arm-linux-gnueabi-objdump -S a.out
With this sequence I get the following output:
0000000c<main>: c: b580 push {r7, lr} e: af00 add r7, sp, #0 10: f7ff fffe bl 18<foo> 14: 4618 mov r0, r3 16: bd80 pop {r7, pc}
00000018<foo>: 18: e92d0080 push {r7} 1c: e28d7000 add r7, sp, #0 20: e1a0d007 mov sp, r7 24: e8bd0080 pop {r7} 28: e12fff1e bx lr
Well, sure, but the result of "ld -r" is not a final executable, but just another relocatable object file. In particular, it will still carry relocation records. In fact, if you add --reloc to the objdump line, you should see:
Hmm. I was just trying to mimic the U-Boot link command there. Didn't really check what -r meant.
0000000c<main>: c: b580 push {r7, lr} e: af00 add r7, sp, #0 10: f7ff fffe bl 18<foo> 10: R_ARM_THM_CALL foo 14: 4618 mov r0, r3 16: bd80 pop {r7, pc}
The R_ARM_THM_CALL marks the branch instruction as still to be processed by the final link step. And once you actually *perform* the final link, e.g. via "gcc -o final a.out", the branch gets indeed converted to a blx by the linker for me:
This doesn't seem to be happening with U-boot. In U-boot, the situation is like this:
U-Boot builds multiple individual libraries and finally link them together. The call to the API and the weakly linked implementation is in one library(libarmv7.o), while the real implementation in assembly is in another library(libomap4.o). Here are the relevant contents of these libraries and the final image "u-boot" for your reference:
liarmv7.o: 000001b6 <arm_init_before_mmu>: }
void arm_init_before_mmu(void) { 1b6: b508 push {r3, lr} v7_outer_cache_enable(); 1b8: f7ff fffe bl 1f0 <__v7_outer_cache_enable> 1b8: R_ARM_THM_CALL v7_outer_cache_enable
libomap4.o: 0000111c <v7_outer_cache_enable>:
.globl v7_outer_cache_enable v7_outer_cache_enable: MOV r0, #1 111c: e3a00001 mov r0, #1 B set_pl310_ctrl_reg 1120: eafffff9 b 110c <set_pl310_ctrl_reg>
u-boot: void arm_init_before_mmu(void) { 80100636: b508 push {r3, lr} v7_outer_cache_enable(); 80100638: f001 f998 bl 8010196c <_end+0xfffaf2a4>
Here are the intermediate link commands used for the libraries: arm-linux-gnueabi-ld -r -o libarmv7.o cpu.o cache_v7.o syslib.o arm-linux-gnueabi-ld -r -o libomap4.o board.o mem.o sys_info.o ...
Here is the final link command(shortened and line-wrapped in the interest of readability - attaching the full build log): UNDEF_SYM=`arm-linux-gnueabi-objdump -x arch/arm/cpu/armv7/libarmv7.o arch/arm/cpu/armv7/omap4/libomap4.o arch/arm/lib/libarm.o | sed -n -e 's/.*(__u_boot_cmd_.*)/-u\1/p'|sort|uniq`; cd /home/a0393566local /u-boot-denx-bak && arm-linux-gnueabi-ld -pie -Bstatic -T u-boot.lds -Ttext 0x80100000 $UNDEF_SYM arch/arm/cpu/armv7/start.o --start-group arch/arm/cpu/armv7/libarmv7.o arch/arm/cpu/armv7/omap4/libomap4.o arch/arm/lib/libarm.o --end-group /home/a0393566local/u-boot-denx-bak /arch/arm/lib/eabi_compat.o -L /usr/lib/gcc/arm-linux-gnueabi/4.5.2 -lgcc -Map u-boot.map -o u-boot
I don't know what the UNDEF_SYM is about. Do you think that could be creating a problem?
This is now looking more like U-Boot build infrastructure issue. Maybe, I should take this up in the U-Boot mailing list now?
best regards, Aneesh
Aneesh V aneesh@ti.com wrote:
On Wednesday 16 March 2011 07:14 PM, Ulrich Weigand wrote:
The R_ARM_THM_CALL marks the branch instruction as still to be
processed
by the final link step. And once you actually *perform* the final
link,
e.g. via "gcc -o final a.out", the branch gets indeed converted to a
blx
by the linker for me:
This doesn't seem to be happening with U-boot. In U-boot, the situation is like this:
U-Boot builds multiple individual libraries and finally link them together. The call to the API and the weakly linked implementation is in one library(libarmv7.o), while the real implementation in assembly is in another library(libomap4.o). Here are the relevant contents of these libraries and the final image "u-boot" for your reference:
liarmv7.o: 000001b6 <arm_init_before_mmu>: }
void arm_init_before_mmu(void) { 1b6: b508 push {r3, lr} v7_outer_cache_enable(); 1b8: f7ff fffe bl 1f0 <__v7_outer_cache_enable> 1b8: R_ARM_THM_CALL v7_outer_cache_enable
libomap4.o: 0000111c <v7_outer_cache_enable>:
.globl v7_outer_cache_enable v7_outer_cache_enable: MOV r0, #1 111c: e3a00001 mov r0, #1 B set_pl310_ctrl_reg 1120: eafffff9 b 110c <set_pl310_ctrl_reg>
u-boot: void arm_init_before_mmu(void) { 80100636: b508 push {r3, lr} v7_outer_cache_enable(); 80100638: f001 f998 bl 8010196c <_end+0xfffaf2a4>
Here are the intermediate link commands used for the libraries: arm-linux-gnueabi-ld -r -o libarmv7.o cpu.o cache_v7.o syslib.o arm-linux-gnueabi-ld -r -o libomap4.o board.o mem.o sys_info.o ...
Well, those aren't really "libraries" strictly speaking, since they are build with "ld -r", so they're actually just regular object files. But that doesn't really matter one way or the other ...
Here is the final link command(shortened and line-wrapped in the interest of readability - attaching the full build log): UNDEF_SYM=`arm-linux-gnueabi-objdump -x arch/arm/cpu/armv7/libarmv7.o arch/arm/cpu/armv7/omap4/libomap4.o arch/arm/lib/libarm.o | sed -n -e 's/.*(__u_boot_cmd_.*)/-u\1/p'|sort|uniq`; cd /home/a0393566local /u-boot-denx-bak && arm-linux-gnueabi-ld -pie -Bstatic -T u-boot.lds -Ttext 0x80100000 $UNDEF_SYM arch/arm/cpu/armv7/start.o --start-group arch/arm/cpu/armv7/libarmv7.o arch/arm/cpu/armv7/omap4/libomap4.o arch/arm/lib/libarm.o --end-group /home/a0393566local/u-boot-denx-bak /arch/arm/lib/eabi_compat.o -L /usr/lib/gcc/arm-linux-gnueabi/4.5.2 -lgcc -Map u-boot.map -o u-boot
I don't know what the UNDEF_SYM is about. Do you think that could be creating a problem?
I can't see how ... this just adds a bunch of undefined symbols via -u options, which presumably causes some objects to be pulled into the final link that otherwise wouldn't be.
I've tried to recreate the problem using the specific linker options (-pie and -Bstatic), but that didn't make any difference for me; it still resolved the branch to a blx.
This is now looking more like U-Boot build infrastructure issue. Maybe, I should take this up in the U-Boot mailing list now?
I can see two options here: either there's something weird going on in one of the parts I cannot reproduce, or else there's some bug in the version of the linker you're using. So I'm wondering:
- Can you send me the full set of files (.o files, u-boot.lds linker script, exact full command line etc.) for me to be able to run the actual link command myself?
- What version of the linker are you using? Can you retry the link on a native ARM Ubuntu Natty system using the system linker?
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
-- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294
+ linaro-toolchain
Hello Ulrich,
I want to revisit this old thread. Sorry for the sloppy follow-up. But, this time around I have more data.
On Tue, Mar 15, 2011 at 9:00 PM, Ulrich Weigand Ulrich.Weigand@de.ibm.com wrote:
Aneesh V aneesh@ti.com wrote:
I was trying to build u-boot in Thumb2 for OMAP4. Everything was fine until I added some patches recently. One of these patches introduced an API (let's say foo()) that has a weakly linked alias(let's say __foo()) and a strongly linked implementation(the real foo()) in an assembly file.
Although I give -mthumb and -mthumb-interwork for all the files, apparently GCC generates ARM code for assembly files. In the final image foobar() calls foo() using a BL. Since foobar() is in Thumb and foo() in ARM, it ends up crashing. Looks like foobar() assumed foo() to be Thumb because __foo() is Thumb.
I'm unable to reproduce this. Do you have a complete test case?
I've tried with the following small example:
foo1.c:
extern void foo (void) __attribute__ ((weak, alias ("__foo")));
void __foo (void) { }
int main (void) { foo (); }
foo2.S: .text .align 2 .global foo .type foo, %function foo: push {r7} add r7, sp, #0 mov sp, r7 pop {r7} bx lr .size foo, .-foo
When building just "gcc foo1.c", I get:
0000835c <__foo>: 835c: b480 push {r7} 835e: af00 add r7, sp, #0 8360: 46bd mov sp, r7 8362: bc80 pop {r7} 8364: 4770 bx lr 8366: bf00 nop
00008368 <main>: 8368: b580 push {r7, lr} 836a: af00 add r7, sp, #0 836c: f7ff fff6 bl 835c <__foo> 8370: 4618 mov r0, r3 8372: bd80 pop {r7, pc}
When building both files "gcc foo1.c foo2.S", I get instead:
00008368 <main>: 8368: b580 push {r7, lr} 836a: af00 add r7, sp, #0 836c: f000 e802 blx 8374 <foo> 8370: 4618 mov r0, r3 8372: bd80 pop {r7, pc}
00008374 <foo>: 8374: e92d0080 push {r7} 8378: e28d7000 add r7, sp, #0 837c: e1a0d007 mov sp, r7 8380: e8bd0080 pop {r7} 8384: e12fff1e bx lr
So it seems to me the linker is handling this correctly ...
(This is on Ubuntu Natty using system gcc and binutils.)
I could reproduce the problem on older tool-chain(Sourcery G++ Lite 2010q1-202) [1] with a modified version of your sample code:
a.c: ==== extern void foo (void) __attribute__ ((weak, alias ("__foo")));
void __foo (void) { }
extern void call_foo(void);
int main (void) { call_foo (); }
b.S: ==== .text .align 2 .global foo foo: push {r7} add r7, sp, #0 mov sp, r7 pop {r7} bx lr .size foo, .-foo
c.S .text .align 2
.global call_foo call_foo: bl foo bx lr
.global __aeabi_unwind_cpp_pr0 __aeabi_unwind_cpp_pr0: bx lr
Now, I build them using the following commands, which is similar to what U-Boot does:
arm-none-linux-gnueabi-gcc -mthumb -mthumb-interwork -c a.c arm-none-linux-gnueabi-gcc -mthumb -mthumb-interwork -c b.S arm-none-linux-gnueabi-gcc -mthumb -mthumb-interwork -c c.S arm-none-linux-gnueabi-ld -r a.o -o alib.o arm-none-linux-gnueabi-ld -r b.o -o blib.o arm-none-linux-gnueabi-ld -r c.o -o clib.o arm-none-linux-gnueabi-ld --start-group clib.o alib.o blib.o --end-group -o a.out armobjdump -S --reloc a.out
You will get something like: 00008094 <call_foo>: 8094: fa000006 blx 80b4 <foo> 8098: e12fff1e bx lr
Please note that that the 'blx' is not correct. Now, do the following change:
diff --git a/b.S b/b.S index e0f2de9..96dba1f 100644 --- a/b.S +++ b/b.S @@ -1,5 +1,6 @@ .text .align 2 +.type foo, %function .global foo foo: push {r7}
And build it again the same way and you will see: 00008094 <call_foo>: 8094: eb000006 bl 80b4 <foo> 8098: e12fff1e bx lr
I can't reproduce this on Linaro GCC 2012.01, so looks like the problem is solved in recent tool-chains. However, sadly I could reproduce a different but similar problem with Linaro GCC 2012.01. This time the call is from C(Thumb) to assembly(ARM) and no weakly linked symbols are involved.
a.c: ==== int main (void) { foo (); }
b.S: ==== .text .align 2 .global foo foo: push {r7} add r7, sp, #0 mov sp, r7 pop {r7} bx lr .size foo, .-foo
.global __aeabi_unwind_cpp_pr0 __aeabi_unwind_cpp_pr0: bx lr
arm-linux-gnueabi-gcc -mthumb -mthumb-interwork -c a.c arm-linux-gnueabi-gcc -mthumb -mthumb-interwork -c b.S arm-linux-gnueabi-ld -r a.o -o alib.o arm-linux-gnueabi-ld -r b.o -o blib.o arm-linux-gnueabi-ld --start-group alib.o blib.o --end-group -o a.out arm-linux-gnueabi-objdump -S --reloc a.out
gives: 8076: af00 add r7, sp, #0 8078: f000 f802 bl 8080 <foo> 807c: 4618 mov r0, r3
It should have been "blx 8080 <foo>", isn't it? Again, %function solves it.
I agree that not marking the assembly functions ' %function' is a problem in the code, so it's not a critical bug. But I would've been happier if the linker refused to link it rather than branching with the wrong instruction. Isn't that a problem?
Problem No:2 ************* Linaro GCC 2012.01 is giving a new problem w.r.to Thumb build that is not existing in Sourcery G++ Lite 2010q1-202. However, I couldn't reproduce this problem with a small program like above. So, let me give you reference to the original u-boot code that shows the problem and steps to reproduce it.
tree: git://github.com/aneeshv/u-boot.git branch: thumb
The above branch has mainline u-boot with 4 additional patches from me for enabling Thumb build. You can build it like this:
make CROSS_COMPILE=arm-linux-gnueabi- ARCH=arm distclean make CROSS_COMPILE=arm-linux-gnueabi- ARCH=arm omap4_sdp4430
This builds two images u-boot and u-boot-spl. SPL is a tiny u-boot that runs from internal RAM and loads the u-boot to SDRAM. Now, please have a look at the map file of u-boot-spl which is at: spl/u-boot-spl.map
I see the following in my map file:
/spl/u-boot-spl.map: ================= .rodata.wkup_padconf_array_essential_4460 0x40309583 0x4 board/ti/sdp4430/libsdp4430.o 0x40309583 wkup_padconf_array_essential_4460 .rodata.wkup_padconf_array_essential 0x40309587 0xc board/ti/sdp4430/libsdp4430.o 0x40309587 wkup_padconf_array_essential .rodata.core_padconf_array_essential 0x40309593 0x60 board/ti/sdp4430/libsdp4430.o 0x40309593 core_padconf_array_essential
Please note that the .rodata symbols have odd addresses. These arrays actually need to be aligned at least to half-word boundary. In fact, in the image I verified that they are put at even addresses. So, the symbols have been kept as real address +1. I understand that this is the convention for Thumb functions. I guess the tool-chain did it for data too?? And I am getting unaligned access aborts on accessing them.
I notice that this doesn't happen with all .rodata. symbols in the image. I couldn't see any difference between working and non-working files nor any difference in the command used to build them!
Well, this doesn't happen if I don't use "-fdata-sections" in gcc options. So, apply the following patch and you will see that those symbols have even addresses now.
diff --git a/config.mk b/config.mk index ddaa477..723286a 100644 --- a/config.mk +++ b/config.mk @@ -190,7 +190,7 @@ CPPFLAGS := $(DBGFLAGS) $(OPTFLAGS) $(RELFLAGS) \
# Enable garbage collection of un-used sections for SPL ifeq ($(CONFIG_SPL_BUILD),y) -CPPFLAGS += -ffunction-sections -fdata-sections +CPPFLAGS += -ffunction-sections LDFLAGS_FINAL += --gc-sections endif
/spl/u-boot-spl.map: ================= .rodata 0x40309204 0x38c board/ti/sdp4430/libsdp4430.o 0x40309204 core_padconf_array_essential 0x40309264 wkup_padconf_array_essential 0x40309270 wkup_padconf_array_essential_4460 0x40309274 core_padconf_array_non_essential 0x40309540 wkup_padconf_array_non_essential 0x40309588 wkup_padconf_array_non_essential_4430
Will you be able to look into these?
Thanks, Aneesh
[1] Sourcery G++ Lite 2010q1-202 arm-none-linux-gnueabi-gcc (Sourcery G++ Lite 2010q1-202) 4.4.1 GNU ld (Sourcery G++ Lite 2010q1-202) - binutils 2.19.51.20090709
[2] Linaro 4.6-2012.01 arm-linux-gnueabi-gcc (crosstool-NG linaro-1.13.1-2012.01-20120125 - Linaro GCC 2012.01) 4.6.3 20120105 (prerelease) GNU ld (crosstool-NG linaro-1.13.1-2012.01-20120125 - Linaro GCC 2012.01) 2.22
"V, Aneesh" aneesh@ti.com wrote:
I agree that not marking the assembly functions ' %function' is a problem in the code, so it's not a critical bug. But I would've been happier if the linker refused to link it rather than branching with the wrong instruction. Isn't that a problem?
Well, if the target symbol of a branch is not marked as %function, the linker has no way of knowing whether that target is ARM or Thumb, so it cannot specifically error out if (and only if) the instruction is wrong.
The linker *could* in theory give an error *unconditionally* whenever it detects a branch to a non-%function symbol. The reason this is not done is probably for backwards compatibility with old hand-written code, say from an ARM-only era: branches to non-function symbols used to be allowed, and in fact work fine if all code is ARM-only. Adding an error would break such old code.
Problem No:2
Linaro GCC 2012.01 is giving a new problem w.r.to Thumb build that is not existing in Sourcery G++ Lite 2010q1-202. However, I couldn't reproduce this problem with a small program like above. So, let me give you reference to the original u-boot code that shows the problem and steps to reproduce it.
[snip]
Please note that the .rodata symbols have odd addresses. These arrays actually need to be aligned at least to half-word boundary. In fact, in the image I verified that they are put at even addresses. So, the symbols have been kept as real address +1.
This seems strange. How did you verify "that they are put at even addresses"? I can reproduce the odd addresses of .rodata symbols. However, this occurs simply because the linker put *no* alignment requirement whatsoever on those sections:
Section Headers: [Nr] Name Type Addr Off Size ES Flg Lk Inf Al [snip] [11] .rodata.wkup_padc PROGBITS 00000000 000100 000004 00 A 0 0 1 [12] .rodata.wkup_padc PROGBITS 00000000 000104 000048 00 A 0 0 1 [13] .rodata.wkup_padc PROGBITS 00000000 00014c 00000c 00 A 0 0 1 [14] .rodata.wkup_padc PROGBITS 00000000 000158 000004 00 A 0 0 1
Note the "Al" column values of 1. In the final executable, those sections happen to end up immediately following a .rodata.str string section with odd lenght, and since they don't have any alignment requirement, they start out at an odd address.
The reason for the lack of alignment requirement is actually in the source:
const struct pad_conf_entry core_padconf_array_essential[] = {
where
struct pad_conf_entry {
u16 offset;
u16 val;
} __attribute__ ((packed));
The "packed" attribute specifies that all struct elements ought to be considered to have alignment requirement 1 instead of their default alignment. Thus the whole struct ends up having alignment requirement 1; and since the section contains only a single variable of such struct type, this is then also the alignment requirement of the section.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
-- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294
On Mon, Feb 20, 2012 at 06:59:53PM +0100, Ulrich Weigand wrote:
"V, Aneesh" aneesh@ti.com wrote:
I agree that not marking the assembly functions ' %function' is a problem in the code, so it's not a critical bug. But I would've been happier if the linker refused to link it rather than branching with the wrong instruction. Isn't that a problem?
Well, if the target symbol of a branch is not marked as %function, the linker has no way of knowing whether that target is ARM or Thumb, so it cannot specifically error out if (and only if) the instruction is wrong.
The linker *could* in theory give an error *unconditionally* whenever it detects a branch to a non-%function symbol. The reason this is not done is probably for backwards compatibility with old hand-written code, say from an ARM-only era: branches to non-function symbols used to be allowed, and in fact work fine if all code is ARM-only. Adding an error would break such old code.
Problem No:2
Linaro GCC 2012.01 is giving a new problem w.r.to Thumb build that is not existing in Sourcery G++ Lite 2010q1-202. However, I couldn't reproduce this problem with a small program like above. So, let me give you reference to the original u-boot code that shows the problem and steps to reproduce it.
[snip]
Please note that the .rodata symbols have odd addresses. These arrays actually need to be aligned at least to half-word boundary. In fact, in the image I verified that they are put at even addresses. So, the symbols have been kept as real address +1.
This seems strange. How did you verify "that they are put at even addresses"? I can reproduce the odd addresses of .rodata symbols. However, this occurs simply because the linker put *no* alignment requirement whatsoever on those sections:
Section Headers: [Nr] Name Type Addr Off Size ES Flg Lk Inf Al [snip] [11] .rodata.wkup_padc PROGBITS 00000000 000100 000004 00 A 0 0 1 [12] .rodata.wkup_padc PROGBITS 00000000 000104 000048 00 A 0 0 1 [13] .rodata.wkup_padc PROGBITS 00000000 00014c 00000c 00 A 0 0 1 [14] .rodata.wkup_padc PROGBITS 00000000 000158 000004 00 A 0 0 1
Note the "Al" column values of 1. In the final executable, those sections happen to end up immediately following a .rodata.str string section with odd lenght, and since they don't have any alignment requirement, they start out at an odd address.
The reason for the lack of alignment requirement is actually in the source:
const struct pad_conf_entry core_padconf_array_essential[] = {
where
struct pad_conf_entry {
u16 offset; u16 val;
} __attribute__ ((packed));
The "packed" attribute specifies that all struct elements ought to be considered to have alignment requirement 1 instead of their default alignment. Thus the whole struct ends up having alignment requirement 1; and since the section contains only a single variable of such struct type, this is then also the alignment requirement of the section.
Is "packed" even wanted here?
Based on a very brief skim of the code, it looks like the packed attribute is an unnecessary attempt to save some space -- unnecessary because all ARM ABI variants I know of (actually, all arches I know of) will pack that structure into a word anyway as a result of natural alignment of the members. It doesn't look to me like the packed structure is used to map a memory-mapped peripheral directly or otherwise communicate with the outside world -- which would be the only situations in which a packed structure would normally make sense.
Of course, I may have missed something...
Cheers ---Dave
On Tue 21 Feb 2012 09:57:12 GMT, Dave Martin wrote:
struct pad_conf_entry {
u16 offset; u16 val;
} __attribute__ ((packed));
The "packed" attribute specifies that all struct elements ought to be considered to have alignment requirement 1 instead of their default alignment. Thus the whole struct ends up having alignment requirement 1; and since the section contains only a single variable of such struct type, this is then also the alignment requirement of the section.
Is "packed" even wanted here?
Based on a very brief skim of the code, it looks like the packed attribute is an unnecessary attempt to save some space -- unnecessary because all ARM ABI variants I know of (actually, all arches I know of) will pack that structure into a word anyway as a result of natural alignment of the members. It doesn't look to me like the packed structure is used to map a memory-mapped peripheral directly or otherwise communicate with the outside world -- which would be the only situations in which a packed structure would normally make sense.
Of course, I may have missed something...
I'm not sure, but I believe that the compiler requires (prefers) any structs that you want included inside packed structs to be themselves packed, so you can end up with some structs with apparently unnecessary attributes on them.
It might also have an effect when you place the struct inside an unpacked struct?
Without context I've no idea whether that's what's going on here. Of course, a no-op "packed" attribute ought to be harmless...
Andrew
Andrew Stubbs andrew.stubbs@linaro.org wrote on 21.02.2012 11:56:07:
I'm not sure, but I believe that the compiler requires (prefers) any structs that you want included inside packed structs to be themselves packed, so you can end up with some structs with apparently unnecessary attributes on them.
I don't see why the compiler would care. Its just that if an inner struct already has padding somewhere, this padding doesn't go away (the struct layout is not recomputed) just because it is embedded in a larger struct which is marked packed. [Marking the outer struct packed will still affect the overall alignment requirement of the inner struct; it just won't affect it inner layout.]
It might also have an effect when you place the struct inside an unpacked struct?
Yes, of course: if the struct is packed, then the struct as a whole has an alignment requirement of 1, which may affect the placement of the struct as an element within an outside (unpacked) struct.
Without context I've no idea whether that's what's going on here. Of course, a no-op "packed" attribute ought to be harmless...
It is not really "no-op": it reduces the alignment requirement of the struct from 2 (in this case) to 1, which is actually exactly what's causing the problems in the original test case.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
-- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294
On Tue, Feb 21, 2012 at 12:05:54PM +0100, Ulrich Weigand wrote:
Andrew Stubbs andrew.stubbs@linaro.org wrote on 21.02.2012 11:56:07:
I'm not sure, but I believe that the compiler requires (prefers) any structs that you want included inside packed structs to be themselves packed, so you can end up with some structs with apparently unnecessary attributes on them.
I don't see why the compiler would care. Its just that if an inner struct already has padding somewhere, this padding doesn't go away (the struct layout is not recomputed) just because it is embedded in a larger struct which is marked packed. [Marking the outer struct packed will still affect the overall alignment requirement of the inner struct; it just won't affect it inner layout.]
It might also have an effect when you place the struct inside an unpacked struct?
Yes, of course: if the struct is packed, then the struct as a whole has an alignment requirement of 1, which may affect the placement of the struct as an element within an outside (unpacked) struct.
Without context I've no idea whether that's what's going on here. Of course, a no-op "packed" attribute ought to be harmless...
It is not really "no-op": it reduces the alignment requirement of the struct from 2 (in this case) to 1, which is actually exactly what's causing the problems in the original test case.
With packed, the compiler can validly emit misaligned memory accesses, which is bad when building bare-metal code where the underlying memory may be strongly-ordered, since the CPU will throw exceptions in that case.
I don't know whether that applies to this case, though.
I can't see struct pad_conf_entry nested inside anything else, so it just looks unnecessary for this to be packed (I still didn't check the context of this thoroughly, however...)
What happens if that attribute is simply removed from all the affected headers?
Cheers ---Dave
On Tuesday 21 February 2012 03:27 PM, Dave Martin wrote:
On Mon, Feb 20, 2012 at 06:59:53PM +0100, Ulrich Weigand wrote:
"V, Aneesh"aneesh@ti.com wrote:
I agree that not marking the assembly functions ' %function' is a problem in the code, so it's not a critical bug. But I would've been happier if the linker refused to link it rather than branching with the wrong instruction. Isn't that a problem?
Well, if the target symbol of a branch is not marked as %function, the linker has no way of knowing whether that target is ARM or Thumb, so it cannot specifically error out if (and only if) the instruction is wrong.
The linker *could* in theory give an error *unconditionally* whenever it detects a branch to a non-%function symbol. The reason this is not done is probably for backwards compatibility with old hand-written code, say from an ARM-only era: branches to non-function symbols used to be allowed, and in fact work fine if all code is ARM-only. Adding an error would break such old code.
Problem No:2
Linaro GCC 2012.01 is giving a new problem w.r.to Thumb build that is not existing in Sourcery G++ Lite 2010q1-202. However, I couldn't reproduce this problem with a small program like above. So, let me give you reference to the original u-boot code that shows the problem and steps to reproduce it.
[snip]
Please note that the .rodata symbols have odd addresses. These arrays actually need to be aligned at least to half-word boundary. In fact, in the image I verified that they are put at even addresses. So, the symbols have been kept as real address +1.
This seems strange. How did you verify "that they are put at even addresses"? I can reproduce the odd addresses of .rodata symbols. However, this occurs simply because the linker put *no* alignment requirement whatsoever on those sections:
Section Headers: [Nr] Name Type Addr Off Size ES Flg Lk Inf Al [snip] [11] .rodata.wkup_padc PROGBITS 00000000 000100 000004 00 A 0 0 1 [12] .rodata.wkup_padc PROGBITS 00000000 000104 000048 00 A 0 0 1 [13] .rodata.wkup_padc PROGBITS 00000000 00014c 00000c 00 A 0 0 1 [14] .rodata.wkup_padc PROGBITS 00000000 000158 000004 00 A 0 0 1
Note the "Al" column values of 1. In the final executable, those sections happen to end up immediately following a .rodata.str string section with odd lenght, and since they don't have any alignment requirement, they start out at an odd address.
The reason for the lack of alignment requirement is actually in the source:
const struct pad_conf_entry core_padconf_array_essential[] = {
where
struct pad_conf_entry {
u16 offset; u16 val;
} __attribute__ ((packed));
The "packed" attribute specifies that all struct elements ought to be considered to have alignment requirement 1 instead of their default alignment. Thus the whole struct ends up having alignment requirement 1; and since the section contains only a single variable of such struct type, this is then also the alignment requirement of the section.
Is "packed" even wanted here?
Based on a very brief skim of the code, it looks like the packed attribute is an unnecessary attempt to save some space -- unnecessary because all ARM ABI variants I know of (actually, all arches I know of) will pack that structure into a word anyway as a result of natural alignment of the members. It doesn't look to me like the packed structure is used to map a memory-mapped peripheral directly or otherwise communicate with the outside world -- which would be the only situations in which a packed structure would normally make sense.
Of course, I may have missed something...
No. I think packed is not needed here. Removing it didn't change the size of the arrays. I will remove it.
Thanks, Aneesh
Oops! Sorry. These mails skipped my Inbox and went into a sub-folder in my mail client that I hadn't used for a long time. I didn't realize that I had mails!
On Monday 20 February 2012 11:29 PM, Ulrich Weigand wrote:
"V, Aneesh"aneesh@ti.com wrote:
I agree that not marking the assembly functions ' %function' is a problem in the code, so it's not a critical bug. But I would've been happier if the linker refused to link it rather than branching with the wrong instruction. Isn't that a problem?
Well, if the target symbol of a branch is not marked as %function, the linker has no way of knowing whether that target is ARM or Thumb, so it cannot specifically error out if (and only if) the instruction is wrong.
The linker *could* in theory give an error *unconditionally* whenever it detects a branch to a non-%function symbol. The reason this is not done is probably for backwards compatibility with old hand-written code, say from an ARM-only era: branches to non-function symbols used to be allowed, and in fact work fine if all code is ARM-only. Adding an error would break such old code.
Ok. Agree. I never used to use %function when I wrote assembly functions earlier. I am sure a lot of code will break if this was enforced.
Problem No:2
Linaro GCC 2012.01 is giving a new problem w.r.to Thumb build that is not existing in Sourcery G++ Lite 2010q1-202. However, I couldn't reproduce this problem with a small program like above. So, let me give you reference to the original u-boot code that shows the problem and steps to reproduce it.
[snip]
Please note that the .rodata symbols have odd addresses. These arrays actually need to be aligned at least to half-word boundary. In fact, in the image I verified that they are put at even addresses. So, the symbols have been kept as real address +1.
This seems strange. How did you verify "that they are put at even addresses"? I can reproduce the odd addresses of .rodata symbols. However, this occurs simply because the linker put *no* alignment requirement whatsoever on those sections:
Sorry, that was an error in interpreting the data. The data is put at the odd address given in the map file.
Section Headers: [Nr] Name Type Addr Off Size ES Flg Lk Inf Al [snip] [11] .rodata.wkup_padc PROGBITS 00000000 000100 000004 00 A 0 0 1 [12] .rodata.wkup_padc PROGBITS 00000000 000104 000048 00 A 0 0 1 [13] .rodata.wkup_padc PROGBITS 00000000 00014c 00000c 00 A 0 0 1 [14] .rodata.wkup_padc PROGBITS 00000000 000158 000004 00 A 0 0 1
Note the "Al" column values of 1. In the final executable, those sections happen to end up immediately following a .rodata.str string section with odd lenght, and since they don't have any alignment requirement, they start out at an odd address.
The reason for the lack of alignment requirement is actually in the source:
const struct pad_conf_entry core_padconf_array_essential[] = {
where
struct pad_conf_entry {
u16 offset; u16 val;
} __attribute__ ((packed));
The "packed" attribute specifies that all struct elements ought to be considered to have alignment requirement 1 instead of their default alignment. Thus the whole struct ends up having alignment requirement 1; and since the section contains only a single variable of such struct type, this is then also the alignment requirement of the section.
Hmm.. Thanks. Removing packed seems to help. The following also helped.
@@ -34,7 +34,7 @@ struct pad_conf_entry {
u16 val;
-} __attribute__ ((packed)); +} __attribute__ ((packed)) __attribute__ ((aligned(2)));
BTW, just for my understanding: The effect of adding __attribute__ ((packed)) to a structure is equivalent to adding it for each field in the structure, right? And because the first field doesn't have any alignment requirement, the struct also doesn't have any alignment requirement right?
Wonder why the issue never occurred until we started building in Thumb.
Thanks a lot for looking into this.
best regards, Aneesh
Aneesh V aneesh@ti.com wrote on 23.02.2012 11:27:40:
The "packed" attribute specifies that all struct elements ought to be considered to have alignment requirement 1 instead of their default alignment. Thus the whole struct ends up having alignment requirement
1;
and since the section contains only a single variable of such struct type, this is then also the alignment requirement of the section.
Hmm.. Thanks. Removing packed seems to help. The following also helped.
@@ -34,7 +34,7 @@ struct pad_conf_entry {
u16 val;
-} __attribute__ ((packed)); +} __attribute__ ((packed)) __attribute__ ((aligned(2)));
BTW, just for my understanding: The effect of adding __attribute__ ((packed)) to a structure is equivalent to adding it for each field in the structure, right?
Right.
And because the first field doesn't have any alignment requirement, the struct also doesn't have any alignment requirement right?
Sort of. The alignment requirement of the struct is the maximum (actually, the least common multiple, but since we're talking only powers of two, that amounts to the same) of the alignment requirements of all members. However, attribute packed for a struct applies to each element separately, and thus each element has alignment requirement 1, which is then also the maximum.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
-- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294
On Thursday 23 February 2012 05:17 PM, Ulrich Weigand wrote:
Aneesh Vaneesh@ti.com wrote on 23.02.2012 11:27:40:
The "packed" attribute specifies that all struct elements ought to be considered to have alignment requirement 1 instead of their default alignment. Thus the whole struct ends up having alignment requirement
1;
and since the section contains only a single variable of such struct type, this is then also the alignment requirement of the section.
Hmm.. Thanks. Removing packed seems to help. The following also helped.
@@ -34,7 +34,7 @@ struct pad_conf_entry {
u16 val;
-} __attribute__ ((packed)); +} __attribute__ ((packed)) __attribute__ ((aligned(2)));
BTW, just for my understanding: The effect of adding __attribute__ ((packed)) to a structure is equivalent to adding it for each field in the structure, right?
Right.
And because the first field doesn't have any alignment requirement, the struct also doesn't have any alignment requirement right?
Sort of. The alignment requirement of the struct is the maximum (actually, the least common multiple, but since we're talking only powers of two, that amounts to the same) of the alignment requirements of all members. However, attribute packed for a struct applies to each element separately, and thus each element has alignment requirement 1, which is then also the maximum.
Understand. Thanks for the explanation.
br, Aneesh
On 23/02/12 10:27, Aneesh V wrote:
Ok. Agree. I never used to use %function when I wrote assembly functions earlier. I am sure a lot of code will break if this was enforced.
If you've not used %function on ARM, then your code is semantically broken even if it isn't syntactically broken.
The ABI rules for dealing with interworking and and out-of-range branches all rely on %function being used correctly.
R.
On Thursday 23 February 2012 07:26 PM, Richard Earnshaw wrote:
On 23/02/12 10:27, Aneesh V wrote:
Ok. Agree. I never used to use %function when I wrote assembly functions earlier. I am sure a lot of code will break if this was enforced.
If you've not used %function on ARM, then your code is semantically broken even if it isn't syntactically broken.
The ABI rules for dealing with interworking and and out-of-range branches all rely on %function being used correctly.
Hmm.. I learned it the hard way:)
BTW, very few assembly functions in U-Boot had %function or equivalent. I am trying to clean it up at least for armv7.
Thanks, Aneesh