Hi,
I used the Linaro cross-toolchain version 4.5
(gcc-4.5-arm-linux-gnueabi) to compile linux-linaro-11.05 for the
BeagleBoard, but got the following error messages:
----
AS arch/arm/boot/compressed/head.o
arch/arm/boot/compressed/head.S: Assembler messages:
arch/arm/boot/compressed/head.S:127: Error: selected processor does not
support requested special purpose register -- `mrs r2,cpsr'
arch/arm/boot/compressed/head.S:134: Error: selected processor does not
support requested special purpose register -- `mrs r2,cpsr'
arch/arm/boot/compressed/head.S:136: Error: selected processor does not
support requested special purpose register -- `msr cpsr_c,r2'
make[2]: *** [arch/arm/boot/compressed/head.o] Error 1
make[1]: *** [arch/arm/boot/compressed/vmlinux] Error 2
make: *** [uImage] Error 2
----
The .config file I used for kernel build is
"config-2.6.38-1003-linaro-omap" which is from
hwpack_linaro-omap3-x11-base_20110526-5_armel_supported.tar.gz
My host development platform is 64-bit Ubuntu 10.04.2 LTS (Linux
ubuntu 2.6.32-32-generic #62-Ubuntu SMP Wed Apr 20 21:52:38 UTC 2011
x86_64 GNU/Linux).
Is this a known bug, or did I miss something?
Thanks.
== GCC ==
=== Progress ===
* Panda board up again, apparently with no change to the software on it.
Not sure what caused the difference today! Now chugging along with
SPEC2k.
* Back on to BRANCH_COST. SPEC2k has been running fully since the Panda
board was restored.
* Submitted cleaned up neon shift immediates patch for merging.
* On merge request review in Linaro this week.
* Backported arith_shiftsi patch to 4.6 branch upstream. (done)
* Identified particular patterns that have issues with scheduler
descriptions in A8 and A9. Working on fixes.
=== Plans ===
* Submit one_cmpldi2 patch for neon upstream.
* Finish the scheduler patches.
* Investigate A8 vs A9 regressions.
* Look at EPILOGUE_USES and coremark. Not sure yet why it regresses in
performance.
Meetings:
* 1-1s
* TCWG calls
Absences.
* 1st Aug - 5th August - Linaro sprint.
* 8th - 9th August - Internal training.
* 29th Aug - Sept. 2 - Vacation.
Hi,
* fixed a bug where libunwind could segfault when unwinding through a
shared library using the ARM specific unwind tables
* discussed the libunwind internals with Uli (thanks!) and concluded
that the best way to implement remote unwinding for ARM is to integrate
the support for the ARM.exidx* directly into the DWARF code
* otherwise the user visible remote API needs to be extended for ARM
only which seems to be a bad idea
* requires re-implementing the existing ARM code (both local and remote)
* will also benefit from libunwind's (dwarf) caching mechanism
* started to re-implement the ARM code
Regards
Ken
- Continue Spec2006 analysis:
Looking into SMS opportunities in SPEC2006/462.libquantum.
- Looking into recent bootstrap failure with SMS flags on ARM -- it
seems to be related to do-loop optimization.
Hello Michael,
We do have more and more instances of the following issues turning up in
the kernel requiring toolchain assistance to solve the problem properly.
Could you or someone from your team follow this up please?
---------- Forwarded message ----------
Date: Tue, 1 Feb 2011 12:16:48 +0000
From: Dave Martin <dave.martin(a)linaro.org>
To: binutils(a)sourceware.org
Cc: linaro-toolchain <linaro-toolchain(a)lists.linaro.org>
Subject: Generating ancilliary sections with gas
Hi all,
Every now and again I come across a situation where it would be
really useful to be able to query the assembler state during
assembly: for example, to query and do something based on the
current section name. This makes it possible to write generic
macros to do certain things which otherwise require manual
maintenance, or complex and fragile external preprocessing.
Below, I give a real-world example of the problem, and sketch out
a possible solution.
What do people think of this approach? Does anyone have any better
ideas on how to solve this?
Cheers
---Dave
EXAMPLE
An example is the generation of custom ancillary sections.
Suppose you want to write macros which record fixup information.
Currently, there's no way to put each fixup in an appropriately
named section automatically within gas. Tellingly, gas has had
to grow the ability to do this internally at least for ARM,
since the exception handling information in .ARM.ex{idx,tab}*
must go in sections with names based on the associated section
name. However, this ancillary section generation support is
neither flexible nor exposed to the user.
By putting fixups in sections whose names are based on the name
of the section they refer to, selective link-time discard of the
fixups (and hence the code referenced by the fixups) will work;
otherwise it doesn't. This would help avoid a situation where we
have to keep dead code in the kernel because custom fixups are
applied to it: at run-time, the code gets fixed up, then is
thrown away. The fixups can't be selectively discarded because
they are all in the same section: we seem to have no good
way to separate them out into separate sections appropriately.
For context, see:
http://www.spinics.net/lists/arm-kernel/msg112268.html
PROPOSAL
To solve the problem of generating custom ancillary sections
during assembly, here's a simple proposal: introducing a new kind of
macro argument can make aspects of the assembler state available to
macros in a flexible way, with only minimal implementation
required.
Basically, the macro qualifier field could be used to identify
arguments which are filled in by the assembler with information
about the assembly state, rather than being filled in by the
invoker of the macro: e.g.:
.macro mymacro name:req, flags, secname:current_section
/* ... */
.pushsection "\secname\name", "\flags"
/* ... */
.popsection
.endm
/* ... */
mymacro .ancillary, "a"
During expansion, \name and \flags are expanded as normal.
But \secname is substituted instead with the current section name,
so the macro expansion would look like this:
/* ... */
.pushsection ".text.ancillary", "a"
/* ... */
.popsection
Without the special :current_section argument, it doesn't appear
possible to implement a macro such as mymacro in a generic way.
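As a rough illustration of the substitution rule described above, here is a toy model in C. It is not gas code: the names toy_formal and toy_expand are invented for this sketch, and it only handles simple \name references. A formal marked as "current section" ignores whatever the caller passed and expands to the section name supplied by the assembler state.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Toy model of a macro formal; these names are illustrative, not gas's own. */
enum toy_formal_type { TOY_FORMAL_PLAIN, TOY_FORMAL_CURRENT_SECTION };

struct toy_formal {
    const char *name;            /* formal name, without the backslash */
    enum toy_formal_type type;
    const char *actual;          /* caller-supplied text; unused for CURRENT_SECTION */
};

/* Append n bytes of src to out at *pos, keeping room for the terminator. */
static int toy_append(char *out, size_t outsz, size_t *pos,
                      const char *src, size_t n)
{
    if (*pos + n + 1 > outsz)
        return -1;
    memcpy(out + *pos, src, n);
    *pos += n;
    out[*pos] = '\0';
    return 0;
}

/* Expand \name references in body. Returns 0 on success, -1 if out is too small. */
int toy_expand(const char *body, const struct toy_formal *formals,
               size_t nformals, const char *current_section,
               char *out, size_t outsz)
{
    size_t pos = 0;

    assert(body != NULL && current_section != NULL && out != NULL);
    if (outsz == 0)
        return -1;
    out[0] = '\0';

    while (*body != '\0') {
        const struct toy_formal *hit = NULL;
        const char *id = body + 1;
        size_t len = 0;

        if (*body == '\\') {
            /* Collect the identifier that follows the backslash. */
            while (isalnum((unsigned char)id[len]) || id[len] == '_')
                len++;
            for (size_t i = 0; i < nformals; i++) {
                if (strlen(formals[i].name) == len &&
                    strncmp(formals[i].name, id, len) == 0) {
                    hit = &formals[i];
                    break;
                }
            }
        }

        if (hit == NULL) {
            /* Ordinary character, or a backslash that names no formal. */
            if (toy_append(out, outsz, &pos, body, 1) != 0)
                return -1;
            body++;
            continue;
        }

        /* A current-section formal is filled in from assembler state. */
        const char *text = (hit->type == TOY_FORMAL_CURRENT_SECTION)
                               ? current_section
                               : hit->actual;
        if (toy_append(out, outsz, &pos, text, strlen(text)) != 0)
            return -1;
        body = id + len;
    }
    return 0;
}
```

Calling toy_expand with the formals name=".ancillary", flags="a" and a current-section formal secname, while the current section is ".text", turns the .pushsection line of mymacro into the expanded form shown in the email.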
This surely isn't the only way to achieve the goal, and it's
probably not the best way, but it does have some desirable
features.
Principally, while a new pseudo-op(s) could have been defined to
append text to the current section name, etc., allowing the current
section name to appear as a macro parameter avoids prejudicing
the way the text is used. So there should never be a need to
introduce additional pseudo-ops to do things with the current
section name: with this patch, the user can always implement their
own macro to do the desired thing. This gets the desired
behaviour and maximum flexibility, while keeping the implementation
in gas very simple.
Also, using the macro expansion system in this way allows the
caller a free choice of macro parameter names, and so pretty much
guarantees that existing code won't get broken by the change.
Because my hack is currently simplistic, it has shortcomings: in
particular, it's not desirable to parse an argument from the
invocation line at all to fill a :current_section argument.
Currently, an argument is read in if present, but its value is
ignored and the current section name pasted in at macro expansion
time instead. However, that should be straightforward to fix with
a bit more code.
Of course, there's no reason only to expose the current section name
in this way. Any aspect of the assembler state (current
subsection, current section flags, current instruction set, current
macro mode, etc.) could be made available in a similar way.
USAGE EXAMPLE AND PATCH
Note that the specific implementation described here is intended
to be illustrative, rather than complete or final.
binutils$ cat <<EOF >tst.s
.macro push_ancillary_section name:req, flags, csec:current_section
.pushsection "\name\csec", "\flags"
.endm
.macro register_fixup
_register_fixup 100\@
.endm
.macro _register_fixup label:req
\label :
push_ancillary_section .fixup, "a"
.long \label\(b)
.popsection
.endm
.long 1
register_fixup
.long 2
.data
.long 3
register_fixup
.long 4
.long 5
register_fixup
.long 6
EOF
binutils$ gas/as-new -ahlms -o tst.o tst.s
ARM GAS tst.s page 1
1 .macro push_ancillary_section name:req, flags, csec:current_section
2 .pushsection "\name\csec", "\flags"
3 .endm
4
5 .macro register_fixup
6 _register_fixup 100\@
7 .endm
8
9 .macro _register_fixup label:req
10 \label :
11 push_ancillary_section .fixup, "a"
12 .long \label\(b)
13 .popsection
14 .endm
15
16 0000 01000000 .long 1
17 register_fixup
17 > _register_fixup 1000
17 >> 1000:
17 >> push_ancillary_section .fixup,"a"
17 >>> .pushsection ".fixup.text","a"
17 0000 04000000 >> .long 1000b
17 >> .popsection
18 0004 02000000 .long 2
19
20 .data
21 0000 03000000 .long 3
22 register_fixup
22 > _register_fixup 1003
22 >> 1003:
22 >> push_ancillary_section .fixup,"a"
22 >>> .pushsection ".fixup.data","a"
22 0000 04000000 >> .long 1003b
22 >> .popsection
23 0004 04000000 .long 4
24 0008 05000000 .long 5
25 register_fixup
25 > _register_fixup 1006
25 >> 1006:
25 >> push_ancillary_section .fixup,"a"
25 >>> .pushsection ".fixup.data","a"
25 0004 0C000000 >> .long 1006b
25 >> .popsection
26 000c 06000000 .long 6
ARM GAS tst.s page 2
NO DEFINED SYMBOLS
NO UNDEFINED SYMBOLS
binutils$ arm-linux-gnueabi-objdump -rs tst.o
tst.o: file format elf32-littlearm
RELOCATION RECORDS FOR [.fixup.text]:
OFFSET TYPE VALUE
00000000 R_ARM_ABS32 .text
RELOCATION RECORDS FOR [.fixup.data]:
OFFSET TYPE VALUE
00000000 R_ARM_ABS32 .data
00000004 R_ARM_ABS32 .data
Contents of section .text:
0000 01000000 02000000 ........
Contents of section .data:
0000 03000000 04000000 05000000 06000000 ................
Contents of section .fixup.text:
0000 04000000 ....
Contents of section .fixup.data:
0000 04000000 0c000000 ........
Contents of section .ARM.attributes:
0000 41150000 00616561 62690001 0b000000 A....aeabi......
0010 08010901 2c01 ....,.
diff --git a/gas/macro.c b/gas/macro.c
index e392883..95c4de1 100644
--- a/gas/macro.c
+++ b/gas/macro.c
@@ -516,6 +516,8 @@ do_formals (macro_entry *macro, int idx, sb *in)
formal->type = FORMAL_REQUIRED;
else if (strcmp (qual.ptr, "vararg") == 0)
formal->type = FORMAL_VARARG;
+ else if (strcmp (qual.ptr, "current_section") == 0)
+ formal->type = FORMAL_CURRENT_SECTION;
else
as_bad_where (macro->file,
macro->line,
@@ -540,6 +542,15 @@ do_formals (macro_entry *macro, int idx, sb *in)
name,
macro->name);
}
+ else if (formal->type == FORMAL_CURRENT_SECTION)
+ {
+ sb_reset (&formal->def);
+ as_warn_where (macro->file,
+ macro->line,
+ _("Pointless default value for current_section parameter `%s' in macro `%s'"),
+ name,
+ macro->name);
+ }
}
/* Add to macro's hash table. */
@@ -734,7 +745,11 @@ sub_actual (int start, sb *in, sb *t, struct hash_control *formal_hash,
ptr = (formal_entry *) hash_find (formal_hash, sb_terminate (t));
if (ptr)
{
- if (ptr->actual.len)
+ if (ptr->type == FORMAL_CURRENT_SECTION)
+ {
+ sb_add_string (out, segment_name (now_seg));
+ }
+ else if (ptr->actual.len)
{
sb_add_sb (out, &ptr->actual);
}
diff --git a/gas/macro.h b/gas/macro.h
index edc1b6b..ea6cabb 100644
--- a/gas/macro.h
+++ b/gas/macro.h
@@ -38,7 +38,8 @@ enum formal_type
{
FORMAL_OPTIONAL,
FORMAL_REQUIRED,
- FORMAL_VARARG
+ FORMAL_VARARG,
+ FORMAL_CURRENT_SECTION,
};
/* Describe the formal arguments to a macro. */
_______________________________________________
linaro-toolchain mailing list
linaro-toolchain(a)lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain
RAG:
Red:
Amber: OMAP3 patch upstreaming is slower progress than hoped
Green:
Current Milestones:
|| || Planned || Estimate || Actual ||
||qemu-linaro-2011-07 || 2011-07-21 || 2011-07-21 || ||
Historical Milestones:
||qemu-linaro 2011-04 || 2011-04-21 || 2011-04-21 || 2011-04-21 ||
||qemu-linaro 2011-05 || 2011-05-19 || 2011-05-19 || n/a ||
||close out 1105 blueprints || 2011-05-28 || 2011-05-28 || 2011-05-19 ||
||complete 1111 planning || 2011-05-28 || 2011-05-28 || 2011-05-27 ||
||qemu-linaro-2011-06 || 2011-06-16 || 2011-06-16 || 2011-06-16 ||
== upstream-omap3-patches ==
* split and did most of the cleanup of 'overhaul onenand support' patch
* updated the omap gpio qdev patchset in response to review comments,
just about ready to send v2
* this is going more slowly than I had anticipated
== other ==
* patch review, etc
* confirmed attendance at KVM Forum and LinuxCon NA
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
1-5 August: Linaro sprint 1111
15-19 August: KVM Forum and LinuxCon NA, Vancouver
Hi
The Linaro backport PPA [1] has been updated to the latest versions of the
armel cross toolchains; oneiric packages were used as a base.
What got changed:
- gcc 4.4 was updated to 4.4.6-3ubuntu1
- gcc 4.5 was updated to 4.5.3-1ubuntu2
- binutils was updated to 2.21.52.20110606-1ubuntu1
- eglibc was updated to 2.13-6ubuntu2
- gcc 4.6 was provided as 4.6.0-14ubuntu1 in Maverick, Natty
- gcc-defaults-armel-cross was updated to 1.6 in Maverick, Natty (uses
gcc-4.6 as default)
There is no gcc-4.6 for Lucid currently as it requires newer versions of
a few libraries (mpfr, mpc), and one of the rules of this PPA is "do not
update packages which may affect other packages".
Please test them and report any bugs found.
1. https://launchpad.net/~linaro-maintainers/+archive/toolchain/
== GDB ==
* Posted patch to fix shared library remote test problems (#804387).
* Started reviewing Yao's latest Thumb-2 displaced stepping patch.
== GCC ==
* Reviewed and approved Richard's mainline reload patch to fix
#803232 (ICE on code that uses vld4q_s16() NEON intrinsic).
* Followed up on gcc-patches to address concerns about Julian's
unaligned access patch.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
== This week ==
* Looked at why the fix for #721531 wasn't working on the Linaro branches.
Wrote follow-up patches for both. Fix now committed to 4.5 and 4.6.
* Looked at #736007. Submitted and committed patch upstream.
* Looked at some "odd" ivopts behaviour. It turned out that this was
working mostly as expected. I'm still wondering about a couple of tweaks.
* Looked briefly at a miscompilation of vector code that turned out to
occur during predictive commoning. I haven't yet checked whether the bug
is there or not. For the time being, I'm using -fno-predictive-commoning
so that I can get on with other stuff.
* Looked at why the auto-inc-dec stuff wasn't as effective with current
mainline. It turns out that the new misaligned load/store patterns
are using overly weak predicates, so it appears to the RTL optimisers
as though we support reg+offset addresses.
* Reviewed the shrink-wrap patch.
* More auto inc/dec.
== Next week ==
* Backport the fix for #736007.
* More loop stuff.
Richard
Achieved:
* HW in place. Borrowed another Panda board that I can use permanently.
* linaro-media-create installed and working. Flashed sd-card with 11.05 for
Panda board.
* Played around with the Panda board. Networking is on the way.
I can ssh to my computer from the Panda but not the other way around yet.
Issues:
* Serial log from Panda board not working yet. Mike suggested I try a
straight-through cable instead of a null-modem cable. (Have not done this
yet.)
* The ST-E firewall (proxy) is causing me some headache. But at least
everyone here is facing the same issues, so there are people to talk to
about the problems.
Good to know:
* I will be on vacation 18th July - 15 August
Best Regards
Åsa
Hi,
- continued working on prevention of over-widening in vectorization -
finalizing the patch
- improvement of vectorizer peeling heuristic - merged to gcc-linaro-4.6
- vectorization of widen-mult with over-promoted operands - proposed
for merge to gcc-linaro-4.6
- fixed PR 49610
- patch reviews
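
For readers unfamiliar with the term, a widening multiply produces a result wider than its operands. The two loops below are a hypothetical sketch of the kind of source pattern the widen-mult items refer to; the function names are invented here, and whether a given compiler vectorizes them depends on its version and flags. The second loop is one plausible reading of "over-promoted operands": the inputs are 8-bit, but C's integer promotions and the 32-bit destination widen them by more than the 8-to-16-bit step a single widening multiply needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain widening multiply: 16-bit inputs, 32-bit products. */
void widen_mult_s16(const int16_t *a, const int16_t *b, int32_t *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = (int32_t)a[i] * (int32_t)b[i];
}

/* 8-bit inputs, 32-bit products: the operands are promoted to int before
   the multiply, although a 16-bit product would already hold the result. */
void widen_mult_u8_to_u32(const uint8_t *a, const uint8_t *b, uint32_t *out,
                          size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = (uint32_t)(a[i] * b[i]);
}
```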
I am on vacation tomorrow and on Sunday.
Ira
Hello,
I'm interested in LLVM working correctly on ARM hardware, and it looks
like LLVM is rather sensitive to which GCC version is used to compile
it. I tested LLVM 2.9 as a reference point and LLVM
HEAD as of June 29 on ARMv7 (two boards with two different Ubuntu
versions) compiled by GCC 4.3.4, 4.4.1, 4.4.5, 4.5.2, 4.6.1 Linaro
2011.05 and 4.6.1 Linaro 2011.06. Please see
http://ghcarm.wordpress.com/2011/07/03/llvm-on-arm-testing/
It looks like LLVM HEAD has
about 28 regressions in comparison with LLVM 2.9. But Linaro's GCC
4.6.1 releases also have some regressions in comparison with the older
GCC 4.3.4 and 4.4.1. What is also really interesting with LLVM is how
many tests fail when it is compiled with -O2 or the default -O3
compilation option. I don't know whether the culprit here is the LLVM
code or GCC miscompilation/overoptimization.
Is there any testing I may do to help you fix those regressions?
Thanks,
Karel
Hi,
* continued to look into how to add remote support for libunwind using
ptrace
* reworked the lookup of the ARM specific unwind tables for local
unwinding
* re-use the existing (DWARF related) infrastructure to find the ARM
specific unwind tables rather than doing it on our own
* removes some code and the limitation of only supporting a fixed
number of unwind tables
* (hopefully) smooths the way for remote unwinding via ptrace
* submitted a patch that fixes the syntax of inline assembly for
some GCC versions
Regards
Ken
(This is a combined report having realized that I didn't manage to
send this out on Monday because of some power issues).
== GCC ==
== Progress ==
* Cleared out some backlog of backports from mainline.
* Investigating some of the BRANCH_COST results.
* Small patches to fix VFP constraints being tested and fixed on trunk
upstream.
* Upstream bugzilla triaging.
* Looked at Neon 64 bit arithmetic and sent out report.
* Reviewed the unaligned access patch and the EPILOGUE_USES regression
with coremark.
* Issues with my Panda board mean my benchmark runs aren't happening.
Need to urgently find a way of running them. The Panda board dies
mysteriously once in a while. Playing around with the kernel to get
something going. Will look at it again next week.
* Merged neon shift immediates patch into linaro-4.6 but needs some rework.
* Merged the fix for LP744754 into linaro 4.5 and 4.6
== Plans ==
* Back on to BRANCH_COST.
* Resubmit neon shift immediates patch for merging.
* Look at some of the perf regressions between A8 and A9.
* Get a working Panda board again!
* Backport arith_shiftsi patch to 4.6 branch upstream.
Meetings:
* 1-1s
* TCWG calls.
== This week ==
* More on the address-of-main() bug. The original patch caused
regressions on x86_64, so I submitted a different approach,
which has now been applied upstream.
* Worked on the libnih bug. It turned out to be a problem in the
libc start routine.
* A bit of patch review.
* More on auto inc/dec.
== Next week ==
* General bug-fixing.
* More auto inc/dec.
Richard
== 64 bit atomics ==
* Submitted patches to gcc patch list
- One comment back already asking if we should really change ARM
to have a VDSO to make checks of the user helper version easier
* Added thumb ldrexd/strexd to valgrind; patch added to bug in their
bugtracker (KDE bug 266035)
* Came to the conclusion eglibc doesn't actually need any changes
- It's got a place holder for a 64bit routine, but it's unused and
isn't exposed to the libc users
- Note that neither eglibc nor glibc built cleanly from the trunk on ARM.
* Started digging into QEMU a bit more to find out how to solve the
helper problem
== String routines ==
* Added SPEC2k6 string routine results to my charts; while most
stuff is in the noise it seems the bionic routine is a bit slower
overall than everything else, and my absolutely trivially simple
~5 instruction loop is a tie for the fastest with my smarter 4
byte/loop using uadd.
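
To make the two loop shapes concrete, here is a hedged C sketch. It assumes the routine is a strlen-style scan, which the report does not state, and it bounds the scan by a buffer length so that every read stays inside the buffer. The ARM uadd-based test is not expressible in portable C, so the 4-bytes-per-iteration version swaps in the classic "has a zero byte" bit trick instead. The function names are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Trivial loop: one byte per iteration. */
size_t scan_bytewise(const char *s, size_t maxlen)
{
    size_t n = 0;

    assert(s != NULL);
    while (n < maxlen && s[n] != '\0')
        n++;
    return n;
}

/* Nonzero iff any of the four bytes in w is zero. */
static int word_has_zero_byte(uint32_t w)
{
    return ((w - 0x01010101u) & ~w & 0x80808080u) != 0;
}

/* Four bytes per iteration, then a bytewise tail to locate the exact end. */
size_t scan_wordwise(const char *s, size_t maxlen)
{
    size_t n = 0;

    assert(s != NULL);
    while (maxlen - n >= 4) {
        uint32_t w;
        memcpy(&w, s + n, sizeof w);   /* avoids alignment and aliasing issues */
        if (word_has_zero_byte(w))
            break;
        n += 4;
    }
    while (n < maxlen && s[n] != '\0')
        n++;
    return n;
}
```

Both functions return the offset of the first NUL, or maxlen if none is found within the buffer.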
== Next week ==
* Sleep, Rest, Relaxation, getting older
* (Will be polling email for any more follow ups on my gcc patches)
Dave
RAG:
Red:
Amber:
Green:
Current Milestones:
|| || Planned || Estimate || Actual ||
||qemu-linaro-2011-07 || 2011-07-21 || 2011-07-21 || ||
Historical Milestones:
(q1 milestones deleted)
||qemu-linaro 2011-04 || 2011-04-21 || 2011-04-21 || 2011-04-21 ||
||qemu-linaro 2011-05 || 2011-05-19 || 2011-05-19 || n/a ||
||close out 1105 blueprints || 2011-05-28 || 2011-05-28 || 2011-05-19 ||
||complete 1111 planning || 2011-05-28 || 2011-05-28 || 2011-05-27 ||
||qemu-linaro-2011-06 || 2011-06-16 || 2011-06-16 || 2011-06-16 ||
== upstream-omap3-patches ==
* analysed omap3 patchstack and identified some initial parts to pull out
* extracted and submitted upstream a 3-patch set to qdevify omap gpio
* disentangled a cluster of patches for fixing and qdevifying NAND,
ONENAND and the OMAP GPMC model; started on the cleanup process
* spotted and fixed a bug where n810 segfaults if a key is pressed
(this was already fixed in qemu-linaro but not very cleanly)
== other ==
* submitted patch for proper implementation of prlimit64 syscall
for qemu usermode
* sent a patch to fix the final gcc 4.6 write-only-variable warning
* patch review, etc
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
1-5 August: Linaro sprint 1111
(maybe) 15-16 August: QEMU/KVM strand at LinuxCon NA, Vancouver
[LinuxCon proper follows on 17-19th]
* Got introduced to most of the team members - great!
* My manager fixed a Linaro laptop for me which I have installed with
Natty. (This is very good since I cannot run as root on my STEricsson
laptop. There are also other security restrictions that make it very hard
to work in Linaro.)
* Started working a little bit with the Snowball board. I have managed to
flash it using the "riff" tool.
* Read through the material about SPEC2000.
* I have borrowed a Panda board that I can use for the next two weeks. A
question that popped up is how to get one from Linaro?
* I will be on vacation 18th July - 15 August
Best regards
Åsa
It all started with this:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/791552
basically, switching toolchain from 4.5 to 4.6, somehow broke usb on omap3.
This morning i tested with linux-linaro-natty:
[flag@newluxor linux-linaro-natty]$ git log --oneline -1
f15fd8f LINARO: Linux-linaro-2.6.38-1002.3
and i can reproduce the problem there too.
Here is some more info (dmesg, toolchain, lsusb, etcetc):
http://pastebin.ubuntu.com/620786/
--
bye,
p.
Hi,
- support of multiple uses of original pattern statements (needed for
over-promotion work) - committed upstream
- support of widen-mult of unsigned types and constants - merged to
gcc-linaro-4.6
- vectorizer peeling heuristic improvement - proposed to merge to gcc-linaro-4.6
Ira
Here's an implementation of an 8x8 integer DCT done with NEON
intrinsics -- essentially a translation of the assembly version in
libjpeg-turbo trunk:
https://github.com/mkedwards/crosstool-ng/blob/master/patches/libjpeg-turbo…
It is in a compilable (on Linaro 2011.05 GCC 4.5, anyway; a recent
Linaro 4.6 snapshot ICEs) but otherwise untested state. Still, it's
interesting to compare the assembly that it generates against the
hand-written version. I thought I'd give linaro-toolchain a heads-up
in case y'all could use a test case that generates plenty of pressure
on the VFP/NEON register bank. (I intend to use it to see how much
performance difference there really is, on the A8 and A9, between NEON
code compiled for 16 vs. 32 registers.)
Cheers,
- Michael
I've added a pico-SAM9G45 board to my home office setup. It's called
libra1 and runs Debian 6.0. The nice thing about this board is it has
256 MB of RAM and should be able to run SPEC 2000.
I'll start using it to check that our GCC changes don't cause
significant performance regressions on earlier architectures. Dave,
you could use it to test the 64 bit primitives if you want.
There's more information on the configuration at:
http://bazaar.launchpad.net/~michaelh1/+junk/hardware/view/head:/libra/r1/R…
-- Michael
Hello,
I tried to build the gcc-linaro cross compiler on an x86_64
ubuntu-10.04 machine.
The build and host machine is x86_64 ubuntu-10.04; the target
is arm-eabi.
The build failed with the following error messages:
-----
checking host system type... arm-unknown-eabi
checking for arm-eabi-ar... arm-eabi-ar
checking for arm-eabi-lipo... arm-eabi-lipo
checking for arm-eabi-nm... /home/minslin/ET0001A/build-gcc/./gcc/nm
checking for arm-eabi-ranlib... arm-eabi-ranlib
checking for arm-eabi-strip... arm-eabi-strip
checking whether ln -s works... yes
checking for arm-eabi-gcc... /home/minslin/ET0001A/build-gcc/./gcc/xgcc
-B/home/minslin/ET0001A/build-gcc/./gcc/
-B/home/minslin/linaro-gcc/arm-eabi/bin/
-B/home/minslin/linaro-gcc/arm-eabi/lib/ -isystem
/home/minslin/linaro-gcc/arm-eabi/include -isystem
/home/minslin/linaro-gcc/arm-eabi/sys-include
checking for suffix of object files... configure: error: in
`/home/minslin/ET0001A/build-gcc/arm-eabi/libgcc':
configure: error: cannot compute suffix of object files: cannot compile
See `config.log' for more details.
make[1]: *** [configure-target-libgcc] Error 1
make[1]: Leaving directory `/home/minslin/ET0001A/build-gcc'
make: *** [all] Error 2
-----
The configure options I used were:
-----
../gcc-linaro-4.5-2011.06-0/configure --target=arm-eabi
--enable-languages=c,c++ --enable-shared --prefix=/home/minslin/linaro-gcc
--with-gmp=/home/minslin/gmp --with-mpfr=/home/minslin/mpfr
--with-mpc=/home/minslin/mpc
-----
I downloaded and built the GMP, MPFR and MPC packages. The
versions I used are:
gmp-5.0.2
mpfr-3.0.1
mpc-0.8.2
Please help me solve this problem.
Thanks.
Best regards,
------------------------------------------------------------
Min-Shong Lin (林敏雄)
Engineering Division
Global UniChip Corp. (創意電子)
EMAIL : mins.lin(a)globalunichip.com
TEL : +886-3-5646600 ext. 6937
== Atomics ==
* Testing the libgcc fallback code with Nicholas's kernel patch -
and then fixing my initialisation code to use init_array's (thanks
Richard for the hint)
* Tidying stuff up after a review of my patch by Richard - the
sync.md is now smaller than the original before I started.
* Discussing sync semantics with Michael Edwards - he's spotted that
the gcc ARM sync routines need to move their final memory barrier
for the compare-exchange case where the compare fails.
* Looking at valgrind; it looks like it should be OK with the
commpage changes; but it doesn't currently support ldrexd and strexd;
there is a patch for it to do ARM mode but not thumb yet.
Dave
== This week ==
* Catching up on email.
* More experimentation with the auto inc/dec stuff. TBH, this has taken
longer than expected, but I think it's close now.
* Wrote a dejagnu testcase for PR 49196. Tested it on trunk and submitted
it upstream.
== Next week ==
* Backport fix for PR 49196.
* Look at NEON reload failure.
* More auto inc/dec.
Richard
RAG:
Red:
Amber:
Green:
Current Milestones:
|| || Planned || Estimate || Actual ||
||qemu-linaro-2011-07 || 2011-07-21 || 2011-07-21 || ||
Historical Milestones:
||finish qemu-cont-integn || 2011-01-25 || 2011-01-25 || handed off ||
||first qemu-linaro release || 2011-02-08 || 2011-02-08 || 2011-02-08 ||
||qemu-linaro 2011-03 || 2011-03-08 || 2011-03-08 || 2011-03-08 ||
||qemu-linaro 2011-04 || 2011-04-21 || 2011-04-21 || 2011-04-21 ||
||qemu-linaro 2011-05 || 2011-05-19 || 2011-05-19 || n/a ||
||close out 1105 blueprints || 2011-05-28 || 2011-05-28 || 2011-05-19 ||
||complete 1111 planning || 2011-05-28 || 2011-05-28 || 2011-05-27 ||
||qemu-linaro-2011-06 || 2011-06-16 || 2011-06-16 || 2011-06-16 ||
== other ==
* wrote a number of patches fixing issues identified by exhaustive
testing of the ARM decoder. Still some parts of the Thumb decoder
to deal with.
* discussion about 1176 (somebody has some patches to add support for
it) and what set of feature switches are needed to support this and
the 1136r1 (they have most but not all of the v6K feature set)
* sent some patches which deal with the "VLDM/VSTM generate too many
TCG ops" bug by raising the limit on number of TCG ops
* sent a pull request for some ARM patches that had been languishing
on the mailing list
* tracked down a regression making vexpress crash when run on upstream
QEMU to a recent Xen-related patch
* working on AFDS (annual review) paperwork
Meetings: toolchain, standup
Lots of interrupts/bugfixing/review recently means I'm drifting slightly
behind schedule on blueprints. Need to refocus on those next week.
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
1-5 August: Linaro sprint 1111
(maybe) 15-16 August: QEMU/KVM strand at LinuxCon NA, Vancouver
[LinuxCon proper follows on 17-19th]
== GDB ==
* Completed setup and baseline run for remote gdb testsuite.
This involved tracking down and working around a variety
of problems including:
- issues with the cross-compiler packages
- board files and sysroot for the debugger to use
- timing problems in the dejagnu harness
- multiple problems in the GDB testsuite itself
At this point, I'm down to about a dozen extra FAILs and
about 2000 tests (out of 16000) that are not executed
in the remote testsuite for one reason or another. Next
step will be to analyze those and create bug reports.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
Hi,
- investigating detection of general over-widening cases in the vectorizer
- improvements of widen-mult - proposed for merge to gcc-linaro-4.6
- fixed PRs 49443 and 49478
Ira
Hi,
* in order to have Android's debuggerd use libunwind I looked at
libunwind's remote interface and especially the libunwind-ptrace lib
that sits on top of that.
* the remote interface seems a bit awkward to me. The user provides
a set of callbacks to access the inferior memory or registers. Instead
of using these callbacks to obtain the actual unwind information
(eh_frame for example) it requires the user to implement another callback
(find_proc_info) to look up the unwind info himself.
* the libunwind-ptrace currently does not support the ARM specific
unwind tables
* started to look into how to improve the situation
* not straightforward as it's tightly bound to eh_frame unw info
and libunwind's DWARF parsing mechanism
* attending a class this afternoon
* Note: I'm on vacation starting in a few hours and I'll be back on
Tuesday next week.
Regards
Ken
== Progress ==
* Backported A5 / A15 tuning to Linaro GCC. Waiting for test results.
* T2 perf. meeting.
* Backported the neon length patch.
* Patch for PR49385 being tested.
* Bootstraps broken yet again / upstream maintenance / test regressions.
* Waiting on BRANCH_COST results.
* Minor binutils patches as a result of upstream maintenance.
== Plans ==
* Finish BRANCH_COST tuning
* Look at VFP moves some more.
* Backport some of the upstream bug fixes that need to be done.
Meetings:
* 1-1s
* TCWG call.
* At Google unconference 17-19 June
Chaired the Toolchain Working group call. Michael H was unavailable (but
OK) following yet another earthquake in Christchurch.
Continued working on my widening multiplies patches. I did think for a
while there must be a logic flaw because it's using the wrong sized
inputs to instructions, but on closer inspection that was taken care of
in the RTL transformations. The changes I already had seem good in all
the test cases I could generate. I've also identified a number of
additional optimization opportunities, so I've been tweaking the patch
for those.
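
For readers unfamiliar with the term, a widening multiply is one whose result is wider than its operands. The sketch below shows the kind of C source pattern involved; the function names are illustrative only, and which instruction the compiler actually selects depends on the target and options.

```c
#include <stdint.h>

/* 32x32 -> 64-bit signed multiply. Casting one operand before the
 * multiply keeps the full 64-bit product; on ARM this is a candidate
 * for a single long-multiply instruction rather than a library call. */
int64_t widen_mul_s32(int32_t a, int32_t b)
{
    return (int64_t)a * b;
}

/* 16x16 -> 32-bit product accumulated into a 64-bit total: the
 * "halfword multiply and accumulate" shape that the SMLALxy family
 * of instructions implements. */
int64_t widen_mac_s16(int64_t acc, int16_t a, int16_t b)
{
    return acc + (int32_t)a * b;
}
```

The point of the optimisation is that the source never needs a full 64x64-bit multiply, even though the result type is 64 bits wide.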
Continued trying to figure out why my Thumb2 constants patches break the
native bootstrap build. The stage2 compiler enters an infinite loop, but
I couldn't easily identify why, as yet.
Backported Julian's unaligned access patches to a Linaro test branch.
----
Upstream patches requiring review:
* NEON scheduling patch
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html
== This week ==
* Experimented more with A8 and A9 tuning for auto inc/dec addresses.
* More work on the auto inc/dec pass itself.
* Compared the assembly output in the GCC testsuite for a range
of targets (avr, bfin, h8300, ia64, m32c, m32r, m68k, am33,
pa, pdp11, powerpc, rx, score, sh, xstormy16 and vax).
* Looked at a few bug reports. Will submit patches when I get back.
== Next week ==
* Holiday!
Richard
== 64 bit Atomics ==
* Wrote more test cases; now have a nice 3 thread test that passes -
and more importantly, it fails if I replace one of the atomic ops
by a non-atomic equivalent.
* Modified existing atomic helper code in libgcc to do 64bit
* Added init function to 64bit atomic helper to detect presence of
new kernel and fail if an old one is present.
That last one is a bit of a pain: on existing kernels it now correctly
detects the old kernel and aborts, but qemu user space segfaults
because the access to the kernel helper version address is uncaught. So
the first thing I need to do is try the early kernel patch Nicolas
sent around, and then I really need to see if qemu can be firmly
persuaded to run it.
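
A minimal sketch of the kind of multi-threaded test described above, assuming POSIX threads and GCC's `__sync_fetch_and_add` builtin. This is not the actual test case; the thread count matches the report, but the iteration count and the increment are arbitrary choices made for illustration.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define THREADS 3
#define ITERS   100000

static uint64_t counter;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERS; i++) {
        /* Adding 2^32 + 1 changes both 32-bit halves on every step,
         * so a torn (non-atomic) 64-bit update corrupts the total. */
        __sync_fetch_and_add(&counter, 0x100000001ULL);
    }
    return NULL;
}

/* Returns 1 if the final count is exact, 0 if updates were lost,
 * and -1 if a thread could not be started. */
int run_atomic_test(void)
{
    pthread_t threads[THREADS];
    int started = 0;
    int ok = 1;

    counter = 0;
    for (int i = 0; i < THREADS; i++) {
        if (pthread_create(&threads[i], NULL, worker, NULL) != 0) {
            ok = 0;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    if (!ok)
        return -1;
    return counter == (uint64_t)THREADS * ITERS * 0x100000001ULL;
}
```

Replacing the builtin with a plain `counter += 0x100000001ULL` would typically make the check fail on a multi-core machine, which is the property the report describes.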
== String routines ==
* Ran denbench with sets of strlen; started running some spec as well.
== QEmu ==
* Tested Peter's pre-release tarball in user space and a bunch of
system emulations
- successfully managed to say hello to #linaro from an emulated
overo board using USB keyboard.
== Other ==
* Booked 4th July week off.
Dave
Hi,
* short week (Monday -> public holiday, Wednesday -> attended a class)
* tested the gcc-linaro-4.5-2011.06 with linaro-android on my panda
* works! no noticeable differences to 4.5-2011.05 for me
* libgui.so apriori prelink issue remains:
* realized that apriori works quite differently from GNU prelink
* checked the prelink map against LOAD segment size - looks good
* built with -DLINKER_DEBUG=1 - still doesn't give hints
* giving up for now and moving on
* made a patchset to be able to build the whole of android with -DDEBUG
* except for v8 and webkit
* zygote segfaults for unknown reason
* understood the basics of the current backtracing mechanism on android
* the bionic linker registers a signal handler for SIGSEGV that opens
a socket
* debuggerd listens on that socket and gets the crashed process's
registers etc. using ptrace
* Note: I'll be on vacation starting on Thursday next week
Regards
Ken
== GDB ==
* Committed support for NEON registers in core dumps (bug #615972)
to mainline GDB and binutils repositories.
* Added support to readelf (mainline binutils) to correctly display
NEON register core file notes.
* Started looking into remote gdb testsuite.
== GCC ==
* Investigated reload failure when building kernel with Linaro GCC 4.5
(discovered by Arnd).
* Investigated stray function references due to partial inlining
breaking kernel build with Linaro GCC 4.6 (discovered by Arnd).
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
Hi Michael,
This thread about how to generate ancillary sections using gas has
resurfaced again. Do you know who might be available from the
toolchain group to take a look at this?
It appears that this issue can best be solved by a change to gas
(or, possibly, to gcc).
Cheers
---Dave
References:
Dynamic patching in discarded sections
<http://thread.gmane.org/gmane.linux.kernel/1152142>
Generating ancilliary sections with gas
<http://thread.gmane.org/gmane.linux.linaro.toolchain/686>
There is an FTBFS (fails to build from source) bug on armel for postler:
https://bugs.launchpad.net/ubuntu/+source/postler/+bug/791319
Attached test compiles fine on amd64 but fails on armel:
20:16 hrw@malenstwo:postler-0.1.1$ gcc _build_/.conf_check_0/test.c
_build_/.conf_check_0/test.c: In function ‘main’:
_build_/.conf_check_0/test.c:5:1: error: incompatible type for argument
3 of ‘vasprintf’
/usr/include/stdio.h:396:12: note: expected ‘__gnuc_va_list’ but
argument is of type ‘char *’
Can someone explain to me why this happens?
The Ubuntu armel and armhf cross compilers and CSL 2011.03-42 have the
same problem.
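
One likely reading of the error, offered as an assumption since test.c itself is not shown: on ARM EABI targets va_list is a structure type rather than a plain pointer, so a `char *` cannot be passed where a va_list is expected, whereas on targets where va_list is pointer-like the same mistake may slip through. The sketch below shows the portable way to hand variadic arguments to a v*printf function. It uses the standard vsnprintf rather than the GNU vasprintf only to stay within ISO C; vasprintf takes its va_list argument the same way. The name `format_into` is illustrative.

```c
#include <stdarg.h>
#include <stdio.h>

/* Format into buf. The variadic arguments reach vsnprintf only
 * through a va_list obtained with va_start, never through a cast. */
int format_into(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}
```

A configure-style check that passes some other type in place of the va_list only compiles on targets where the two types happen to be compatible.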
Hi
Yesterday I looked at bug [1] in chromium-browser, but checked the oneiric
version: 12.0.742.91~r87961. It failed when linking against the CUPS
libraries, so I looked at the upstream repository and found fix [2]. With
the patch applied it failed at the same place:
LINK(target) out/Release/chrome
[keep-alive] czw, 16 cze 2011, 18:21:39 CEST (441 min)
/usr/bin/ld.bfd.real:
out/Release/obj.target/third_party/cacheinvalidation/../../cacheinvalidation_proto_cpp/gen/protoc_out/google/cacheinvalidation/internal.pb.o(.text._ZN12invalidation21ClientToServerMessage27MergePartialFromCodedStreamEPN6google8protobuf2io16CodedInputStreamE+0x7d2): unresolvable R_ARM_THM_CALL relocation against symbol `operator new(unsigned int)@@GLIBCXX_3.4'
/usr/bin/ld.bfd.real: final link failed: Nonrepresentable section on
output
collect2: ld returned 1 exit status
make[1]: *** [out/Release/chrome] Error 1
make[1]: Leaving directory
`/home/hrw/devel/porting-jam/chromium-browser-12.0.742.91~r87961/build-tree/src'
make: *** [debian/stamp-makefile-build] Error 2
dpkg-buildpackage: error: debian/rules build gave error exit status 2
According to bugs [3][4] package may work when built without -fPIE but
(as Loic wrote in [3]) "-fPIE is needed for some security features and
web browsers are a critical piece of software that we want to protect".
Could someone take a look at this issue?
One warning: this failure happens after 440 minutes of building on a
PandaBoard with a USB HDD (the build took 1.8GB of space). If needed I can
provide an account on my Panda with access to the build directory (but
tomorrow rather than today) - IPv6 address only.
1.
https://bugs.launchpad.net/ubuntu/+source/chromium-browser/+bug/791283
2.
http://git.chromium.org/gitweb/?p=chromium/chromium.git;a=patch;h=c2b99f3fe…
3. https://bugs.launchpad.net/binutils-linaro/+bug/641126
4. http://code.google.com/p/chromium/issues/detail?id=55439
Hi,
- fix vectorizer testsuite failures on ARM - committed
- committed a fix of a bug in the vectorizer revealed by the widen-mult patch
- committed an improvement of peeling heuristic
- reduce over-widening in case of multiplication by a constant
(improves vectorized rgbyiq by almost 2x) - committed
- started backporting to gcc-linaro-4.6
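
As a rough illustration of the over-widening case with a constant multiplier (an assumed example, not code taken from rgbyiq): C's integer promotions make the multiply below happen in `int`, although the result always fits in 16 bits, so a vectorizer can use an 8-bit to 16-bit widening multiply instead of going through 32-bit lanes.

```c
#include <stddef.h>
#include <stdint.h>

/* Scale 8-bit samples by a small constant into 16-bit results.
 * 255 * 77 = 19635, so the product never needs more than 16 bits
 * even though C evaluates it as an int. */
void scale_u8(const uint8_t *in, uint16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * 77;
}
```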
Ira
The Linaro Toolchain Working Group is pleased to announce the release
of Linaro QEMU 2011.06.
Linaro QEMU 2011.06 is the latest monthly release of qemu-linaro.
Based off upstream (trunk) QEMU, it includes a number of ARM-focused
bug fixes and enhancements.
This release introduces two new features; these are still experimental
so please report any issues:
- A model of the Gumstix Overo board; this is an OMAP3 based system
similar to Beagle but with the advantage of having supported
onboard ethernet.
- USB keyboard and mouse support, if your kernel includes support for
the OMAP3 OHCI controller (not just EHCI). Try adding
"-device usb-kbd -device usb-mouse" to your QEMU command line for
Beagle or Overo models.
Other interesting changes include:
- A fix for the lack of graphics output on the Beagle model when
running the Linaro 11.05 final release image
- Suppression of the "Bad register 0x000000f8" warnings provoked by
the Linaro 11.05 final release kernels
- As usual, various minor correctness fixes and other upstream changes
Known issues:
- The beagle and beaglexm models still do not support USB networking
- There are some gcc 4.6 warnings about "variable set but not used"
which have not yet been resolved; Ubuntu Oneiric's gcc makes these
non-fatal, but if you're building with an upstream gcc 4.6 you may
need to add the "--disable-werror" option to configure
The source tarball is available at:
https://launchpad.net/qemu-linaro/+milestone/2011.06
Binary builds of this qemu-linaro release are being prepared and
will be available shortly for users of Ubuntu. Packages will be in
the linaro-maintainers tools ppa:
https://launchpad.net/~linaro-maintainers/+archive/tools/
More information on Linaro QEMU is available at:
https://launchpad.net/qemu-linaro
The Linaro Toolchain Working Group is pleased to announce the release
of both Linaro GCC 4.6 and Linaro GCC 4.5.
Linaro GCC 4.6 is the fourth release in the 4.6 series. Based off the
latest GCC 4.6.0+svn174261, it adds new optimisations and vectoriser
improvements.
Interesting changes include:
* Updates to 4.6.0+r174261
* Blocks can now vectorise into ORN and BIC instructions
* Support for half word to double word multiply and accumulate operations
* Better support for other widening multiply operations
* Further performance improvements in NEON strided loads and stores
* Performance improvements targeted at EEMBC CoreMark
Fixes:
* PR target/48454: Set the lengths correctly for the case with Quad vectors.
Known issues:
* Building Python 2.7 with -mfpu=neon exposes a bug in vmov.i64 in
binutils 2.20.51. Please use 2.21 or later.
The strided load/store improvements allow the vectoriser to
efficiently access values that occur at every n'th address, such as
all of the red values in an RGB image or all of the left channel
samples in an interleaved audio array. For example, a plain C function
that converts between RGB and CMYK now runs 7.3x faster on an A9.
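
A minimal sketch of a strided access loop of the kind described, with an illustrative function name. Each iteration reads every third byte, which is the pattern that NEON's de-interleaving loads (such as vld3) can serve when the vectoriser supports strided accesses.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy the red channel out of packed RGB pixels: the load has a
 * stride of three bytes, the store is contiguous. */
void extract_red(const uint8_t *rgb, uint8_t *red, size_t pixels)
{
    for (size_t i = 0; i < pixels; i++)
        red[i] = rgb[3 * i];
}
```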
Linaro GCC 4.5 2011.06 is the eleventh release in the 4.5 series.
Based off the latest GCC 4.5.3+svn174250, it is a maintenance focused
release.
Interesting changes in 4.5 include:
* Updates to 4.5.3+r174250
Fixes:
* LP: #744754: ICE in reload_cse_simplify_operands, at
postreload.c:402 with neon optimized code
* LP: #748138: ICE in redirect_jump, at jump.c:1443
The source tarball is available from:
https://launchpad.net/gcc-linaro/+milestone/4.6-2011.06-0
https://launchpad.net/gcc-linaro/+milestone/4.5-2011.06-0
Downloads are available from the Linaro GCC page on Launchpad:
https://launchpad.net/gcc-linaro
Mailing list: http://lists.linaro.org/mailman/listinfo/linaro-toolchain
Bugs: https://bugs.launchpad.net/gcc-linaro/
Questions? https://ask.linaro.org/
Interested in commercial support? inquire at support(a)linaro.org
-- Michael
Hi there,
I tried to use android-toolchain-eabi-4.5-2011.05-0-linux-x86.tar.bz2
to enable hard float ABI.
But the build failed on Android 2.3.3. Most errors look like address
alignment issues.
Do you have a patch for that, and has it been tested successfully?
Thanks
Richard
My testing for the widening multiplies patches came back clean. I've
committed it upstream, and merged it to Linaro GCC 4.6.
The testing for my Thumb2 constants patches failed (bootstrap failure on
ARM), so can't commit that right away. The failure is a SIGSEGV in the
stage 2 compiler building its own libgcc. This is going to be tricky to
pin down!
Continued work on widening operations in the GCC middle-end. I now have
it doing all the arbitrary width widening operations that I want. I've
tested that it didn't do the wrong thing for just about all permutations
of input and output types, but I've yet to do anything like a bootstrap
test or anything. There's still quite a lot more work to do in tidying
it up.
----
Upstream patches requiring review:
* NEON scheduling patch
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html
Tracked a bootstrap failure with SMS flags on ARM machine exposed in
recent trunk.
Fixed and tested a patch for that.
Tested another SMS patch following comments received from gcc ml@
(http://gcc.gnu.org/ml/gcc-patches/2011-05/msg02294.html)
Following a conversation with Michael, added SPEC2006 analysis info to
the benchmarks wiki page:
https://wiki.linaro.org/Internal/ToolChain/Benchmarks
Discussed with Ramana weird RTL pattern generated for
thumb2_movhi_insn with -march=armv7-a -mthumb using trunk. Probably
will open a PR for that.
Hi,
* learning more about Android (repo tool, some branches, the basics of the
Android build system)
* finished setting up my environment to build the Android sources
* successfully built linaro-android using linaro gcc 4.4, 4.5 and 4.6
* tracked down the libgui.so linaro-android issue to the apriori
prelinker
* workaround: disable prelinking for this module (patch -> linaro-dev)
* I still don't understand what exactly is supposed to be prelinked
and what exactly is causing the failure
* Note: next Monday is a public holiday in Germany
Regards
Ken
== String Routines ==
* Completed gathering the SPEC2k6 memcpy results, graphed them, sent them out
* Gathered SPEC2k6 memset results, graphed them, sent them out
== 64bit Atomics ==
* Modified gcc backend to do 64bit Atomic ops - the code looks good,
but I've not
done much testing yet.
== Other ==
* Upstreamed a small ltrace patch
Next week:
Plan is to get gcc tests done and attack libgcc for the pre-v7
fallbacks (the tricky
bit there is runtime deciding what to use)
Also run spec and denbench for strlen and some other string routines
== Progress ==
* Spent some time on the VFP moves and looked at ivopts for a bit.
Analysing a couple of options here.
* Committed the DImode moves patch upstream.
* Fixed PR49335 where GCC was generating rsb ip, sp, ip lsl #2
* Proposed a fix for PR48454 (finally being able to reproduce it) -
was a case of a missing length attribute for vec_pack_trunc.
* Investigated some regressions in v1 of a popular embedded benchmark
- leads to BRANCH_COST tuning.
* Merged my neon-vorn-vbic patch into linaro-gcc:4.6
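
For context on the vorn/vbic work, the two loops below are a sketch of the source patterns that map onto those operations: AND with a complement (BIC) and OR with a complement (ORN). The function names are illustrative, and whether a given compiler vectorises them depends on version and options.

```c
#include <stddef.h>
#include <stdint.h>

/* dst = a AND NOT b: the "bit clear" pattern. */
void and_not(uint32_t *dst, const uint32_t *a, const uint32_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] & ~b[i];
}

/* dst = a OR NOT b: the "or not" pattern. */
void or_not(uint32_t *dst, const uint32_t *a, const uint32_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] | ~b[i];
}
```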
== Plans ==
* Spend some time on the VFP moves and look at ivopts for a bit.
* Merge the fix for lengths to linaro-gcc:4.6 and 4.5 if applicable.
* Merge fix for PR49335 to linaro-gcc:4.5 and 4.6 .
* Find some time for some upstream patch review.
* T2 performance review meeting.
Meetings:
* 1-1s
* TCWG call.
RAG:
Red:
Amber:
Green: USB kbd+mouse finally working on QEMU beagle model
Current Milestones:
|| || Planned || Estimate || Actual ||
||qemu-linaro-2011-06 || 2011-06-16 || 2011-06-16 || ||
Historical Milestones:
||finish qemu-cont-integn || 2011-01-25 || 2011-01-25 || handed off ||
||first qemu-linaro release || 2011-02-08 || 2011-02-08 || 2011-02-08 ||
||qemu-linaro 2011-03 || 2011-03-08 || 2011-03-08 || 2011-03-08 ||
||qemu-linaro 2011-04 || 2011-04-21 || 2011-04-21 || 2011-04-21 ||
||qemu-linaro 2011-05 || 2011-05-19 || 2011-05-19 || n/a ||
||close out 1105 blueprints || 2011-05-28 || 2011-05-28 || 2011-05-19 ||
||complete 1111 planning || 2011-05-28 || 2011-05-28 || 2011-05-27 ||
== upstream-omap3-patches ==
* started on disentangling the patchstack: submitted patches upstream
for a few standalone fixes. First few steps in a big job...
== omap3-usb-model ==
* added QEMU's USB OHCI model to the omap3/beagle, and (after some
debugging and submitting a couple of OHCI bugfixes upstream)
got USB keyboard and mouse working
== linaro-qemu-11.11 ==
* rebased on master and identified gcc 4.6 compiler patches we need
== other ==
* added and tested patches for an overo board model
* added Beagle board support for returning EDID data from a fake
monitor so the kernel will actually turn the display on
* discussions about Android emulator (which looks likely to take the
upstreamed ARMv7 translator with the fixes we've worked on over
the last six months)
* office move
* QEMU 0.15 is not too far in the future: need to make sure all the
ARM stuff we want is in it
Meetings: standup, GSoC student, Christoffer Dall (working on KVM for ARM)
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
1-5 August: Linaro sprint 1111
(maybe) 15-16 August: QEMU/KVM strand at LinuxCon NA, Vancouver
[LinuxCon proper follows on 17-19th]
Hi,
- vectorization of widening multiplication of unsigned types and
constants - committed to mainline
- fix vectorizer testsuite failures on ARM - submitted
- testing a patch to fix a bug in the vectorizer revealed by the
widen-mult patch
- testing a patch to fix bad peeling heuristic that causes
gcc.dg/vect/vect-72.c to fail on ARM
Ira
I've done a quick write-up on the (almost) continuous builds done in
the toolchain group:
https://wiki.linaro.org/WorkingGroups/Builds
It's high level and includes things like what branches we watch, how
often they get built, where the results go, and how things like
testsuite results are shared. This is in follow up to the email on
testsuite diffs yesterday.
-- Michael
Public Holiday on Monday.
Learned that Linaro are reducing their funding to just one CodeSourcery
engineer, myself. Spoke to Chung-Lin to break the news and reassign him
to other work. Chung-Lin will now be working on MIPS16 Eglibc porting.
Pinged my ADDW/SUBW patch, again. Ramana finally reviewed it, so I've
addressed his concerns and reposted. The corrected patch was approved,
so I've set it to test before committing.
Continued work on widening multiplies tree optimizations in GCC. Bernd
made it sound quite easy, but changing the type of some operations means
quite a lot of tweaking and reworking in the rest of the compiler expand
routines. In particular, the widening stuff needs to be broken out of
expand_binop, and recast.
Merged, tested and committed the latest patches from FSF 4.5.
Merged, tested and committed the latest patches from FSF 4.6.
Richard Earnshaw approved my widening multiply RTL patch, so I've set
that to test in the Linaro test system.
Richard also approved my SMLALTB/SMLALTT patch. Set that testing also.
Responded to a question on ask.linaro.org.
----
Upstream patches requiring review:
* NEON scheduling patch
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html
Fixed an SMS patch following comments received in the gcc@ ml.
While testing the fix I discovered another issue-- latest mainline
ICEs with SMS flags while building libgcc on ARM configured with
--with-arch=armv7-a.
This new failure does not seem to be related to the above fix and I'm
now investigating it.
Looked at code generated for spec2006's libquantum, hmmer and
cactusADM_base benchmarks.
== String routines ==
* Wrote a hybrid ARM/Neon memcpy - it uses Neon for non-aligned cases or
large (>=128k) cases
* polished up and sent out write up of workload analysis of denbench and spec
* Ran denbench with all memcpy and memset variants, graphed up results
- SPEC 2k6 is now cooking with the memcpy set - it'll take all weekend.
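
The dispatch rule of the hybrid memcpy can be sketched as below. This is only the selection logic, not the copy routines themselves, and it rests on two assumptions: that "non-aligned" means either pointer is not 4-byte aligned, and that ">=128k" means 128 * 1024 bytes. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum copy_path { COPY_PATH_ARM, COPY_PATH_NEON };

/* ">=128k" from the report, read here as 128 KiB (assumption). */
#define NEON_SIZE_THRESHOLD ((size_t)128 * 1024)
/* Assumption: "aligned" means both pointers are 4-byte aligned. */
#define WORD_ALIGN_MASK     ((uintptr_t)3)

/* Pick NEON for misaligned or large copies, core ARM otherwise. */
enum copy_path choose_copy_path(const void *dst, const void *src, size_t n)
{
    uintptr_t bits = (uintptr_t)dst | (uintptr_t)src;

    if ((bits & WORD_ALIGN_MASK) != 0 || n >= NEON_SIZE_THRESHOLD)
        return COPY_PATH_NEON;
    return COPY_PATH_ARM;
}
```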
== 64 bit atomics ==
* Started looking through the GCC code at the existing non-64bit atomic code;
I need to understand how registers work in DI mode and what's going to be
needed in terms of temporaries.
Dave
== Progress ==
* Finished breaking down the Thumb2 performance blueprint
* Some patch review and bugzilla maintenance.
* Canonicalized vorn and vbic. Bootstrap failure reported. Fixed upstream.
* Rewrote parts of the DImode expanders and combined them to two
patterns with alternatives that get enabled based on the architecture
variant. While looking at the bug with adr's possibly going out of
range, it looks like there is a bug in const_ok_for_op with respect to
how it attempts to generate code for a DImode move of 0xffffffff which
can be implemented as a simple mvn but gets split into 3 instructions.
More explanations in the patch when it comes out.
* Thumb2 performance meeting this week.
* Talked to RichardS about A8 and Neon / auto-increment issues he was
seeing with scheduler descriptions and looked again at the A8 TRMs and
the examples.
* Looked at lrint and lrintf which are C99 functions for rounding and
created a prototype lrint and lrintf patch for GCC that now appears to
generate the vcvtr instructions.
== Plans ==
* Spend some time on the VFP moves and look at ivopts for a bit.
* Finish testing and submit upstream my other patch with DImode moves
and cases where we are splitting more than necessary.
* Start looking at some of the T2 performance work items.
* Patch review. Finish TLS patch review.
* Try to get vcvtr working and tested with eglibc.
* Look at RichardS's comments and testcase for the A8.
Meetings:
* 1-1s
* Linaro calls.
[Short week: bank holiday]
RAG:
Red:
Amber:
Green:
Current Milestones:
|| || Planned || Estimate || Actual ||
||qemu-linaro-2011-06 || 2011-06-16 || 2011-06-16 || ||
Historical Milestones:
||finish qemu-cont-integn || 2011-01-25 || 2011-01-25 || handed off ||
||first qemu-linaro release || 2011-02-08 || 2011-02-08 || 2011-02-08 ||
||qemu-linaro 2011-03 || 2011-03-08 || 2011-03-08 || 2011-03-08 ||
||qemu-linaro 2011-04 || 2011-04-21 || 2011-04-21 || 2011-04-21 ||
||qemu-linaro 2011-05 || 2011-05-19 || 2011-05-19 || n/a ||
||close out 1105 blueprints || 2011-05-28 || 2011-05-28 || 2011-05-19 ||
||complete 1111 planning || 2011-05-28 || 2011-05-28 || 2011-05-27 ||
== upstream-omap3-patches ==
* started on disentangling the patchstack: submitted patches upstream
for a few standalone fixes. First few steps in a big job...
== omap3-usb-model ==
* added QEMU's USB OHCI model to the omap3/beagle; the kernel detects
the USB controller and hub but not any attached devices; more
debugging required
== other ==
* discussions about Android emulator
* office move
* QEMU 0.15 is not too far in the future: need to make sure all the
ARM stuff we want is in it
Meetings: standup, GSoC student
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
1-5 August: Linaro sprint 1111
(maybe) 15-16 August: QEMU/KVM strand at LinuxCon NA, Vancouver
[LinuxCon proper follows on 17-19th]
== This week ==
* Spent about half of the week on auto increment/decrement. There are two
execution failures left.
* Looked at assembly comparisons between the old pass and various forms
of the new pass. The results look reasonable.
* Ran DENbench and my libav microbenchmarks to measure the difference
in performance. Saw that some tests were repeatably worse.
* Looked into those tests and realised that they were being hit by the
lack of an address writeback model in the scheduler (a known limitation).
Dependent stores were being scheduled in a block at the end of the loop
because we said that the dependencies had 0 latency.
* Spent most of the rest of the week on fixing that limitation. One of the
difficulties is that define_bypass currently requires a complete list
of instruction reservations. This is difficult for things like writeback
because the result could in principle be used by many different instructions.
Decided to generalise define_bypass so that it can handle filename-style
globs.
* Wrote a patch to model writeback in NEON.
* Wrote a patch to model writeback in core instructions. However,
while doing this, I noticed that the behaviour I'm seeing on our
Cortex-A8 doesn't match what I'd expected from GCC's A8 scheduler
description (or the documentation). Talked with Ramana about it.
Distilled a benchmark.
* These scheduler changes didn't improve the DENbench and libav
scores much by themselves, but the combination of the scheduler
and auto inc/dec changes did produce noticeable improvements
in some libav benchmarks and rather smaller improvements in
some DENbench ones.
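
To illustrate what filename-style globs buy here, the sketch below counts how many reservation names a single pattern covers, using POSIX fnmatch as a stand-in matcher. It is not the define_bypass implementation, and the reservation names used in any example are made up.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Count how many reservation names match a filename-style glob,
 * so one pattern can stand in for a complete list of names. */
size_t count_matching(const char *pattern,
                      const char *const names[], size_t n)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++)
        if (fnmatch(pattern, names[i], 0) == 0)
            hits++;
    return hits;
}
```

With hypothetical names such as "neon_load_a", "neon_store_b" and "core_alu", the pattern "neon_*" covers the first two without listing them individually.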
== Next week ==
* Finish scheduler work, in light of observed behaviour.
* More testing prior to submission.
I'm away the week of 13th June.
Richard
Hi,
- bug fixes: PRs 49222, 49199, 49239, 49093
- widening multiplication: submitted a patch to support widen-mul for
unsigned types and constants in the vectorizer's pattern recognizer.
Now considering to move optimize_widening_mul pass before loop
optimizations and improve it to support unsigned and constants
Next week: holiday on Tuesday (half day) and Wednesday.
Ira
2011/5/29 Fathi Boudra <fathi.boudra(a)linaro.org>:
> Hi,
>
> The Linaro Team is pleased to announce the release of Linaro 11.05.
>
> 11.05 is the second public release that brings together the huge amount of
> engineering effort that has occurred within Linaro over the past 6 months.
>
> This is the first release delivering Android, Ubuntu and the Working Group
> components nicely bundled into one release. We will continue to pick up more
> Working Group and Landing Team outputs in the upcoming monthly releases.
>
> We encourage everybody to use the 11.05 release. The download links for all
> images and components are available on our release page:
>
> http://wiki.linaro.org/Cycles/1105/Final
>
> Highlights of this release:
>
> * Linaro GCC 4.5, GCC 4.6 and GDB 7.2 2011.05, recently released components
I have been wondering why two versions are always released at the same
time. What kind of users are expected to use 4.5, and what kind of
users are expected to use 4.6?
My other question is whether we have a policy for maintaining old
releases. For example, if 11.05 has some bugs, is it possible for the
Linaro toolchain team to fix those bugs in the old version later? Many
users are using an old version with bugs; if they move directly to a new
version, the new features may introduce new bugs. So people may want to
use the old version with bug fixes, but without new features.
> created by the Toolchain Working Group.
> * Linaro Kernel 2011.05-2.6.38, the first source tarball release of Linux
> Linaro done by the Kernel Working Group.
> * Linaro Evaluation Builds (LEBs) for Android and Ubuntu on PandaBoard with
> 3D graphics acceleration.
> * Android cross toolchain based on latest gcc-linaro and gdb-linaro
> * Host development tools (cross compiler, image builders) readily integrated
> for the Ubuntu distribution users (Lucid, Maverick and Natty support).
> * And many more...
>
> Using the Android-based images
> ==============================
>
> The Android-based images come in three parts: system, userdata and boot.
> These need to be combined to form a complete Android install. For an
> explanation of how to do this please see:
>
> http://wiki.linaro.org/Platform/Android/ImageInstallation
>
> If you are interested in getting the source and building these images
> yourself please see the following pages:
>
> http://wiki.linaro.org/Platform/Android/GetSource
> http://wiki.linaro.org/Platform/Android/BuildSource
>
> Using the Ubuntu-based images
> =============================
>
> The Ubuntu-based images consist of two parts. The first part is a hardware
> pack, which can be found under the hwpacks directory and contains hardware
> specific packages (such as the kernel and bootloader). The second part is
> the rootfs, which is combined with the hardware pack to create a complete
> image. For more information on how to create an image please see:
>
> http://wiki.linaro.org/Platform/DevPlatform/Ubuntu/ImageInstallation
>
> Getting involved
> ================
>
> More information on Linaro can be found on our websites:
>
> * Homepage: http://www.linaro.org
> * Wiki: http://wiki.linaro.org
>
> Also subscribe to the important Linaro mailing lists and join our IRC
> channels to stay on top of Linaro developments:
>
> * Announcements:
> http://lists.linaro.org/mailman/listinfo/linaro-announce
> * Development:
> http://lists.linaro.org/mailman/listinfo/linaro-dev
> * IRC:
> #linaro on irc.linaro.org or irc.freenode.net
> #linaro-android irc.linaro.org or irc.freenode.net
>
> Known issues with this release
> ==============================
>
> For any errata issues, please see:
>
> http://wiki.linaro.org/Cycles/1105/Final#Known_Issues
>
> Bug reports for this release should be filed in Launchpad against the
> individual packages that are affected. If a suitable package cannot be
> identified, feel free to assign them to:
>
> http://www.launchpad.net/linaro
>
> Cheers,
>
> Fathi Boudra
> --
> Linaro Release Manager | Platform Project Manager
>
> _______________________________________________
> linaro-announce mailing list
> linaro-announce(a)lists.linaro.org
> http://lists.linaro.org/mailman/listinfo/linaro-announce
>
Hi,
* finished the measuring of the overhead of the ARM specific unwind tables
https://wiki.linaro.org/KenWerner/Sandbox/libunwind#overhead_of_the_ARM_spe…
* started to get an environment up and running in order to build the
linaro-android sources
* encountered some build issues (I'm in the process of sorting out some
issues with pfalcon of the android team)
* finished 11.11 cycle planning
* I'll be out of office for the rest of the week (public holiday +
vacation)
Regards
Ken
Progress:
* Some trouble building the SPEC2k tools in the new multiarch world in
natty. Perl refuses to link libm and a number of other things
also end up failing. This appears to be a real pain with the new multiarch
world and SPEC2k's curious build system for its tools. Will fall back
to an older chroot and get the tools built natively.
* Tried breaking down the T2 performance blueprint - initial breakdown
now available.
* Looked at the binutils vmov.i64 issue again. Looks like natty-updates
will now pick this up.
* Some patch review and bugzilla maintenance.
Plans:
* Get SPEC tools building.
* Look more at the T2 performance blueprint
* Spend some time on the VFP moves and look at ivopts for a bit.
* Merge review duty.
Meetings:
* 1-1s
* Linaro calls.
== Last week ==
* Investigated the CoreMark numbers posted by Michael Hope, mainly the
oddities of a significant Linaro 4.6 regression versus FSF 4.6. Later
verified to be a false alarm.
* Pushed a merge of some of my upstream CoreMark patches to Linaro 4.6.
* Did archeology for PR42017. Traced some history of the ARM prologue
from 2000 to 2007 (DF branch), posted upstream. Hope this clarification
gets my patch an approval soon.
* Tried the above PR42017 patch (which is supposed to release the use of
LR as a general register in leaf functions) on CoreMark, using Linaro
4.6, and was surprised to find that despite many reductions in spill
code and epilogue (now more often directly return by ldmfd), the
generated code still regresses in performance (!).
* Continuing above, suspecting something from experience (cough) added
-falign-functions=8 to the CoreMark compile options. Finally produced a
small improvement, while causing a regression for the
without-PR42017-patch case (victory?).
* Worked on PR48808, PR48792 over the weekend, which are cases where
paradoxical subregs caused ICE in reload. Posted an ARM backend patch
upstream, though now mostly taken over by Richard Sandiford :)
== This week ==
* Some other PRs, ideas, still work in progress.
* Started using the porter boards, will try to get LP:689887 over with
this week.
* Set-up SPEC2006 profile runs on PowerPC with trunk.
* Looked at SPEC2006's 462.libquantum.
* PR745743 - compared different versions mentioned in the PR.
* Wrote a patch to fix another issue related to how SMS handles debug_insn.
== String routines ==
* Finally finished the ltrace analysis of the whole of SPEC 2k6 and
have written it up - I'll proof read it next week and then send it out
to the benchmark list.
* Ran memset and memcpy benchmarks of larger than cache sizes on A9
* memcpy on larger than cache sizes (or probably mainly cache miss
data) does come back to Neon winning over ARM; my suspicion is that
with cache hits we run out of bandwidth on Neon, but that doesn't
happen in the cache miss case; why it's faster in that case I'm not
sure yet.
* memset is still not faster for Neon even on large sizes where
the destination isn't in the cache.
== Other ==
* Started looking at 64 bit atomics
* Looking at the pot of QEMU work with Peter.
Dave
Hi,
* the overhead of the ARM specific unwind tables for some binaries:
https://wiki.linaro.org/KenWerner/Sandbox/libunwind#overhead_of_the_ARM_spe…
* sometimes the size of the .text section differs which worries me a
bit (not necessarily a GCC issue, could be related to the build system)
* tested a couple of linaro-android images on my panda board
* ran into a l-i-t issue (now fixed) and discussed with asac and friends
* and finally got the network up and running :)
* some 11.11 cycle planning (libunwind work items, "in distributions"
spec)
Regards
Ken
RAG:
Red:
Amber:
Green: 1111 QEMU planning complete
Current Milestones:
| Planned | Estimate | Actual |
complete 1111 planning | 2011-05-28 | 2011-05-28 | 2011-05-27 |
qemu-linaro-2011-06 | 2011-06-16 | 2011-06-16 | |
Historical Milestones:
finish qemu-cont-integration | 2011-01-25 | 2011-01-25 | handed off |
first qemu-linaro release | 2011-02-08 | 2011-02-08 | 2011-02-08 |
qemu-linaro 2011-03 | 2011-03-08 | 2011-03-08 | 2011-03-08 |
qemu-linaro 2011-04 | 2011-04-21 | 2011-04-21 | 2011-04-21 |
qemu-linaro 2011-05 | 2011-05-19 | 2011-05-19 | n/a |
close out 1105 blueprints | 2011-05-28 | 2011-05-28 | 2011-05-19 |
== other ==
* Completed planning work for 1111; all blueprints now created, fleshed
out with work items and assigned:
https://blueprints.launchpad.net/qemu-linaro
[Note that as expected some items under consideration have not made
the list; this includes the trustzone work]
* Some interesting upstream QEMU discussions (list and IRC) on
(a) performance improvements [good to see general interest in
this] and (b) overhauling the memory API [very long thread
but I think the proposed API should be OK for ARM system emulation
purposes]
* LP:768650: QEMU warnings on recent Linaro OMAP3 kernels: tracked down
to the kernel deliberately reading a register it knows doesn't exist
on OMAP2/3. Sent a query via Arnd about whether we can get this changed.
* rebased linaro-qemu to current master
* Sent patchset which starts ARM QEMU moving towards getting rid of the
implicit global CPUState pointer
* sent patch fixing a configure bug causing it to create recursive
symlinks
* sent a patchset which tightens up the compile time TCG value type
checking; this would have detected the build-breaking patch I sent
earlier this week...
* sent patch adding support for active-low interrupts to the LAN9118
model; this is needed when it is used in the Overo OMAP3 board model
Meetings: toolchain, standup, GSoC student, doughnuts
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
1-5 August: Linaro sprint 1111
(maybe) 15-16 August: QEMU/KVM strand at LinuxCon NA, Vancouver
[LinuxCon proper follows on 17-19th]
Hi,
* PR 49087 - fixed
* PR 49038 - opened by Richard - fixed on 4.7, to be backported to 4.5 and 4.6
* working on widening multiplication for unsigned types and constants
(the signed case works fine)
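
For illustration, the kind of source loop this work targets might look like the following. This is only a sketch: the function names and the constant 300 are hypothetical, and whether a given GCC version vectorises either loop with a widening multiply depends on the compiler and target.

```c
#include <stddef.h>
#include <stdint.h>

/* Unsigned 16-bit inputs, 32-bit results: each product is computed in
   the wider type, which is the widening-multiply idiom. */
void widen_mul_u16(uint32_t *out, const uint16_t *a, const uint16_t *b,
                   size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (uint32_t)a[i] * (uint32_t)b[i];
}

/* The same idiom with one constant operand (300 is an arbitrary
   example value). */
void widen_mul_u16_const(uint32_t *out, const uint16_t *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (uint32_t)a[i] * 300u;
}
```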
Ira
Posted a new patch for 16 -> 64 bit multiply and accumulate:
http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg05794.html
Pushed the same patch to a Launchpad branch for testing.
Pinged my addw/subw patch as a review didn't seem forthcoming.
Worked on a canonical form for HImode to DImode multiply-and-accumulate.
The problem isn't too hard to fix, but it's hard to do it in a nice way.
Attended Nathan S's reorg call. Followed up by talking to Nathan F about
what he's been working on with Wind River. Read up on the Wiki.
Looked at why the ARM smlal{tb,bt,tt} instructions are not generated.
I've added the proper patterns, but combine doesn't match them, and I've
run out of time this week to check why.
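
As a rough illustration, the source-level operations in question can be written in portable C as below. The function names are hypothetical, and which instructions a compiler actually selects for them is not guaranteed.

```c
#include <stdint.h>

/* 16 x 16 -> 64-bit multiply-and-accumulate: both operands are widened
   before the multiply, so the product itself cannot overflow. */
int64_t mac16(int64_t acc, int16_t a, int16_t b)
{
    return acc + (int64_t)a * (int64_t)b;
}

/* The same operation on packed halfwords: the top half of x times the
   bottom half of y. The conversions to int16_t assume the usual
   two's-complement wrap-around (implementation-defined in ISO C,
   documented behaviour in GCC). */
int64_t mac16_tb(int64_t acc, uint32_t x, uint32_t y)
{
    int16_t top = (int16_t)(x >> 16);
    int16_t bottom = (int16_t)(y & 0xffffu);
    return mac16(acc, top, bottom);
}
```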
----
Upstream patches requiring review:
* NEON scheduling patch
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html
* ARM Thumb2 addw/subw support.
http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg03783.html
* Multiply and accumulate:
http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg05794.html
== Last week ==
* Took Monday off, flew back to Taiwan on Tues., got home Wed. night.
* LP:689887, ICE in get_arm_condition_code(). Finally have some new
progress on this. Found my code was rejecting DImode comparisons,
causing uses of __aeabi_lcmp, etc. in expanded RTL. While this still
does not fully explain a bootstrap fail, it may be related, and it's
good I found this here rather than scratching my head over performance
regressions later... :)
* LP:771903: invalid ubfx asm produced by GCC. Mostly got down to the
bottom of this. This bug is rather well hidden, first avoided due to
some inlining heuristic changes after FSF 4.5 was branched (hence 4.6
and trunk don't show it on the testcase), then hidden again later by
-ftree-bit-ccp. Was able to reproduce on mainline trunk after some
changes to testcase and options. Will send patch later.
* Talked with Ramana on IRC and mail about the '+' constraint modifiers
in the VFP fmul/fdiv patterns. Mostly concluded that these are typos,
and should be fixed.
== This week ==
* Continue with issues.
Hi there. The next two weeks are when we take the technical topics
from the TSC and the discussions had during the summit and turn them
into the concrete engineering blueprints for this cycle. I've created
a page at:
https://wiki.linaro.org/MichaelHope/Sandbox/1111Blueprints
listing all of the TRs. Could you please have a look through these,
find any with your name on them, and fill in the wiki page. I've put
more notes on the page itself. Some of the topics may warrant
specifications.
Let me know if you have questions on what the topics actually mean.
-- Michael
* Profiling SPEC 2k6 still; about 3/4 of the latrace files are
generated but it's taking some hand holding with some of them
(e.g. finding one that makes millions of calls to a library function
that we're not interested in but that generates a huge log, and hence
needs to be excluded).
* Working through the ones that I have with analysis scripts and
writing the interesting things up.
* Submitted ARM test suite fix for latrace (unsigned characterism)
* Verified Richard's binutils fix in natty-proposed fixed the vtk FTBFS
* Blueprint for 64bit sync primitives.
Dave
Hi,
* started to measure the overhead of -funwind-tables
* libunwind text size increase < 5%
* firefox4 is still building... :)
* found a small glitch when cross compiling the binutils deb package
* made a small patch, talked with doko, fix upstream
* installed android on the pandaboard
https://wiki.linaro.org/KenWerner/Sandbox/AndroidOnPanda
* set up an android development environment on my thinkpad
Regards
Ken
RAG:
Red:
Amber:
Green: 1105 work item status 100% complete
Current Milestones:
| Planned | Estimate | Actual |
qemu-linaro 2011-05 | 2011-05-19 | 2011-05-19 | n/a |
close out 1105 blueprints | 2011-05-28 | 2011-05-28 | 2011-05-19 |
complete 1111 planning | 2011-05-28 | 2011-05-28 | |
Historical Milestones:
finish qemu-cont-integration | 2011-01-25 | 2011-01-25 | handed off |
first qemu-linaro release | 2011-02-08 | 2011-02-08 | 2011-02-08 |
qemu-linaro 2011-03 | 2011-03-08 | 2011-03-08 | 2011-03-08 |
qemu-linaro 2011-04 | 2011-04-21 | 2011-04-21 | 2011-04-21 |
== merge-correctness-fixes ==
* last few work items for this blueprint either completed or postponed
[For the record, postponed work:
setting Cortex A8r2 device ID etc regs -- moved to omap3 upstreaming
trustzone -- may get its own blueprint this cycle
VCVT fp exception flags -- postponed as rather tricky and an
obscure corner case that is unlikely to be noticed by users]
== other ==
* tracked down bug with QEMU loading of Google Go produced ELF files,
submitted patch
* talked to our local trustzone expert, very useful
* reworked and resent FPSCR exception flags patches based on review
comments
* reviewed a patch for setting IFSR right for BKPT
* more planning effort
* sent patch to suppress SD card model warnings generated when Linux
probes to see if it's an SDIO card
* redid the "check for unused -nic options" patch as it turned out to
cause regressions with NICs created via -device.
Meetings: toolchain, standup, 1-2-1
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
(maybe) 15-16 August: QEMU/KVM strand at LinuxCon NA, Vancouver
[LinuxCon proper follows on 17-19th]
== This week ==
* Spent almost all the week on GCC's auto inc/dec pass. I first
continued with the incremental "clean ups" and recoding that I'd
started during free time at Budapest, with the idea of bolting the new
optimisations on top of that. However, in the end, I decided it would
be better to rewrite the pass entirely, using a different approach.
I've now got an early prototype of that rewrite, and it seems to be
working as expected on the test cases I've tried so far. I'm running
a regression test over the weekend, although TBH, I expect it to fail
at this stage.
* Tested the fix for vzip, vuzp and vtrn. Went well, so I'll submit
next week.
* Blueprints.
== Next week ==
* More auto inc/dec:
* Round off some known rough edges in the prototype.
* Fix bugs.
* Run benchmarks.
* Run code comparison tests (diffing assembly code), both on ARM and
on other targets of interest.
Richard
The Linaro Toolchain Working Group is pleased to announce the release
of Linaro GDB 7.2.
Linaro GDB 7.2 2011.05-0 is the sixth release in the 7.2 series. Based
off the latest GDB 7.2, it includes a number of ARM-focused bug fixes.
This release fixes:
* LP: #615972 Neon registers missing in core files
* LP: #615978 Failure to software single-step into signal handler
* LP: #615996 gdb.cp/templates.exp failures
The source tarball is available at:
https://launchpad.net/gdb-linaro/+milestone/7.2-2011.05-0
More information on Linaro GDB is available at:
https://launchpad.net/gdb-linaro
-- Michael
Can somebody please explain how development happens regarding qemu-linaro ?
I've taken a look here [0] and if I'm not mistaken, there's no code in the
repo. I can see a lot of blueprints, but I don't understand how work is
being done regarding those blueprints or when it will be done. Oh, and what
exactly is the 'qemu-linaro' tarball in the repo ?
I'm not sure how newbie this question is, but please bear with me. :D
Thanks in advance.
[0] https://launchpad.net/qemu-linaro
--
Karim Allah Ahmed.
LinkedIn <http://eg.linkedin.com/pub/karim-allah-ahmed/13/829/550/>
Hello,
* Sent 5 SMS related patches for review upstream.
* Backported two SMS patches from mainline to gcc-linaro and
gcc-linaro/4.6 (fixes for unfreed memory)
Thanks,
Revital
Hi,
* committed a patch that supports reductions in SLP (upstream)
* continued analyzing benchmarks: ffmpeg, EEMBC telecom, office, networking
* started to look into implementation of reverse accesses for Neon
* blueprints
Ira
The Linaro Toolchain Working Group is pleased to announce the release
of both Linaro GCC 4.5 and Linaro GCC 4.6.
Linaro GCC 4.5 2011.05 is the tenth release in the 4.5 series. Based
off the latest GCC 4.5.3+svn173417, it adds new optimisations, much
improved support for strided load/stores, and fixes for many of the
issues found in the last month.
Interesting changes in 4.5 include:
* Updates to 4.5.3+r173417
* Performance improvements in NEON strided loads and stores
* Performance improvements targeted at EEMBC CoreMark
* Precompiled header support on recent Linux kernels
Fixes:
* LP: #660156: Heap randomisation causes PCH testsuite failures
* LP: #784375: vset_lane_u8 intrinsic generates wrong lane number
* LP: #759409: Profiled bootstrap fails in FSF GCC 4.5
* LP: #723086: Test regressions in the Fortran test suite
The strided load/store improvements allow both NEON intrinsics and the
vectoriser to efficiently access values that occur at every n'th
address, such as all of the red values in an RGB image or all of the
left channel samples in an interleaved audio array. Previous versions
of GCC would unpack the values onto the stack instead of using the
registers directly.
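
As a hedged illustration, loops of the following shape are the sort of strided accesses described above. The function names are hypothetical, and whether a particular compiler version vectorises them depends on the options and target in use.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy every third byte (the red channel) out of packed RGB data:
   a load with a stride of three elements. */
void extract_red(uint8_t *red, const uint8_t *rgb, size_t pixels)
{
    for (size_t i = 0; i < pixels; i++)
        red[i] = rgb[3 * i];
}

/* Split interleaved stereo samples (L, R, L, R, ...) into two arrays:
   two loads with a stride of two elements. */
void split_stereo(int16_t *left, int16_t *right, const int16_t *lr,
                  size_t frames)
{
    for (size_t i = 0; i < frames; i++) {
        left[i] = lr[2 * i];
        right[i] = lr[2 * i + 1];
    }
}
```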
The CoreMark improvements improve the code generation for the hot
functions in the benchmark. This release is now on par with Linaro GCC
4.4 and significantly ahead of other FSF or Linaro 4.5 based
compilers. This fixes the long-standing problems of ARMv5 being
faster than ARMv7 and 4.4 based compilers being faster than 4.5 based
ones.
Linaro GCC 4.6 is the third release in the 4.6 series. Based off the
latest GCC 4.6.0+svn173480, it adds new optimisations, vectoriser
improvements, and continues with the merge of many ARM-focused
changes.
Interesting changes include:
* Updates to 4.6.0+r173417
* Brings forward more of the performance improvements from Linaro GCC 4.5
* Adds support for swing-modulo scheduling
* Fixes precompiled header support on recent Linux kernels
* Changes the default NEON vector size to quads
* Adds auto-detection of the best vector size
* Adds vectorisation improvements due to better if-conversion
Fixes:
* LP: #714921: Uses an unreasonable amount of memory to compile QEMU on armel
* LP: #723086: Test regressions in the Fortran test suite
The source tarballs are available from:
https://launchpad.net/gcc-linaro/+milestone/4.5-2011.05-0
https://launchpad.net/gcc-linaro/+milestone/4.6-2011.05-0
Downloads are available from the Linaro GCC page on Launchpad:
https://launchpad.net/gcc-linaro
Mailing list: http://lists.linaro.org/mailman/listinfo/linaro-toolchain
Bugs: https://bugs.launchpad.net/gcc-linaro/
Questions? https://ask.linaro.org/
Interested in commercial support? inquire at support(a)linaro.org
-- Michael
Hi All,
This is based upon gcc version 4.5.3 (20110221 pre-release)
Any help appreciated
This shows a bug in the Linaro gcc compiler with the Arm NEON
vset_lane intrinsic
Note in the objdump that the vmov.8 instruction that places the
value in the vector for the non-q version uses 1 where it should use
2 and 3:
18: ee410bb0 vmov.8 d17[1], r0
1c: ee420bb0 vmov.8 d18[1], r0
20: ee400b90 vmov.8 d16[0], r0
3c: ee440bb0 vmov.8 d20[1], r0
For the q version the vmov.8 instructions are correct:
40: ee420bf0 vmov.8 d18[3], r0
54: ee420bd0 vmov.8 d18[2], r0
64: ee400b90 vmov.8 d16[0], r0
70: ee420bb0 vmov.8 d18[1], r0
/* Source code */
#include <arm_neon.h>
static uint8x8_t vec[5];
static uint8x16_t qvec[5];
void set(uint8_t value)
{
vec[1] = vset_lane_u8(value, vec[0], 3);
vec[2] = vset_lane_u8(value, vec[0], 2);
vec[3] = vset_lane_u8(value, vec[0], 1);
vec[4] = vset_lane_u8(value, vec[0], 0);
qvec[1] = vsetq_lane_u8(value, qvec[0], 3);
qvec[2] = vsetq_lane_u8(value, qvec[0], 2);
qvec[3] = vsetq_lane_u8(value, qvec[0], 1);
qvec[4] = vsetq_lane_u8(value, qvec[0], 0);
}
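
For reference, the expected behaviour of the non-q intrinsic can be modelled in portable C as below. This is only a scalar sketch of the semantics: `u8x8_model` and `set_lane_u8_model` are hypothetical names, not part of arm_neon.h.

```c
#include <stdint.h>

/* Scalar stand-in for an 8-lane vector of bytes. */
typedef struct {
    uint8_t lane[8];
} u8x8_model;

/* Return a copy of vec with the requested lane replaced by value.
   The lane index must be in the range 0..7; all other lanes are
   left unchanged. */
u8x8_model set_lane_u8_model(uint8_t value, u8x8_model vec, int lane)
{
    vec.lane[lane] = value;
    return vec;
}
```

With this model, setting lane 3 changes only element 3 of the result, which is what the `d17[3]` form of the vmov.8 instruction would be expected to do.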
Thx
Lee
Hi there. The 2011.05 release has been spun and is testing up well.
The 4.5 and 4.6 branches are now open so feel free to commit any
approved patches.
-- Michael
Progress:
* Attended LDS from 9th-14th May.
Plans:
* Look at Thumb2 performance blueprint and break it down.
* Investigate more headroom for SPEC2k starting this week.
* Thumb2 performance call this week.
Meetings:
* 1-1s
* T2 performance.
Hello,
- Attended Linaro@UDS.
- SMS patches to support ARM do-loop pattern got approved in mainline
and merged into gcc-linaro 4.6 and 4.5.
- Sent merge request for two patches in trunk. (SMS_fixes_for_unfreed_memory)
- Implemented an optimization for the stage-count and now testing it.
Thanks,
Revital
== Last week ==
* At Linaro@UDS; I am still typing this in Budapest. Sparingly did some
work between sessions.
* PR42017, ARM LR register not being used. Discussed the patch with
Richard Sandiford at LDS. Re-tested a bit and about to resend a revised
patch according to his suggestion.
* LP:748138, redirect_jump() ICE. Committed patch to CS stable and
trunk. Submitted merge request to Linaro 4.5 branch.
* LP:689887. Got some suggestions from Revital on how to debug the
bootstrap failure caused by my patch, will look into applying it.
== This week ==
* Taking Monday off, I'll be flying back to Taiwan on Tuesday.
* Continue with issues after getting home.
RAG:
Red:
Amber:
Green: 1105 work item status 99% complete with 2 weeks to go
Current Milestones:
| Planned | Estimate | Actual |
qemu-linaro 2011-05 | 2011-05-19 | 2011-05-19 | n/a |
close out 1105 blueprints | 2011-05-28 | 2011-05-28 | |
complete 1111 planning | 2011-05-28 | 2011-05-28 | |
Historical Milestones:
finish qemu-cont-integration | 2011-01-25 | 2011-01-25 | handed off |
first qemu-linaro release | 2011-02-08 | 2011-02-08 | 2011-02-08 |
qemu-linaro 2011-03 | 2011-03-08 | 2011-03-08 | 2011-03-08 |
qemu-linaro 2011-04 | 2011-04-21 | 2011-04-21 | 2011-04-21 |
== merge-correctness-fixes ==
* some of my pending patches have been applied; a number of others are
still under discussion or need further work/testing
== other ==
* We won't be making a qemu-linaro 2011-05 release, since there are no
changes since the 2011-04 release (due to a combination of the Easter
holiday and UDS week).
* Attended UDS
* almost all 1105 work items either complete or confirmed postponed
to next cycle
* Good progress on fleshing out blueprints for next cycle:
https://wiki.linaro.org/PeterMaydell/Qemu1111
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
(maybe) 15-16 August: QEMU/KVM strand at LinuxCon NA, Vancouver
[LinuxCon proper follows on 17-19th]
Last week, Ramana pointed me at an upstream bug report about the
inefficient code that GCC generates for vzip, vuzp and vtrn:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48941
It was filed not long after the Neon seminar at the summit;
I'm not sure whether that was a coincidence or not.
I attached a patch to the bug last week and will test it this week.
However, a cut-down version shows up another problem that isn't related
specifically to intrinsics. Given:
#include <arm_neon.h>
void foo (float32x4x2_t *__restrict dst, float32x4_t *__restrict src, int n)
{
while (n--)
{
dst[0] = vzipq_f32 (src[0], src[1]);
dst[1] = vzipq_f32 (src[2], src[3]);
dst += 2;
src += 4;
}
}
GCC produces:
cmp r2, #0
bxeq lr
.L3:
vldmia r1, {d16-d17}
vldr d18, [r1, #16]
vldr d19, [r1, #24]
vldr d20, [r1, #32]
vldr d21, [r1, #40]
vldr d22, [r1, #48]
vldr d23, [r1, #56]
add r3, r0, #32
vzip.32 q8, q9
vzip.32 q10, q11
subs r2, r2, #1
vstmia r0, {d16-d19}
add r1, r1, #64
vstmia r3, {d20-d23}
add r0, r0, #64
bne .L3
bx lr
We're missing many auto-increment opportunities here. I think this
is due to the limitations of GCC's auto-inc-dec pass rather than to
a problem in the ARM port itself. I think there are two main areas
for improvement:
- The pass only tries to use auto-incs in cases where there is a
separate addition and memory access. It doesn't try to handle
cases where there are two consecutive memory accesses of the
form *base and *(base + size), even if the address costs make
it clear that post-increments would be a win.
- The pass uses a backward scan rather than a forward scan,
which makes it harder to spot chains of more than two accesses.
FWIW, I've got fairly specific ideas about how to do this.
Unfortunately, the pass is in need of some TLC before it's
easy to make changes. So in terms of work items, how about:
1. Clean up the auto-inc pass so that it's easier to modify
2. Investigate improvements to the pass
3. Submit the changes upstream
4. Backport the changes to the Linaro branches
I wrote some patches for (1) last week.
I'd estimate it's about 2 weeks' work for (1) and (2). (3) and (4)
would hopefully be background tasks. The aim would be for something
like:
.L3:
vldmia r1!, {d16-d17}
vldmia r1!, {d18-d19}
vldmia r1!, {d20-d21}
vldmia r1!, {d22-d23}
vzip.32 q8, q9
vzip.32 q10, q11
subs r2, r2, #1
vstmia r0!, {d16-d19}
vstmia r0!, {d20-d23}
bne .L3
bx lr
This should help with auto-vectorised code, as well as normal core code.
(Combining the vldmias and vstmias is a different topic. The fact that
this particular example could be implemented using one load and one
store is to some extent coincidental.)
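
For normal core code, the access pattern from the two points above can be sketched in plain C. The function names are hypothetical, and the code a given compiler emits for either form is not guaranteed; the sketch only shows that both forms describe the same accesses.

```c
#include <stddef.h>

/* Accesses of the form *base, *(base + 1), ... followed by a single
   pointer bump: the shape that the current pass does not convert. */
long sum_offsets(const long *p, size_t blocks)
{
    long total = 0;
    while (blocks--) {
        total += p[0] + p[1] + p[2] + p[3];
        p += 4;
    }
    return total;
}

/* The same accesses written as a chain of post-increments, which is
   what an improved pass would aim to produce from the form above. */
long sum_postinc(const long *p, size_t blocks)
{
    long total = 0;
    while (blocks--) {
        total += *p++;
        total += *p++;
        total += *p++;
        total += *p++;
    }
    return total;
}
```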
Richard