* Linaro
Continued looking at the lower-subreg problem (that the lowering is only
really intended for 32-bit targets that don't have 64-bit operations).
Without NEON this is only really a problem for "ldrd/strd" 64-bit loads
and stores: the patterns are all (or should be all) written such that
they only access DImode regs via SUBREG, so it all works well. When
adding NEON 64-bit this becomes less clear: many operations must remain
in DImode without SUBREG until after register allocation. The
lower-subreg passes mostly cope with this well enough, but they have some
features that attempt to lower zero_extends, certain shifts, and
pseudo-reg copies, unconditionally. I've been investigating what happens
if I disable the pseudo-reg copy "optimization" in the first
lower-subreg pass. As I expected, in many cases it actually leads to
smaller code (more use of LDRD/STRD) without NEON, and even better
results with NEON. Unfortunately I have found some counter-examples, so
I'm trying to figure out what's going on there: for some reason reload
goes crazy and starts spilling things to the stack, even though I can't
see that more registers are required. More investigation required.
Richard E approved my NEON-immediates patch for upstream. I don't have
time to commit it with proper care this week, so that's delayed to next
week.
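For readers unfamiliar with the lower-subreg discussion above, here is a minimal sketch of the kind of C source that produces the DImode operations in question (a 64-bit copy, a zero-extend, and a shift). The function names are made up for illustration, and what the compiler actually emits depends on the target options and GCC version.

```c
#include <stdint.h>

/* A plain 64-bit copy through memory. On 32-bit ARM this is a candidate
   for LDRD/STRD if the value is kept together as one DImode unit. */
void copy64(uint64_t *dst, const uint64_t *src)
{
    *dst = *src;
}

/* Zero-extension from 32 to 64 bits: the high word is known to be zero. */
uint64_t zext32(uint32_t x)
{
    return (uint64_t)x;
}

/* A shift by exactly one word: the low word moves into the high word. */
uint64_t shift_left_32(uint64_t x)
{
    return x << 32;
}
```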
* Other
Vacation Monday and Tuesday. Had a fun long weekend in Cornwall with my
family.
Progress
Away last week - nothing to report. Will be in BST + 4:30 timezone
this week.
Plans
* Finish off the VFP addressing modes patch.
* Follow up on iterations idiom patch upstream
* Pursue backporting gnu_unique_object upstream.
* Look at some of the existing blueprints and start discussions around
prioritizing this.
* Investigate some of the SEGVs with h-c partitioning in the future.
Hi Zhenqiang. Ubuntu Precise is now out and has switched to hard
float by default. I want to do the same for the next binaries
release. Here's the work that needs to be done:
* Bring in the new sysroot
* Change the triplet to arm-linux-gnueabihf
* Change GCC's configure so it recognises the new triplet
* Change the default float ABI to hard
We should include a soft float (not softfp) multilib libgcc for those
who use the binary toolchain to build bare metal programs like u-boot
or the kernel. They don't use floating point, but the linker will
complain about mixed calling conventions.
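As a rough way to see which calling convention a given compiler invocation uses, the sketch below checks GCC's predefined macros. __ARM_PCS_VFP is defined by GCC when the hard-float (VFP) variant of the procedure call standard is in effect; the function name is only an illustration.

```c
/* Returns a short description of the float ABI this file was compiled for. */
const char *float_abi_name(void)
{
#if defined(__ARM_PCS_VFP)
    return "hard";            /* VFP registers used for FP arguments */
#elif defined(__arm__)
    return "soft or softfp";  /* FP arguments passed in core registers */
#else
    return "not an ARM target";
#endif
}
```

Compiling this with each multilib's options and printing the result is one quick check that the default ABI really changed.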
I've updated make-sysroot.sh and spun an experimental sysroot at
http://people.linaro.org/~michaelh/incoming/precise-sysroot-armhf-r0.tar.bz2.
Hopefully we'll use Marcin's ones instead.
Matthias has patches for many of these changes. Let's talk about it
at tonight's meeting.
Could you start on these? I'd like the changes done within two weeks
so we have plenty of time to test.
-- Michael
Summary:
* Linaro binary toolchain 2012.04 release.
* Code size benchmark analysis.
Details:
1. Validation and bug fixes for the Linaro binary toolchain 2012.04 release.
2. Investigate code size regressions in 4.7
Found more regression cases due to loop invariant hoisting; tests
show we can reduce code size somewhat with the option
-fno-move-loop-invariants. Some regressions are due to
pass_reorder_blocks, which is disabled for -Os in 4.7. In some cases,
ivopts introduces more code, and function inlining might lead to
more spilling. We also tried a Linaro 2012.04 bare-metal build, but there
is a regression of a few bytes compared with 4.7 trunk.
3. Set up the QEMU environment following Michael’s instructions
(https://wiki.linaro.org/MichaelHope/Sandbox/QEMUCrossTest); tests
show it works.
4. Try to build gcc-linaro-2012.04 configured with "--with-fpu=neon
--with-float=hard" based on a precise sysroot
(precise-sysroot-armhf-r0.tar.bz2) from
http://people.linaro.org/~michaelh/incoming
* Linux build is OK without any change.
* The Mingw32 build reports an error. After removing the offending code, the build passes.
[ERROR] .../libc/usr/include/arm-linux-gnueabihf/sys/types.h:117:19:
error: two or more data types in declaration specifiers
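To illustrate the loop-invariant hoisting point in item 2, here is a small made-up loop with several invariant subexpressions. If each one is hoisted into its own register before the loop, register pressure inside the loop rises; whether that leads to spills depends on the target and the surrounding code, so treat this as a sketch rather than a reduced test case.

```c
/* (a + b), (c * d), (a - d) and (b ^ c) do not change inside the loop,
   so the compiler may compute them once before it. */
void scale(int *out, const int *in, int n, int a, int b, int c, int d)
{
    for (int i = 0; i < n; i++)
        out[i] = in[i] * (a + b) + (c * d) + (a - d) + (b ^ c);
}
```

Building such a file with and without -fno-move-loop-invariants at -Os and comparing the object sizes is one way to see the effect.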
Plans:
* Investigate other code size regressions in 4.7.
Planned leave:
* Labor Day holiday: April 30 and May 1.
Best regards!
-Zhenqiang
Hi,
OpenEmbedded-Core/meta-linaro:
* pushed support for Linaro GCC 4.6.4 2012.04 and for the
2012.03-20120326 binary toolchain
* updated the wiki
* created a branch to support GCC 4.7
* built several images using several GCC 4.7 based toolchains (OE,
linaro 4.7.1, binary toolchain 2012-04)
* minimal images are working
* sato image shows splash screen forever (won't bring up the
matchbox window manager)
* only when using the binary toolchain 2012-04 ? -> needs to be
analyzed
* Qt 4.8 requires small patches (posted and merged into
oe-core/master-next)
Libunwind:
* User reports issues when backtracing through signal frame
* reason: the lib falls back on APCS frame parsing which usually
segfaults on EABI systems
* provided debugging hints and insights on how libunwind handles
signal frames
* posted a reduced testcase
Regards,
Ken
== GCC ==
* Fixed regression (ICE when building EEMBC) with patch to use vld1/vst1
instead of vldm/vstm for vector moves. Updated merge request and
restarted testing.
* Implemented patch to fix LP #959242; backported to Linaro GCC 4.7 and
created merge request for regression testing and benchmarking.
* Ongoing discussions on -fsched-pressure; worked on patch to enable it
by default on s390.
* Ongoing work on improving end-of-loop value computation.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
Current Milestones:
|| || Planned || Estimate || Actual ||
||cp15-rework || 2012-01-06 || 2012-06-23 || ||
Historical Milestones:
||initial-a15-system-model || 2012-01-27 || 2012-01-27 || 2012-01-17 ||
||qemu-kvm-getting-started || 2012-03-04?|| 2012-03-04 || 2012-02-01 ||
== cp15-rework ==
* have had no review feedback on this patchset yet, and we're now
into QEMU 1.1 hardfreeze. I'm hoping to get review during the
month we're in freeze and then commit early June when 1.2 opens.
However since that coincides with me being on holiday I've set the
Estimate date to allow for that plus a week or so of buffer.
Actual further work required is probably 1 week max, most of this
is waiting-for-other-people or being-away.
== kvm-boot-wrapper ==
* pushed dtb support changes to git repo
== other ==
* basic QEMU side support for the VGIC kernel implementation
Marc Z has been working on. I have something that seems to
work but it's still a bit prototype and missing features.
* investigated why my guest kernel was causing kvm to exit:
turns out that we haven't implemented the kernel support for
gracefully not using KVM if not booted in hyp mode, so trying
to boot a KVM-aware kernel as a KVM guest doesn't work yet.
* sent off the final few patches which I want in QEMU 1.1 before
hardfreeze at the start of next week
* tested a beagle board linaro snapshot image (it panics on
bootup, LP:989737)
* trying to clarify the KVM todo list...
-- PMM
Greetings,
I successfully built and booted Linux 3.1 for the beaglebone (TI am335x) using the 4.5.2 toolchain. I rebuilt the same kernel using the same config with the 4.6.3 toolchain, but the board hangs at "Uncompressing Linux... done, booting the kernel."
I halt the beaglebone at the U-Boot prompt and have it download and run uImage from an NFS mounted RFS.
The Linaro 4.5.2 toolchain was installed in my Ubuntu 11.04 distro using aptitude.
The Linaro 4.6.3 toolchain binaries, downloaded via Launchpad, were installed into my own tools directory.
4.5.2 INFO:
./gcc-linaro-arm-linux-gnueabi-2012.01-20120125_linux/bin/arm-linux-gnueabi-gcc --version arm-linux-gnueabi-gcc (crosstool-NG linaro-1.13.1-2012.01-20120125 - Linaro GCC 2012.01) 4.6.3 20120105 (prerelease) Copyright (C) 2011 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Built with: $ make ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- uImage
4.6.3 INFO:
./bin/arm-linux-gnueabi-gcc --version arm-linux-gnueabi-gcc (crosstool-NG linaro-1.13.1-2012.01-20120125 - Linaro GCC 2012.01) 4.6.3 20120105 (prerelease) Copyright (C) 2011 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Built with: $ make ARCH=arm CROSS_COMPILE=<mytoolsdir>/gcc-linaro-arm-linux-gnueabi-2012.01-20120125_linux/bin/arm-linux-gnueabi- uImage
Could it have anything to do with my binutils and/or U-Boot version? Any debugging tips?
U-Boot INFO:
U-Boot# version
U-Boot 2011.09-00010-g81c8c79 (Feb 13 2012 - 14:48:03) arm-angstrom-linux-gnueabi-gcc (GCC) 4.5.4 20111126 (prerelease) GNU ld (GNU Binutils) 2.20.1.20100303
Thanks for reading this far.
We use QEMU to test programs built by the toolchain binary release for
correctness. I've written up the instructions for spinning up your
own at:
https://wiki.linaro.org/MichaelHope/Sandbox/QEMUCrossTest
It's focused on simplicity - getting a running, SSH only Cortex-A9 up
and going as soon as possible. It's not the latest, not graphical,
and doesn't replace the deeper documentation at:
https://wiki.linaro.org/Resources/HowTo/Qemu
-- Michael
The Linaro Toolchain Working Group is pleased to announce the 2012.04
release of the Linaro Toolchain Binaries, a pre-built version of
Linaro GCC and Linaro GDB that runs on generic Linux or Windows and
targets the glibc Linaro Evaluation Build.
Uses include:
* Cross compiling ARM applications from your laptop
* Remote debugging
* Build the Linux kernel for your board
What's included:
* Linaro GCC 4.7 2012.04
* Linaro GDB 7.4 2012.04
* A statically linked gdbserver
* A system root
* Manuals under share/doc/
The system root contains the basic header files and libraries to link
your programs against.
Interesting changes include:
* Switches to the new GCC 4.7 based Linaro GCC
* Adds native language support to most of the programs
* Adds the mudflap, ssp, and gomp runtime libraries
* Enables gnu_unique_object support in GCC
Please see the README about running 4.7 based programs on a system
with 4.6 based runtime libraries.
The Linux version is supported on Ubuntu 10.04.3 and 11.10, Debian
6.0.2, Fedora 16, openSUSE 12.1, Red Hat Enterprise Linux Workstation
5.7 and later, and should run on any Linux Standard Base 3.0
compatible distribution. Please see the README about running on
x86_64 hosts.
The Windows version is supported on Windows XP Pro SP3, Windows Vista
Business SP2, and Windows 7 Pro SP1.
The binaries and build scripts are available from:
https://launchpad.net/linaro-toolchain-binaries/trunk/20yy.mm
Need help? Ask a question on https://ask.linaro.org/
Already on Launchpad? Submit a bug at
https://bugs.launchpad.net/linaro-toolchain-binaries
On IRC? See us on #linaro on Freenode.
Other ways that you can contact us or get involved are listed at
https://wiki.linaro.org/GettingInvolved.
-- Michael
Hi,
GDB for Android:
* Wrote patch for bionic adding .note.ABI-tag to the crtbegin
object files. Sent to Google engineers; they think it's
going in the right direction and I will submit via gerrit.
* Isolated Android-related changes in diff between AOSP's
GDB 7.3.x and FSF GDB 7.3. There are a lot of unrelated
changes there.
* Sent e-mail asking for comments about the Android extension
to .note.ABI-tag to the LSB and binutils mailing lists.
Got only one e-mail of feedback.
--
[]'s
Thiago Jung Bauermann
Linaro Toolchain Working Group
Hi,
* catching up with emails
* rebased against current OE-core
* OE is planning a release in April (following the Yocto schedule)
* noticed the libc of our binary toolchain is lacking i18n
* caused a packaging issue for meta-linaro but easy to workaround
* contents of the i18n folder are only used at runtime (not relevant
at compile time)
* updates to support the 2012.03-20120326 binary toolchain
* locations of libgcc_s.so and libstdc++.so are different
* added support for linaro gcc 2012.04
* looked at the meta-linaro patches made by Khem
Regards,
Ken
The EEMBC supplied build system has a couple of bugs with library
order (putting -lrt at the start of the command line instead of the
end) and the harness library (depending on THOBJS but linking against
THLIB). I've fixed these and pushed to our private branch.
Ulrich, I've spawned builds of your vld1.64 branch and its ancestor.
Once those are done I'll spawn a benchmark run against them.
-- Michael
Summary:
* Code size benchmark analysis.
* Linaro binary toolchain 2012.04 release.
Details:
1. Tuning the heuristic to assign register for copies.
* Take the CONFLICT_HARD_REGS and HARD_REG_COSTS of copies into
account when conflict_costs is NULL in
update_conflict_hard_regno_costs, which handles the following case:
a = ...
...
b = a // a can be assigned with r3 or r5 which have the same min_cost.
... // b is conflicted with r3 or the cost of r3 is very high
= b
In this case, if a is assigned with r3, b can not be assigned with
r3, so the copy "b = a" can not be optimized. When taking the
CONFLICT_HARD_REGS or HARD_REG_COSTS of b into account, we can assign
a with r5.
2. Linaro binary toolchain 2012.04 release.
* Update gdb/TOOLCHAIN_PKGVERSION/README to 2012.04.
* Test workaround localization patch to fix lp:918926.
* Local build and tests show the toolchain can find the
corresponding .mo file.
* But if the host system does not have the corresponding font
packages, it will show some garbled characters.
* gdb does not have gdb.mo.
3. Investigate code size regressions in 4.7.
* Loop invariant hoisting might increase register pressure, which
leads to much more spilling.
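The tie-breaking idea in item 1 can be sketched with a toy model. This is not the IRA code: struct allocno, conflict_mask and pick_reg below are hypothetical names, and the real logic lives in update_conflict_hard_regno_costs. The sketch only shows the decision itself: among registers that are equally cheap for a, prefer one that the copy partner b can also use.

```c
#include <limits.h>

#define NUM_REGS 8

/* Hypothetical, simplified stand-in for an allocno. */
struct allocno {
    int cost[NUM_REGS];      /* cost of assigning each hard register */
    unsigned conflict_mask;  /* bit r set: hard register r is unavailable */
};

/* Pick a hard register for A. Ties on A's own cost are broken by how
   cheap the same register would be for the copy partner B. */
int pick_reg(const struct allocno *a, const struct allocno *b)
{
    int best = -1;
    int best_cost = INT_MAX;
    int best_partner = INT_MAX;

    for (int r = 0; r < NUM_REGS; r++) {
        if (a->conflict_mask & (1u << r))
            continue;  /* A cannot use r at all */

        /* A register B conflicts with is treated as maximally expensive. */
        int partner = (b->conflict_mask & (1u << r)) ? INT_MAX : b->cost[r];

        if (a->cost[r] < best_cost
            || (a->cost[r] == best_cost && partner < best_partner)) {
            best = r;
            best_cost = a->cost[r];
            best_partner = partner;
        }
    }
    return best;
}
```

With r3 and r5 equally cheap for a and b conflicting with r3, this picks r5, so the copy "b = a" can still be coalesced.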
Plans:
* Finalize Linaro binary toolchain 2012.04 release
* Investigate other code size regressions in 4.7.
Planned leave:
* Labor Day’s holiday: April 30 and May 1.
Best regards!
-Zhenqiang
Hi,
I have been using CodeSourcery tool-chains since forever, and I
decided to try Linaro's tool-chain. So far I haven't had any issues,
except one:
libgcc_s.so.1 is located in arm-linux-gnueabi/lib, which is not
available from arm-linux-gnueabi/libc. The CodeSourcery tool-chain has
all those files inside the 'libc' directory, which is useful for
Scratchbox2, because you need to specify a "target root" which is
basically the libc directory which acts in a similar fashion to /.
I've managed to create a rule to work around this issue, but I wonder
why those files are in that location rather than in 'libc/lib'.
Cheers.
--
Felipe Contreras
=== Progress ===
* Worked on the VFP addressing modes patch upstream. Handled most
comments. Final version has finished testing and looks almost ready to
commit.
* Investigated an issue with min type transformations for loop
terminating conditions. Wrote up a small patch which appears to do the
right thing - passed a bootstrap on x86 but that probably means it
never got triggered :( .
The particular case of interest was not vectorizing:
#define min(x,y) ((x) <= (y) ? (x) : (y))
int a[256] __attribute__((aligned (16)));
int b[256] __attribute__((aligned (16)));
int c[256] __attribute__((aligned (16)));
void foo (int x, int y)
{
int i;
for (i = 0;
i < x && i < y;
// i < min (x, y);
i++)
a[i] = b[i] * c[i];
}
but does vectorize with the commented-out condition. I've tentatively worked out a
fix in tree-loop-im.c which looks like a bit of a grotesque hack ....
* Attended LLVM devcon in Week 15. A useful and interesting conference.
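For reference, the two loop forms from the min-transformation item can be put side by side as below. The function names and the explicit output parameter are mine, added only so the two variants can be compared; for non-negative bounds within the array size they compute the same thing, which is why one would like both to vectorize.

```c
#define N 256
#define min(x, y) ((x) <= (y) ? (x) : (y))

int b[N], c[N];

/* Two separate exit conditions: the form that was not vectorized. */
void mul_two_bounds(int *out, int x, int y)
{
    for (int i = 0; i < x && i < y; i++)
        out[i] = b[i] * c[i];
}

/* Single exit condition against min (x, y): the form that was. */
void mul_min_bound(int *out, int x, int y)
{
    int n = min(x, y);
    for (int i = 0; i < n; i++)
        out[i] = b[i] * c[i];
}
```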
== Plans ==
* Pursue backporting gnu_unique_object upstream.
* Look at some of the existing blueprints and start discussions around
prioritizing this.
* Investigate some of the SEGVs with h-c partitioning.
* Finish off the VFP addressing modes patch.
Absences.
* 03 May 2012 - 08 May 2012.
* Linaro Connect Q2.12 - May 28 - June 1 -
== GCC ==
* Checked in backport patch to fix LP #972648
into Linaro GCC 4.6.
* Checked in fix for incorrect vld alignment hints to
FSF mainline and 4.7 branch.
* Investigated options to fix stack re-alignment.
* Ongoing investigation of LP #959242: design problem in vectorizer
pattern detection logic causes ICE in certain cases where an
original sequence is recognized as part of *two* potential patterns
simultaneously.
* Ongoing work on improving end-of-loop value computation.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
Current Milestones:
|| || Planned || Estimate || Actual ||
||cp15-rework || 2012-01-06 || 2012-04-10 || ||
Historical Milestones:
||initial-a15-system-model || 2012-01-27 || 2012-01-27 || 2012-01-17 ||
||qemu-kvm-getting-started || 2012-03-04?|| 2012-03-04?|| 2012-02-01 ||
== cp15-rework ==
* sent out v1 patchseries for cp15 cleanup to the mailing list, finally
* handled review comments on drop-cpu-reset-model-id patchseries
== kvm-boot-wrapper ==
* imported libfdt device tree library and Dave Martin's code from
the big.LITTLE boot-wrapper, to add support to the KVM boot wrapper
for handling device tree blobs. Patch series sent, will probably
commit next week unless there are review issues
* that will more or less wrap this blueprint up
== other ==
* usual upstream patch review; QEMU 1.1 soft feature freeze was start
of this week, hardfreeze will be beginning of May, people (including
me :-)) are trying to squeeze things in under the wire...
-- PMM
* Linaro GCC
Investigated the latest test failure for the neon-extend patch. The test
for GCC Bugzilla 43137 is failing again. It turns out to be because my
switching sign_extend to use a DImode output, rather than two SImode
subreg outputs, has exposed a bug in the lower-subreg pass. I've spent
most of the week trying to figure out what can be done about this and
discussing the problem with Richard Sandiford upstream.
Richard Earnshaw approved my neon-negate patch. That patch depends on
the neon-immediates patch which is not approved yet, so I'll have to
wait to commit it.
Continued looking at NEON-v-core register allocation. Not much progress
this week though.
* Other
Created a patch to make the GCC RTL dumps easier to diff. Posted it
upstream and discussed it on the list.
Vacation on Friday.
* Next week
Vacation Monday and Tuesday
Hello,
I've been following up on the discussion we had on Monday regarding stack
alignment, and noticed that I had mis-remembered the current state of
affairs. Ramana asked me on Tuesday to provide a write-up of the actual
status, so here we go ...
To summarize the background of the problem: on ARM, the incoming stack
pointer is only guaranteed to be aligned to an 8 byte boundary. This means
that objects on the stack (local variables, spill slots, temporaries etc.)
cannot easily be aligned to more than 8 bytes. This can potentially cause
problems in two situations:
1) The object's default alignment (according to its type) is larger than 8
bytes
2) The object has a forced non-default alignment that is larger than 8
bytes
The first situation should in theory never appear, since according to the
ARM ABI all types have a default alignment of at most 8 bytes. However,
due to the current mix-up in GCC, vector types actually are considered to
have a 16-byte alignment requirement in GCC.
The second situation can only appear with local variables that are declared
using __attribute__ ((aligned)).
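As an illustration of the second situation, a local variable with a forced alignment larger than 8 bytes might look like the sketch below. The function name and the 32-byte figure are arbitrary choices for the example.

```c
#include <stdint.h>

/* Declares a local with a forced 32-byte alignment and reports how far
   its address is from a 32-byte boundary (0 means the request was met). */
unsigned aligned_local_misalignment(void)
{
    volatile int buf[8] __attribute__((aligned(32)));

    buf[0] = 1;  /* touch the buffer so it really lives on the stack */
    return (unsigned)((uintptr_t)buf & 31u);
}
```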
We had discussed on Monday that we need to fix the second situation, since
this can always occur and is supported on other platforms. By doing so,
we would then automatically fix the first situation as well.
However, this reasoning turns out to be incorrect. There are currently in
GCC *two* completely separate mechanisms that can be used to align objects
on the stack to larger than the ABI guaranteed stack pointer alignment:
A) Re-alignment of the full stack frame. This is what is used by the Intel
back-end (and only the Intel back-end). At function entry, generated code
will align the stack pointer itself to whatever is necessary to fulfil
alignment requirements of all objects on the stack. This may necessitate
follow-on changes: the frame pointer, if there is one, will likewise need
to be aligned at runtime. Also, since incoming stack arguments are now no
longer at a fixed offset relative to the stack pointer *or* frame pointer
in some cases, we might need an extra register as argument pointer. This
method allows extra alignment for *any* object on the stack, but needs
significant back-end support in order to be enabled on any non-Intel
architecture.
B) Dynamic allocation of selected stack variables. This is implemented by
common code with no involvement of the back-end. In effect, the code in
cfgexpand.c:expand_stack_vars that decides on how to allocate local
variables on the stack will remove all variables that require extra
alignment and place them into an extra structure. Generated prologue code
will then in effect dynamically allocate and align that structure on the
stack, and just store a pointer to it as "variable" into the normal stack
frame. All other areas of the frame are unaffected. Since this method
just simulates code the programmer could have written themselves using
alloca, it does not require *any* back-end support and is enabled by
default everywhere. However, it only works for regular local variables,
and not for any other objects on the stack.
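Since method B) is described as simulating what a programmer could write by hand with alloca, here is a minimal hand-written equivalent. It is only a sketch of the idea, not what expand_stack_vars generates; the names are made up, and <alloca.h> assumes a glibc-style environment.

```c
#include <alloca.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Round P up to the next multiple of ALIGN, which must be a power of two. */
static void *align_up(void *p, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (void *)(((uintptr_t)p + align - 1) & ~(align - 1));
}

/* Gets a 32-byte-aligned, 64-byte block on the stack without relying on
   the incoming stack pointer alignment. Returns the block's misalignment,
   which should be 0. */
unsigned use_overaligned_block(void)
{
    enum { ALIGN = 32, SIZE = 64 };

    /* Over-allocate so that an aligned SIZE-byte block always fits. */
    void *raw = alloca(SIZE + ALIGN - 1);
    unsigned char *block = align_up(raw, ALIGN);

    memset(block, 0, SIZE);  /* the aligned block is usable like a local */
    return (unsigned)((uintptr_t)block & (ALIGN - 1));
}
```

The rest of the frame is untouched, which matches the point that this approach needs no back-end support.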
Objects on the stack *except* local variables always use default alignment.
Since on most platforms, except Intel and *currently* ARM, the ABI stack
pointer alignment is sufficient to implement default alignments, method B)
as above is able to fulfil all stack alignments. Intel uses method A), so
they're also OK. In effect, it's only ARM due to the vector type
alignment problem that runs into the situation that neither method works.
Under those circumstances, given that:
- we want to fix vector type alignment in order to become ABI compliant
- once we've fixed this, we're in the same situation as other platforms and
method B) already fixes stack alignment problems
- implementing method A) is therefore both quite involved *and* actually
superfluous
I'd now rather recommend that we *don't* try to implement method A) (full
stack-frame re-alignment) on ARM.
Comments?
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
Hi,
GDB for Android:
* Continued conversation about .note.ABI-tag.
* Committed upstream the second of three patches for building
gdbserver on Android.
* Continued working on a crtbrand.c for bionic.
2012.04 release:
* Tested the gdb-linaro/7.4 branch on armv5tel, x86 and x86_64, both
natively and remotely. Compared results against the 2012.02 release.
* Ran the release process, made the release.
--
[]'s
Thiago Jung Bauermann
Linaro Toolchain Working Group
Hi,
GDB for Android:
* Investigated .note.ABI-tag generation in glibc and FreeBSD.
Wrote crtbrand.c for bionic based on FreeBSD's crtbrand.c.
Tried to integrate it with bionic's build system.
* Contacted Google engineers about adding .note.ABI-tag to Android
binaries.
2012.04 release:
* Started working on the release. Applied the latest patches from
the FSF 7.4 branch to gdb-linaro/7.4 and also my patches to compile
gdbserver with the Android toolchain.
--
[]'s
Thiago Jung Bauermann
Linaro Toolchain Working Group
* Linaro GCC
Spun release tarballs for Linaro GCC 4.6 and 4.7. Pushed them to
Michael's servers and launched the testing.
Continued trying to get the 64-bit NEON stuff to work. The negdi2 patch
needed some reworking following upstream review, and the extend patch
has mysteriously reintroduced a performance regression when the
operation is done in core-registers, which needs to be solved, but the
other patches seem ok at the moment, although the shift stuff is blocked
on benchmarking.
I've begun looking at how my changes affect spec2000, and whether the
register allocation could be done better. Unfortunately, given that the
test systems are currently giving spec results with a low level of
consistency (and therefore confidence), this has mostly been by visual
inspection, so far.
* Other
Public Holiday on Monday.
I have a cross toolchain I configured with "--with-arch=armv7-a --with-cpu=cortex-a9 --with-tune=cortex-a9" and I want the linker to emit armv4t compatible thumb interworking, but I can't seem to get it to.
I noticed that if I create an armv4t toolchain with "--with-arch=armv4t --with-cpu=arm7tdmi --with-tune=arm7tdmi" and then I pass "--use-blx" to the linker it will emit armv7 thumb interworking. There doesn't seem to be any inverse "--no-use-blx" type switch though. Is this a bug/limitation of the linker or am I misunderstanding something?
-Allen