[ Also posted to debian-arm; not cross-posted to avoid subscription
complaints... ]
Hi folks,
We're currently carrying patches in glibc in Debian (and Ubuntu) that
I wrote which are used to work out whether an ELF binary is hard-float
or soft-float. We're using these to allow us to do the right thing on
a multi-arch system, which is to pick a consistent set of binaries
(programs and libraries) at runtime; if you try to mix binaries using
different ABIs, you're prone to all kinds of weird and wonderful
results but generally badness occurs.
Upstream glibc have generally not been welcoming of these patches, and
I understand this; the approach taken (reading ARM-specific build
attributes) is far from clean and doesn't fit well in the design of
ld.so in particular. So, I've been looking into alternative methods
for achieving the goal of identifying ABI. After a couple of false
starts and discussion with some of the helpful toolchain and ABI folks
in ARM, I think we have a solution that will work well in the long
term. I just wish we'd thought about this *way* back when we first
started the armhf port, as it would have been much easier to work on
and standardise this back then. Modulo availability of time machines,
there's not much we can do on that front... :-)
What I'm proposing is to use two new values in the OSABI field in the
ELF header:
#define ELFOSABI_LINUX_ARM_AEABI_SF 65
#define ELFOSABI_LINUX_ARM_AEABI_HF 66
and use these values in the future for soft- and hard-float binaries
so that we can unambiguously identify them.
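As a rough sketch of how a loader or tool might consume these values, the check could look something like the following. The two OSABI constants are the proposed ones from this mail (they are not in any released <elf.h>); the function name, the enum and the offset names are made up for illustration. The offsets 7 and 16 correspond to EI_OSABI and EI_NIDENT in the ELF specification.

```c
#include <stddef.h>
#include <string.h>

/* Proposed values from this mail; not part of any released <elf.h>. */
#define ELFOSABI_LINUX_ARM_AEABI_SF 65
#define ELFOSABI_LINUX_ARM_AEABI_HF 66

/* Fixed by the ELF specification: EI_OSABI is byte 7 of e_ident,
 * and e_ident is EI_NIDENT (16) bytes long. Local names are used
 * here so the sketch does not depend on <elf.h>. */
enum { OSABI_OFFSET = 7, IDENT_SIZE = 16 };

/* Hypothetical result type, for illustration only. */
enum float_abi { FLOAT_ABI_UNKNOWN, FLOAT_ABI_SOFT, FLOAT_ABI_HARD };

/* Classify a binary from the first IDENT_SIZE bytes of its ELF header. */
enum float_abi classify_float_abi(const unsigned char *ident, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    if (ident == NULL || len < IDENT_SIZE ||
        memcmp(ident, magic, sizeof magic) != 0)
        return FLOAT_ABI_UNKNOWN;

    switch (ident[OSABI_OFFSET]) {
    case ELFOSABI_LINUX_ARM_AEABI_SF:
        return FLOAT_ABI_SOFT;
    case ELFOSABI_LINUX_ARM_AEABI_HF:
        return FLOAT_ABI_HARD;
    default:
        /* Older binaries carry some other OSABI value. */
        return FLOAT_ABI_UNKNOWN;
    }
}
```

A real implementation would presumably also check that e_machine is ARM before trusting the field, and would need a fallback for existing binaries that carry neither value.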
There's already precedent for binaries using different values in this
field, with support in glibc for parsing and understanding
them. Adding more possible values is quite easy, assuming that the
maintainers are amenable. I'm about to post a similar message there.
I have a plan of attack for how to make a staged switch over,
deliberately to minimise any potential compatibility problems. See the
attached doc for that. It's deliberately not very specific in terms of
timeline, as that's something I'm hoping to get feedback
about. Comments very welcome; please point out if you think there are
problems with this approach, or if there are any more implementations
of toolchain / linker that will need to be addressed.
Cheers,
--
Steve McIntyre steve.mcintyre(a)linaro.org
<http://www.linaro.org/> Linaro.org | Open source software for ARM SoCs
For reference, if you see link time errors about a missing
'__dso_handle' symbol when building Android, then check if you're
using any global class instances in your multimedia libraries.
Each shared library has a __dso_handle symbol which is filled in on
load by the dynamic loader. Global class instances use this unique
value to make sure the destructor is called when the library is
unloaded. The symbol itself is defined in crtbegin_so.o, but the
multimedia rules forbid using this for an unknown reason. Either
create your global instances in a different way or change the
multimedia rules :)
-- Michael
== Progress ==
* Fixed PR54051
* Improved neon intrinsics testsuite. While it is still not an
execution based testsuite, at least we get compile time tests that
are sensible C. This exposed issues, so I wrote patches that improve
the vabal and vaba intrinsics, fix an issue with costs, and fix an
issue with splitters for large mode moves for Neon with the hardfp
port, etc.
* Some upstream patch and bug review.
* Fixed a minor testism for vld1q_s64 tests.
== Plans ==
* Write a patch to check md5sums between local tarball and uploaded
tarball in the release script.
* Look at auto-inc-dec patches more and investigate benchmark results.
* Submit intrinsics work upstream and shepherd it through.
* Finish looking at PR53664 and clean up testsuite further.
* Follow-up on my intrinsics patches upstream.
== Absences ==
* 17th Sept - 5th Oct - Vacation approved.
== GCC ==
* Checked in fix for incorrect pool placement with -O0
by splitting all insns in machine-dependent reorg.
* Created blueprint to investigate -funroll-loops and
-fvariable-expansion-in-unroller.
* Took over patch to change vector alignment to 8 from
Richard; reworked according to review comments; found
and fixed two vectorizer bugs triggered by the change;
submitted for mainline approval.
* Continued investigation of reload bug reported by ARM.
Posted potential fix to gcc-patches for discussion.
== GDB ==
* Worked on fixing HW breakpoint/watchpoint regressions.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
Current Milestones:
|| || Planned || Estimate || Actual ||
||cp15-rework || 2012-01-06 || 2012-06-23 || 2012-06-24 ||
||a15-lpae-support || 2012-07-13 || 2012-07-20 || 2012-07-20 ||
||clean-up-kvm-patches || || || ||
||track-kvm-abi-changes || || || ||
||fake-trustzone || || || ||
Overall KVM plan for 'do by end August': QEMU parts of this are a mix
of clean-up-kvm-patches and track-kvm-abi-changes blueprints, mostly.
http://cards.linaro.org/browse/CARD-167
== clean-up-kvm-patches ==
* sent patch series to try to clean up some QEMU kvm x86isms
that block cleanup of some of the ARM KVM support code;
dealt with review comments and sent v2
== other ==
* started on cleaning up the QEMU benchmarking setup so we
can put it on a server machine somewhere
* fixed a crash in the QEMU ARMv7M models which was introduced
by one of my earlier GIC/NVIC refactoring series
* upstream review/maintainer duties
KVM blueprint progress tracker:
http://apus.seabright.co.nz/helpers/backlog?group_by=topic&colour_by=state&…
-- PMM
FYI GCC trunk r189808 fails to build with a bootstrap comparison error:
Comparing stages 2 and 3
warning: gcc/cc1-checksum.o differs
warning: gcc/cc1plus-checksum.o differs
warning: gcc/cc1obj-checksum.o differs
warning: gcc/cc1objplus-checksum.o differs
Bootstrap comparison failure!
arm-linux-gnueabi/libgcc/unwind-arm.o differs
arm-linux-gnueabi/libgcc/unwind-arm_s.o differs
189575 was fine on hard float. 189745 is fine on softfp.
-- Michael
---------- Forwarded message ----------
From: Linaro Toolchain Builder <michael.hope+cbuild(a)linaro.org>
Date: 25 July 2012 15:59
Subject: [cbuild] gcc-4.8~svn189808 armv7l failed
To: "michael.hope+notify(a)linaro.org" <michael.hope+notify(a)linaro.org>
ursa3 finished running job gcc-4.8~svn189808 on
armv7l-precise-cbuild348-ursa3-cortexa9hfr1.
The results are here:
http://builds.linaro.org/toolchain/gcc-4.8~svn189808
This email is sent from a cbuild (https://launchpad.net/cbuild) based
bot which is administered by Michael Hope <michael.hope(a)linaro.org>.
Hello Ramana,
For your PGO list:
* please note that I've been working on PGO for switch code, and also
for chains of if-statements with a common condition variable (with Tom
de Vries)
* turning conditional execution off will not make a difference, your
profile information will be exactly the same. Profile instrumentation
happens very early in the pipeline (on purpose, PGO is more
accurately "coverage guided optimization", not profiling in the
prof/gprof/oprofile sense). And the parts of the CFG that have profile
instrumentation cannot be if-converted anyway.
* you can use the script "analyze_brprob" in contrib/ to measure the
accuracy of the branch predictors. The script needs some TLC, fixing
it is on my TODO list but let me know if linaro folks are going to
take care of that. You'll find that the predictors are heavily tuned
towards the original Opteron, I'm not aware of much tuning for other
architectures.
* The heuristics for profile-guided optimizations are also not tuned
for arm. In the past we found that some params have more influence
than others (the TRACER* parameters for example).
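To make the first item concrete, here is a made-up example of the two code shapes it mentions: a chain of if-statements that all test one common condition variable, and the equivalent switch. The function names and values are hypothetical. With GCC, profile data for code like this is collected with -fprofile-generate and consumed with -fprofile-use.

```c
/* Hypothetical example: a chain of if-statements that all test the
 * same condition variable. */
int weight_if_chain(int kind)
{
    if (kind == 0)
        return 10;
    else if (kind == 1)
        return 20;
    else if (kind == 2)
        return 30;
    return -1;
}

/* The same decision written as a switch. */
int weight_switch(int kind)
{
    switch (kind) {
    case 0:
        return 10;
    case 1:
        return 20;
    case 2:
        return 30;
    default:
        return -1;
    }
}
```

Both functions compute the same result; what a profile can add is how often each value of the variable actually occurs at run time.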
Hope this helps,
What do you mean by "Only conditionalise those parts that benefit"?
Ciao!
Steven
== Progress ==
* Looking at auto-inc-dec patches.
* sched-pressure now on by default in FSF 4.8
* Background look into neon costs and vdup improvements.
* Some upstream patch review.
* Discovered http://gcc.gnu.org/PR54051 while testing a neon
intrinsics patch and wrote a patch to fix it.
== Plans ==
* Write a patch to check md5sums between local tarball and uploaded
tarball in the release script.
* Look at auto-inc-dec patches more and investigate benchmark results.
* Finish submitting PR54051 patch upstream.
* Finish vdup folding patch.
The Linaro Toolchain Working Group is pleased to announce the 2012.07
release of the Linaro Toolchain Binaries, a pre-built version of
Linaro GCC and Linaro GDB that runs on generic Linux or Windows and
targets the glibc Linaro Evaluation Build.
Uses include:
* Cross compiling ARM applications from your laptop
* Remote debugging
* Build the Linux kernel for your board
What's included:
* Linaro GCC 4.7 2012.07
* Linaro GDB 7.4 2012.06
* A statically linked gdbserver
* A system root
* Manuals under share/doc/
The system root contains the basic header files and libraries to link
your programs against.
Interesting changes include:
* Change c++, gcc and ld to symlinks in Linux package
The Linux version is supported on Ubuntu 10.04.3 and 12.04, Debian
6.0.2, Fedora 16, openSUSE 12.1, Red Hat Enterprise Linux Workstation
5.7 and later, and should run on any Linux Standard Base 3.0
compatible distribution. Please see the README about running on
x86_64 hosts.
The Windows version is supported on Windows XP Pro SP3, Windows Vista
Business SP2, and Windows 7 Pro SP1.
The binaries and build scripts are available from:
https://launchpad.net/linaro-toolchain-binaries/trunk/2012.07
Need help? Ask a question on https://ask.linaro.org/
Already on Launchpad? Submit a bug at
https://bugs.launchpad.net/linaro-toolchain-binaries
On IRC? See us on #linaro on Freenode.
Other ways that you can contact us or get involved are listed at
https://wiki.linaro.org/GettingInvolved.
We've just started running a weekly benchmark of GCC trunk and Linaro
GCC tip. I've written a short script that compares against a baseline
and spits out a graph:
http://ex.seabright.co.nz/benchmarks/gcc-4.8~svn.png
http://ex.seabright.co.nz/benchmarks/gcc-linaro-4.7%2bbzr.png
I'll switch the baseline to GCC 4.7.0 once the build and benchmark run
completes. The gcc-linaro results need more data before they'll make
sense.
Part way there. An automatic email would be next. We should check
the graphs before each performance call.
-- Michael, who needs to get moving on LAVA
== GCC ==
* Checked in fix to LP bug 1020601 (missed optimization with
multiple __builtin_unreachable calls) to Linaro GCC 4.7.
* Implemented and tested alternative fix for incorrect pool
placement with -O0 by splitting all insns in machine-
dependent reorg.
* Continued investigation of reload bug reported by ARM.
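For readers unfamiliar with the builtin in the LP 1020601 item above, this is a made-up function with more than one __builtin_unreachable call. It is only an illustration of the general shape, not the test case from the bug report. The builtin is provided by GCC and Clang.

```c
/* Illustrative only; not the test case from LP bug 1020601.
 * The caller promises tag is 0 or 1, and sub is 0 or 1 when tag is 0. */
int decode(int tag, int sub)
{
    if (tag == 0) {
        switch (sub) {
        case 0:
            return 1;
        case 1:
            return 2;
        default:
            /* First marker: no other sub value can occur here. */
            __builtin_unreachable();
        }
    } else if (tag == 1) {
        return 3;
    }
    /* Second marker: no other tag value can occur here. */
    __builtin_unreachable();
}
```

Each call tells the compiler that control never reaches that point, so it is free to drop the corresponding checks. Calling the function with values outside the stated contract is undefined behaviour.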
== GDB ==
* Tested GDB 7.5 branch on ARM, found a couple of regressions.
Worked on fixing HW breakpoint/watchpoint regressions.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
Current Milestones:
|| || Planned || Estimate || Actual ||
||cp15-rework || 2012-01-06 || 2012-06-23 || 2012-06-24 ||
||a15-lpae-support || 2012-07-13 || 2012-07-20 || 2012-07-20 ||
||clean-up-kvm-patches || || || ||
||track-kvm-abi-changes || || || ||
||fake-trustzone || || || ||
Overall KVM plan for 'do by end August': QEMU parts of this are a mix
of clean-up-kvm-patches and track-kvm-abi-changes blueprints, mostly.
http://cards.linaro.org/browse/CARD-167
== a15-lpae-support ==
* LPAE patches now merged upstream
* v2 of vexpress-large-ram-size sent upstream, code reviewed
and put into arm-devs pullreq. Hasn't hit master yet but
I expect that to happen over the next week.
== clean-up-kvm-patches ==
* squashed together some kvm patches in the qemu-linaro tree
* sent upstream a few patches where we can avoid an ARM-KVM
specific change by instead generalising the upstream code not
to have an explicit list of KVM supporting architectures
* started looking at how best to clean up some working-but-ugly
code handling interrupts in the QEMU KVM-ARM patchset. Among
other problems, this is messy to fix because at the moment
upstream is overloading "is there an in kernel irqchip?" to
mean both "should we use QEMU's irqchip model or not?" and
"is the interrupt injection model synchronous or asynchronous?"
because on x86 they are (for historical reasons) the same.
For ARM we only want to decide which irqchip model to use,
not anything else...
== other ==
* upstream review (various exynos patches, mostly)
* some patches fixing problems with compiler warnings in
configure test fragments
* arm-devs pullreq
KVM blueprint progress tracker:
http://apus.seabright.co.nz/helpers/backlog?group_by=topic&colour_by=state&…
-- PMM
Hi Ramana, Ulrich. Could I have some help with an unexpected
testsuite failure while backporting Carrot's adddi patch?
testsuite/gcc.misc-tests/gcov-7.c builds and runs but aborts during
leave() due to unexpected results.
The merge request is here:
https://code.launchpad.net/~michaelh1/gcc-linaro/core-adddi/+merge/113111
The testsuite diff is here:
http://ex.seabright.co.nz/build/gcc-linaro-4.7+bzr115001~michaelh1~core-add…
The build tree is at:
cbuild@tcpanda02.v:/scratch/cbuild/slave/slaves/tcpanda02/gcc-linaro-4.7+bzr115001~michaelh1~core-adddi/gcc/default/build
The failing and working versions are on tcpanda02 as ~/gcov-7.exe and
~/gcov-7-ok.
Here are the details:
* The test is fine when built from the command line
* The test is fine on the hard float Precise build
* The failing binary works fine when run on Precise
* The disassembled body (not libraries) is identical modulo changes
in addresses
* The fault goes away with static linking, by adding "--tool_opts '-static'"
* The fault persists with binutils 2.22
* The fault persists with the eglibc 2.15 loader
I assume the testsuite picks up a different libgcc and libgcov somehow
which gives a different executable. It's strange that the static
linked version is fine, and that the failing binary works fine on a
different host.
Could you have a poke in the build tree?
-- Michael
== GCC ==
* Tom de Vries fixed root cause of LP bug 1020601 (missed
optimization with multiple __builtin_unreachable calls)
on mainline. Backported to Linaro GCC 4.7 and tested.
Fixed bug exposed by backport (latent in mainline).
* Continued investigation of reload bug reported by ARM.
== Misc ==
* Attended GNU Tools Cauldron in Prague. Presented on
GDB remote/native feature parity work.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
Current Milestones:
|| || Planned || Estimate || Actual ||
||cp15-rework || 2012-01-06 || 2012-06-23 || 2012-06-24 ||
||a15-lpae-support || 2012-07-13 || 2012-07-20 || ||
||clean-up-kvm-patches || || || ||
||track-kvm-abi-changes || || || ||
||fake-trustzone || || || ||
Overall KVM plan for 'do by end August': QEMU parts of this are a mix
of clean-up-kvm-patches and track-kvm-abi-changes blueprints, mostly.
http://cards.linaro.org/browse/CARD-167
== a15-lpae-support ==
* LPAE patchset in latest target-arm pullreq sent upstream
* Kernel patch to get it not to throw away high bits of RAM
size Acked by Will and sent to RMK's patch system
* vexpress patchset for large RAM sizes had a few review
issues which I think I've sorted; need to roll a v2
* push back estimate date a week to account for: large-ram-size
work wasn't in my original list of work here; code review wait
times [ie not much real work remaining, but some time delay]
== other ==
* updated to new fast model and kernel and rechecked that my local
setup still works OK
* qemu-linaro 2012.07 released
* usual upstream patch review
* finally bit the bullet and upgraded my ancient Ubuntu desktop
KVM blueprint progress tracker:
http://apus.seabright.co.nz/helpers/backlog?group_by=topic&colour_by=state&…
The Linaro Toolchain Working Group is pleased to announce the 2012.07
release of both Linaro GCC 4.7 and Linaro GCC 4.6.
Linaro GCC 4.7 2012.07 is the fourth release in the 4.7 series. Based
off the latest GCC 4.7.0+svn189098 release, it includes performance
improvements around choice of auto-increment based addressing modes
for floating point values.
Interesting changes include:
Updates to GCC 4.7.0+svn189098
Implements improvements to ivopts selection of addressing modes of
floating point values.
Fixes:
LP: #1010826 - Invalid unaligned loads in vectorized code.
Linaro GCC 4.6 2012.07 is the seventeenth release in the 4.6 series.
Based off the latest GCC 4.6.3+svn189058 release, this is the fourth
release after entering maintenance.
Interesting changes include:
Updates to 4.6.3+svn189058
Fixes:
LP: #1010826 - Invalid unaligned loads in vectorized code.
LP: #1013209 - Internal compiler error when building neon intrinsics.
The source tarballs are available from:
https://launchpad.net/gcc-linaro/+milestone/4.7-2012.07
https://launchpad.net/gcc-linaro/+milestone/4.6-2012.07
Downloads are available from the Linaro GCC page on Launchpad:
https://launchpad.net/gcc-linaro
More information on the features and issues is available from the release pages:
https://launchpad.net/gcc-linaro/4.7/4.7-2012.07
https://launchpad.net/gcc-linaro/4.6/4.6-2012.07
Mailing list: http://lists.linaro.org/mailman/listinfo/linaro-toolchain
Bugs: https://bugs.launchpad.net/gcc-linaro/
Questions? https://ask.linaro.org/
Interested in commercial support? Inquire at support(a)linaro.org
The Linaro Toolchain Working Group is pleased to announce the release of
Linaro QEMU 2012.07.
Linaro QEMU 2012.07 is the latest monthly release of qemu-linaro. Based
off upstream (trunk) QEMU, it includes a number of ARM-focused bug fixes
and enhancements.
There are no major changes in this month's release, though
it has been updated to track the latest upstream QEMU changes.
Known issues:
- Graphics do not work for OMAP3 based models (beagle, overo)
with 11.10 Linaro images.
- Audio may not work on Versatile Express models with the latest
Linaro kernel/hardware packs (LP:977610).
The source tarball is available at:
https://launchpad.net/qemu-linaro/+milestone/2012.07
More information on Linaro QEMU is available at:
https://launchpad.net/qemu-linaro
== GCC ==
* Investigated bootstrap comparison failure with neon-shifts
branch; tracked down root cause to pre-existing bug in GCC
common code. Fix checked in to FSF mainline and 4.7 branch.
* Investigated di-sync-multithread test case failure with
neon-shifts branch; root cause was missing length attributes
for sync.md insn&split patterns. Implemented fix and
restarted tests.
* Investigated LP bug 1020601, missed optimization with multiple
__builtin_unreachable calls. Tracked down root cause and
started discussion of possible fixes on gcc-patches.
* Investigated potential reload bug reported by ARM.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
RAG
Amber: 4.7 2012.07 source release, for reasons described below.
Green: 4.6 2012.07 source release done.
== Progress ==
* Worked on auto-inc-dec scheduler changes. First cut patch looking reasonable.
* Committed the neon permute intrinsics upstream.
* Release week: release tarballs prepared for 4.6. The 4.6 release
is GREEN. I will upload the release on Thursday morning after I am
back in the office.
The 4.7 release had some issues. I had to rerun the release script
because the merge contained some artifacts which were a result of
merge conflicts. Having respun the release, it turned out that the
rsync from my machine to cbuild failed, which meant that I ended up
testing the same snapshot twice. Given that the 2 tarballs only differ
in the .rej and the .orig files and nothing more, I'm not too worried
about this because it should ideally just work. The good news is that
ubutest and everything else went through ok. The bad news is that the
oe build appears to be borked. We need to investigate that further.
Having looked inside the tarballs and seen that the .rej and .orig
files were the only things different between the 2 tarballs, I don't
think it's a huge problem for the release. I've spawned off another
set of builds to be absolutely sure.
* Looked at neon costs and vdup improvements. The neon cost changes
could cause regressions with 64 bit arithmetic and hence need to be
looked at carefully. The vdup improvements cause carnage in
gcc.target/arm/neon and tests for intrinsics have to be improved.
== Plans ==
* GNU Tools cauldron next week.
* Deal with release week fall-out.
* Write a patch to check md5sums between local tarball and uploaded
tarball in the release script.
* Look at auto-inc-dec patches more.
* Background look into improving some of the tests that now fail with
the vdup patches.
== Absences ==
* 8th - 11th July - GNU Tools Cauldron.
* 17th Sept - 5th Oct - Vacation planned, yet to be approved.
Current Milestones:
|| || Planned || Estimate || Actual ||
||cp15-rework || 2012-01-06 || 2012-06-23 || 2012-06-24 ||
||a15-lpae-support || 2012-07-13 || 2012-07-13 || ||
||clean-up-kvm-patches || || || ||
||track-kvm-abi-changes || || || ||
||fake-trustzone || || || ||
Overall KVM plan for 'do by end August': QEMU parts of this are a mix
of clean-up-kvm-patches and track-kvm-abi-changes blueprints, mostly.
http://cards.linaro.org/browse/CARD-167
== a15-lpae-support ==
* did the basic benchmarking of the LPAE series; using 64 bits for
guest physical addresses has between 0 and 0.5% hit to performance,
which IMHO is sufficiently minimal that it is not a problem.
* wrote a set of follow-up patches which allow the vexpress-a15
model to accept large RAM sizes (mostly turning off the "too big"
user-error message, fixing some over-small types in the QEMU boot
loader and adding support for handling device tree blobs with
64 bit address/size fields).
* discovered that Linux will happily throw away the top 32 bits
of a device tree memory node's size field. Wrote a patch for this,
which works but needs redoing to fix in a cleaner way.
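The truncation described in the last item can be illustrated with a small sketch. Device tree property values are stored as big-endian 32-bit cells, and a memory node with a #size-cells of 2 spreads its size over two of them. The function names below are made up for illustration and are not the kernel's or QEMU's.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one big-endian 32-bit device tree cell. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Combine ncells (1 or 2) consecutive cells into one 64-bit value,
 * most significant cell first. */
uint64_t read_cells(const unsigned char *p, size_t ncells)
{
    uint64_t value = 0;

    for (size_t i = 0; i < ncells; i++)
        value = (value << 32) | read_be32(p + 4 * i);
    return value;
}

/* The failure mode: the same value squeezed into a 32-bit type,
 * which silently discards the top 32 bits. */
uint32_t read_cells_truncated(const unsigned char *p, size_t ncells)
{
    return (uint32_t)read_cells(p, ncells);
}
```

For an 8 GiB size (cells 0x00000002, 0x00000000) the first function returns 0x200000000, while the truncating one returns 0, which is the kind of result that makes large RAM sizes disappear.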
== other ==
* qemu-linaro 2012.07 release prep: bug triage, investigation,
rolling tarball, testing
* arm-devs pullreq
KVM blueprint progress tracker:
http://apus.seabright.co.nz/helpers/backlog?group_by=topic&colour_by=state&…
-- PMM