== Progress ==
* Worked on building binary tarballs from Jenkins. (TCWG 383 - 4/10)
* Continuing work on regression test analysis and reporting. (TCWG
448 - 3/10)
- now copying the check.log from make check to toolchain so
tcwgweb.sh can scan it for build errors in the test cases.
- Got the delimiter for concurrent Jenkins builds changed, now
glibc build problems are gone.
- Tracked down and tried to fix testsuite build issues.
* Meetings and Misc. (3/10)
- Worked on making remote testing go faster. Setting
ControlPersist seems to help.
- Much debugging of Jenkins related issues.
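The ControlPersist speed-up noted above comes from OpenSSH connection multiplexing: the first connection to a board stays open in the background and later ssh/scp invocations reuse it instead of renegotiating. A minimal sketch, assuming a POSIX shell; `ControlMaster`, `ControlPath` and `ControlPersist` are real OpenSSH client options, but the function name `mux_ssh`, the `SSH` override variable, the socket path and the 10-minute timeout are illustrative assumptions, not the settings actually used here.

```shell
#!/bin/sh
# mux_ssh: hypothetical wrapper that runs ssh with connection multiplexing.
# The first call opens a master connection; later calls to the same
# user/host/port reuse its socket until it has been idle for 10 minutes.
mux_ssh() {
    # SSH can be overridden (e.g. SSH=echo) to inspect the command line.
    "${SSH:-ssh}" \
        -o ControlMaster=auto \
        -o "ControlPath=$HOME/.ssh/mux-%r@%h-%p" \
        -o ControlPersist=10m \
        "$@"
}
```

The same three options could equally be placed in `~/.ssh/config` under a `Host` entry for the test boards, which would also cover scp.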
== Plan ==
* Continuing work on validating test results and fixing testsuite
build issues. (TCWG 448)
* Probably more work on binary tarballs. (TCWG 383)
* More tracking down and fixing Jenkins related issues.
== Issues ==
* For some reason libgloss isn't getting built for *-elf via
Jenkins/Cbuildv2.
Hi,
I have tried out a prototype of using binfmt_misc, and it does not appear to be a worthwhile solution at this point.
Bottom line: with a nice fast multi-core x86 server and a pool of ARM boards we can test GCC in ~20min.
Testing setup:
- Core i3 2-core host
- Chromebook 2-core target, SSD disk
- WiFi network
- GCC mainline built from sources with same flags both natively and cross: C, C++, Fortran.
General observations about testing ARM toolchains:
- When testing natively target is busy 100%: around 50% of time is spent compiling testcases, 40-45% in dejagnu/expect, 5-10% on actual test execution. Target is the bottleneck.
-- 21633.22user 4400.13system 4:13:11elapsed
- When testing cross with the standard rsh_prog=ssh, rcp_prog=scp, target is busy 40% of the time: 30% on ssh, 10% on actual test execution. Host is the bottleneck.
--
- When testing cross (using method below) target is busy only 15-20% of the time: 10% on ssh and 10% on actual test execution. Host is the bottleneck.
-- 9490.16user 2882.54system 1:10:57elapsed
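To compare the two timing summaries above, the elapsed times can be converted to seconds and divided. The helpers below are a small sketch for that arithmetic only; the function names `to_seconds` and `speedup` are hypothetical and not part of any of the scripts discussed.

```shell
#!/bin/sh
# to_seconds H:MM:SS -> prints the elapsed time in seconds.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# speedup BASELINE_SECONDS NEW_SECONDS -> prints the ratio to two decimals.
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

native=$(to_seconds 4:13:11)   # native run, elapsed
cross=$(to_seconds 1:10:57)    # parallelized cross run, elapsed
echo "native: ${native}s, cross: ${cross}s, ratio: $(speedup "$native" "$cross")"
```

With the figures quoted above this gives 15191 s against 4257 s of wall-clock time, i.e. the cross run finishes roughly three and a half times sooner.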
I've got a prototype implementation of parallelized cross-testing of GCC, built on rsh_prog and rcp_prog dejagnu wrappers, that I'm happy with.
General observations on dejagnu cross-testing process:
- All communication with the target is done by Dejagnu via rsh_prog and rcp_prog hooks. These are normally defined to ssh and scp respectively.
- Dejagnu copies every testcase to /tmp/ on the target with scp.
- Dejagnu executes every testcase via ssh off target's /tmp/. In total there are 1 scp and 3 ssh invocations per testcase.
The idea of the scripts below is to assume a shared filesystem between the host and a pool of target boards, and to skip copying executables to the target's local filesystem. Since there is no longer any local state (local state == contents of /tmp) on the target boards, testcases can be executed on any of the boards in the pool. This happens transparently to dejagnu: dejagnu issues "rsh_prog chromebook-pool ./test" and then rsh_prog converts this to "ssh chromebook-XX ./test", where XX is chosen at random.
Script implementation notes:
- For some unholy reason there is no way to correctly parse the command line from dejagnu and give it to bash. The reason the myssh script ssh'es onto the host itself is that this is the only way I've found to reliably execute the command. Hey! The command line was intended for ssh anyway!
- The myssh script translates a hostname such as foobar-pool-01-03-08 into foobar01, foobar03 or foobar08, chosen at random. If there is no "-pool-" in the hostname, it is used verbatim.
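The hostname translation described in the last note can be sketched as follows. This is not the actual myssh script, only an illustration of the stated rule; the function name `pick_board` is hypothetical, and seeding awk's random generator from the shell's process ID is an assumption made so that separate invocations tend to pick different boards.

```shell
#!/bin/sh
# pick_board HOSTNAME: hypothetical helper illustrating the pool rule.
#   foobar-pool-01-03-08 -> one of foobar01, foobar03, foobar08 (random)
#   anything without "-pool-" -> printed unchanged
pick_board() {
    case "$1" in
        *-pool-*)
            prefix=${1%%-pool-*}   # text before "-pool-", e.g. "foobar"
            ids=${1#*-pool-}       # text after "-pool-",  e.g. "01-03-08"
            # Split the ids on "-" and print the prefix plus one random id.
            echo "$ids" | awk -F '-' -v p="$prefix" -v seed="$$" \
                'BEGIN { srand(seed) } { print p $(int(rand() * NF) + 1) }'
            ;;
        *)
            echo "$1"
            ;;
    esac
}

# Example: an rsh_prog wrapper would then run something like
#   ssh "$(pick_board "$host")" "$command"
```

Because nothing is copied to a board's local /tmp in this scheme, it does not matter that consecutive commands for the same testcase may land on different boards.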
--
Maxim Kuvyrkov
www.linaro.org
== Progress ==
* GDB arm v8 record/replay
-- Completed implementation of system call recording [TCWG-409] [4/10]
-- Support for recording A64 Loads and stores [TCWG-409] [2/10]
-- Bug Fixing to reduce failures in gdb.reverse testsuite [TCWG-484] [2/10]
* Miscellaneous
-- Day off on Friday [2/10]
== Plan ==
* GDB arm v8 record/replay
-- Submission of patches upstream.
-- Bug fixing to reduce failures.
-- Advance SIMD load/store instruction recording support.
== Issue ==
* None.
== Progress ==
* Investigate PR61220, 61225 and 61278, which are triggered by my
previous commits. Patches are in review. (9/10)
* Investigated code generated by shrink-wrapping interrupt routines. It
seems there is no dwarf info issue.
* Misc update for Linaro crosstool-ng to make the build work in case
someone wants to use it.
- Downgrade gdb to 7.6.
- Disable multilib for 4.9.
- Move local patches at binutils/linaro-2.24.0-2014.03 to
binutils/linaro-2.24.0-2014.05.
== Plans ==
* Push pending patches.
== Planned leaves ==
* June 2.
Hi,
I've run into some compile errors after updating to 4.9 -- usually getting
undefined references to symbols defined in helper static libraries.
It turns out this is triggered by gcc -flto now creating slim object files
by default (-ffat-lto-objects "fixes" it) - but I think it is actually an
ld bug that should be fixed at some point. ld (regardless of whether I use
-fuse-linker-plugin, -fuse-ld=gold or -fuse-ld=bfd) doesn't seem to see LTO
bytecode in object files that are inside an ar wrapper.
It deals with the library just fine if I use "ar x" to extract its object
files and link to them individually as opposed to the .a file.
I've attached a small test case to demonstrate ("make broken" shows the
error, "make works" shows the workaround).
Is there any reason why ld should behave the way it does, or is this a bug
that needs fixing?
ttyl
bero
== Progress ==
* Reload - IRA bug fix (3/10)
Not able to reproduce in trunk, r210538 masks the bug again :(
Discussed with Maxim extending the likely-spilled class macro to Thumb2.
Decided that it would lead to performance regressions. The conservative fix
is to allow the pattern for the ARM target alone. Verifying the fix on an
armhf schroot.
* Testing GCC Linaro compiler on Hardware (4/10)
Completed GCC Linaro compiler 4.8 and 4.9 correctness tests on
hardware. Completed running SPEC 2006 for -O3. Completed running
SPEC2006 for -O3 -flto and -mcpu=cortex-a57. Triggered PGO runs on
hardware.
Looked at bootstrap failure with BOOT_CFLAGS="-mcpu=cortex-a57".
Changing from the system assembler to the Linaro assembler solved it, as
the system assembler is old.
* Misc (3/10)
- Completed installing Ubuntu, setting up chroot and migrating to toolchain
64 environment. (2/10)
- 1-1 meetings (Ryan, Christophe and Maxim) (1/10)
- AMD internal support work and meetings
== Plan ==
* Continue bug fixing.
* LTO bootstrap failure
* Testing GCC Linaro compiler on hardware.
* UK VISA processing.
== Issues ==
* None
== Progress ==
* CARD-1162 : Linaro GCC 4.9 and CARD-1355 : stabilization and
optimization effort for ARMv8-a (8/10)
- Looked at Jenkins build/failures/reporting
- Reviewed the backporting process and scripted it
- 40 backports are in review and need validation
* LP #1169164 : including signal.h exposes various PSR_MODE #defines
- Committed upstream.
* Misc:
o Various meetings (2/10)
o LCU'14: Registered and booked flights
== Next ==
* Child care today
* Improve the backport script and document its usage
* Continue backports
* Continue feedback and help with the validation
== Progress ==
* GCC trunk cross-validation (4/10)
- build broken last week-end, because of a
new optimization that broke glibc build.
- glibc fixed by Joseph mid-week, updated
- to help diagnose build failures earlier, I have set up
a reduced version of the validation framework,
which only performs a build of binutils+glibc+gcc,
at every commit on gcc trunk for 16 arm+aarch64
configurations. [ yes, another buildbot of sorts ]
- restarted builds+validations to last known successful
status (i.e. before last week-end)
- builds are catching up
* Neon intrinsics tests (3/10)
- continuing conversion (about 40 files done, out of ~140)
* Misc (meetings, conf-calls, ...) (3/10)
* Backports for 4.9:
- started reviewing candidate backports
== Next ==
* GCC trunk cross-build/cross-validation:
- monitor and report regressions
* Neon intrinsics tests:
- continue conversion
- prepare a cleaner branch for upstream submission
* Backports:
- more reviews
- process improvements
== Progress ==
* Kernel (CARD-1246 4/10)
- Named registers committed in Clang
- GCC seems to break on local named regs, too.
- Trying to change the kernel code to use only globals for non-GPRs
- Adding support for pointer types, and structure fields in GNRVs
* Benchmarks (CARD-716 0/10)
- Re-enabling perf reports for LNT bot (ARM fixed reporting)
* Background (6/10)
- Code review, meetings, discussions, etc.
- Removing *all* buildbots' batteries after failure
- Testing D01 box, not stable yet for toolchain testing
- Moving development to git.linaro.org (for backup)
- Planning TCWG rack migration
- Drafting an LLVM white paper
== Plan ==
* Continue with named register extra work (http://llvm.org/PR19837)
* Start TCWG rack migration
* Discussions about LLVM white paper
== Progress ==
* Investigate and fix building glibc for ARM with -mtls-dialect=gnu2 (3/10)
* Investigate ld TLS behaviour for Huawei (1/10)
* Refactor scripts to enable benchmarking postgresql malloc
performance (2/10, TCWG-441)
* Patch review and testing (1/10)
* Diagnose and fix glibc testsuite failures on aarch64 (2/10)
* Meetings, admin (1/10)
== Issues ==
* None
== Plan ==
* More malloc application benchmarking
--
Will Newton
Toolchain Working Group, Linaro
== Progress ==
lowlevellock performance bugs - TCWG-435 [5/10]
* Tried various methods to build/test glibc for aarch64
* Eventually succeeded (tests passed)
cbuild benchmarking - TCWG-360 [3/10]
* cbuildized spec2xxx scripts working as far as 'run'
Meetings/mail/etc [2/10]
== Plan ==
Holiday for one week
After that:
* Clean up cbuildized spec2xxx scripts, cbuildize them some more &
discuss with Rob
* Send lowlevellock patch upstream
* If time, put together some more experimental memset implementations
I have been thinking how to simplify cross-testing our toolchain for both automated and development/debugging builds, and among various options the most universal I came up with is ARM hardware + ssh + binfmt_misc + sshfs. I wonder if anyone has already tried this or can suggest alternatives which are as universal.
Given:
- host x86_64 development machine
- cross-compiler
- target hardware with fast network to the host
- host and target have ssh
- testsuite (gcc/glibc/gdb/etc)
Here is how it is going to work:
1. On the host we create a simple wrapper script that passes its arguments through as a command to execute on the target via ssh:
===
#!/bin/sh
# Run the given command line on the board named by $TARGET_BOARD.
ssh -p 22NN "$TARGET_BOARD" "$@"
===
2. We register this script in binfmt_misc to be used as interpreter for target binaries. Value of $TARGET_BOARD will be picked up from the environment and can be set to different boards for different testsuite runs.
3. The target board needs to be prepared for a particular testsuite run:
-- Runtime libraries need to be either copied or mounted via sshfs from the host. It is an open question how best to install several sets of libraries (for parallel runs) so that each set appears to be main system libraries. My current thinking is a separate ssh server inside chroot per each test run.
-- Test directory needs to be sshfs mounted on target from host so that the target could see test executables.
-- Preparation/finalization of the board can either be done explicitly before/after testing, or on demand by the aforementioned script: the script checks whether a multiplexed ssh socket exists, and, if not, it prepares the board and starts a multiplexed ssh connection.
4. Testing is fired up as if it were normal "native" testing. Whenever the kernel is given an ARM binary to execute, it passes it off to the wrapper, which passes it off to the target board via ssh. The board sees the same filesystem as the host and happily executes binaries against the toolchain runtime libraries.
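Step 2 above (registering the wrapper with binfmt_misc) could look roughly like the sketch below. The registration string follows the kernel's `:name:type:offset:magic:mask:interpreter:flags` format, and the magic and mask are the values commonly used to match 32-bit little-endian ARM ELF executables. The rule name `arm-target`, the function name `arm_binfmt_rule` and the wrapper path are hypothetical, and nothing here has been tried against the proposed setup.

```shell
#!/bin/sh
# Hypothetical install location of the ssh wrapper script from step 1.
WRAPPER=${WRAPPER:-/usr/local/bin/run-on-target}

# arm_binfmt_rule INTERPRETER: print a binfmt_misc registration string
# that hands 32-bit little-endian ARM ELF binaries to INTERPRETER.
arm_binfmt_rule() {
    # First 20 bytes of the ELF header: ident, e_type, e_machine (0x28 = ARM).
    magic='\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x28\x00'
    # Ignore the OS ABI byte and accept both EXEC and DYN object types.
    mask='\xff\xff\xff\xff\xff\xff\xff\x00\xff\xff\xff\xff\xff\xff\xff\xff\xfe\xff\xff\xff'
    # The kernel decodes the \x escapes itself, so print them literally.
    printf '%s\n' ":arm-target:M::${magic}:${mask}:$1:"
}

# To register (as root, with binfmt_misc mounted):
#   arm_binfmt_rule "$WRAPPER" > /proc/sys/fs/binfmt_misc/register
```

Once registered, the kernel invokes the interpreter with the path of the ARM binary as its first argument followed by the original arguments, which is why the wrapper can forward "$@" to ssh unchanged, provided the target sees the binary at the same path.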
Comments or rotten tomatoes?
Thank you,
--
Maxim Kuvyrkov
www.linaro.org
== Progress ==
* Worked on the LLVM branch of Cbuildv2 (TCWG - 1/10).
* More work on regression test analysis and reporting. (TCWG 448 - 5/10)
* Meetings and Misc (4/10)
- Produced lots of test results via Jenkins, need to verify
they're not having remote target problems.
== Plan ==
* Verify test runs aren't having problems with remote targets.
* Start training the Jenkins Failure Analysis plugin.
* Install lava-tool and get it working on all the tcwgbuild* machines.
* Continuing work on regression test analysis and reporting. (TCWG
448 - 5/10)
== Progress ==
Holiday [2/10]
Rewrite of division optimisation changes following review - TCWG293 [8/10]
== Plan ==
Mostly on holiday this week. I may be working sporadically
Back full time
== Progress ==
* resumed 1:1 calls with Zhenqiang, Venkat, Charles.
* GCC trunk cross-validation (2/10):
- monitored results
- a few improvements/cleanups
* Neon-intrinsics tests (5/10)
- continuing conversion
- need to add support for the AArch64 Neon overflow flag
* Misc (meetings, conf-call, ..) (3/10)
* Successfully tried OpenNX setup put in place by Maxim
(on office computer, despite firewall and no root access)
== Next ==
* GCC trunk cross-validation:
- monitor and report results
- use this system to pre-validate a patch from Kugan
- share scripts with Kugan
* Neon intrinsics tests:
- continue conversion
- hopefully push a preliminary version upstream
== Progress ==
lowlevellock performance bugs - TCWG-435 [3/10]
* Trying to build/test aarch64 on a foundation model
cbuild benchmarking - TCWG-360 [4/10]
* Integrating Maxim's spec scripts into Kugan's benchmarking branch
* Began modifying the branch to use existing cbuild functions where possible
Meetings/mail/etc [3/10]
== Plan ==
Holiday next week (w/c 26th May)
This week:
* Try testing glibc on system qemu rather than foundation model
* Carry on with cbuild benchmarking
* If time, put together some more experimental memset implementations
== Progress ==
* GDB arm v8 record/replay
-- Bug fixing: Improve gdb.reverse testsuite results on armv8
[TCWG-451] [2/10]
-- Core files issue: submitted bfd patch upstream [TCWG-451]
-- Support for recording Data processing - Advanced SIMD and
Cryptographic [TCWG-405] [TCWG-407] [3/10]
-- Support for recording A64 Data processing - Floating point
instructions [TCWG-404] [TCWG-406] [2/10]
* Miscellaneous
-- UK visa application submission [3/10]
== Plan ==
* Continue work on issues related to GDB arm v8 record/replay
The Linaro Toolchain Working Group (TCWG) is pleased to announce the 2014.05
stable release of the Linaro GCC 4.9 source package.
Linaro GCC 4.9 2014.05 is the second Linaro GCC source package release in the
4.9 series. It is based on FSF GCC 4.9.1+svn210052 and includes performance
improvements and bug fixes.
With the imminent release of ARMv8 hardware and the recent release of the
GCC 4.9 compiler the Linaro TCWG will be focusing on stabilization and
performance of the compiler as the FSF GCC compiler approaches version 4.9.1.
The Linaro TCWG will provide monthly stable[1] source package releases until
FSF GCC 4.9.1 is released. At that point Linaro GCC 4.9 will merge in
FSF GCC 4.9.1, release Linaro GCC 4.9.1, and then return to a schedule of
stable quarterly releases and monthly engineering[2] releases.
Interesting changes in this GCC source package release include:
* Updates to GCC 4.9.1+svn210052
* Backport of the Ada AArch64 support
Feedback and Support
Subscribe to the important Linaro mailing lists and join our IRC channels to
stay on top of Linaro development.
** Linaro Toolchain Development "mailing list":
http://lists.linaro.org/mailman/listinfo/linaro-toolchain
** Linaro Toolchain IRC channel on irc.freenode.net at @#linaro-tcwg@
* Bug reports should be filed in Launchpad against "Linaro GCC project":
http://bugs.launchpad.net/gcc-linaro/+filebug.
* Questions? "ask Linaro":
http://ask.linaro.org/.
* Interested in commercial support? inquire at "Linaro support":mailto:
support(a)linaro.org
[1] Stable source package releases are defined as releases where the full Linaro
Toolchain validation plan is executed.
[2] Engineering source package releases are defined as releases where the
compiler is only put through unit-testing and full validation is not
performed.
== Progress ==
* Reload - IRA bug fix (5/10)
- In thumb2 mode, we get a pattern "*ior_scc_scc" for the third
argument expression by the combiner pass.
- Expression Class:foo(x,0,((y==x)||(z==x))): x gets register r1 and the second argument r2.
- The class object this pointer is passed in r0. r7 is used for stack pointer.
- The pattern "*ior_scc_scc" demands more LO_REGISTERS. It needs 5
LO_registers for destination and 4 source operands. But we are left
with r3,r4,r5,r6 only.
- Such a situation is handled for Thumb1 only, using
TARGET_CLASS_LIKELY_SPILLED_P. Thumb2 should accept HI registers also.
- Changing the pattern to accept general registers for destination
operation is also not helping.
- Need to explore secondary reload macros.
* Misc
- AMD meetings and internal tasks (2/10)
- 1-1 meetings (Ryan, Christophe and Maxim) (1/10)
* Testing: Installed packages and ran GCC Linaro compiler 4.8
correctness tests on hardware. Completed running SPEC 2006 for -O3
-mcpu=cortex-a57 flag (2/10).
== Plan ==
* Continue bug fixing.
* LTO bootstrap failure
* Testing GCC Linaro compiler on hardware.
* New laptop: install Ubuntu, set up chroot and migrate to toolchain
64 environment.
* UK VISA processing.
== Progress ==
* TCWG-413 (8/10) sha1 performance
- Looked at IRA dumps and aarch64 target hooks.
- GCC now uses FP registers as register class and this results in lots
of fmovs for the test-case.
- Discussed on the list and tried spill_class hook for aarch64. This helps
sha1.
- Regression tested the change.
- Ran Spec2000 with the changes and 168.wupwise, 187.facerec are failing.
- Investigation continues.
* TCWG-468 (1/10)
- Continuing with benchmarking.
* Set up NX and started using it (1/10)
== Plan ==
* Benchmarking.
* Upstream zero/sign extension elimination activities.
* sha1 performance.
== Issue ==
* None.
== Progress ==
* Committed three patches to enhance shrink-wrap for loop, but the community
reported an ICE in dwarf info with the patches. (TCWG-133, 5/10)
* Update/test shrink-wrap for apcs patch according to comments (TCWG-482, 2/10)
* Loop-invariant heuristic tuning (TCWG-763, 2/10).
* Investigated the Linaro crosstool-ng gdb build failure for the 2014.05 config,
but have not found an easy way to fix the lsbcc build failure. (1/10).
== Plans ==
* Fix the ICE triggered by shrink-wrap changes.
* Fix the gdb build failure if Linaro still uses crosstool-ng for releases
* Continue loop-invariant heuristic tuning
== Planned leaves ==
* June 2.
== Week of May 12th ==
- Rolled out TCWG development environment. (TCWG-483, 4/10)
-- https://collaborate.linaro.org/display/TCWG/TCWG+Development+Environment
-- Demo'ed it both inside and outside TCWG.
-- Finished up configuration of environment and setup backups.
- STREAM performance regression (TCWG-388, 2/10)
-- Prepared first batch of patches for upstream submission
- Various discussions, including ... (4/10)
-- register allocation and reload with Venkat
-- register allocation with Kugan
-- benchmarking with Kugan
--
Maxim Kuvyrkov
www.linaro.org