== Progress ==
Very short week; was doing some AMD internal tasks and attending local meetings.
* gprof support work for Aarch64.
Not much progress this week.
Working on GCC side to support gprof for Aarch64.
== Plan ==
* Continue gprof support work for Aarch64
== Summary ==
- http://cards.linaro.org/browse/TCWG-13
- performance regression is due to alignment of function.
- there is an increase in runtime of core_state_transition even
though there is no difference in the code generated with the patch
- Adding nops seems to improve the locality and performance; it gets
better than without the patch for THUMB2.
- Get spec2000 results for http://cards.linaro.org/browse/TCWG-14 to
decide on the next step
- Couldn’t get SPEC2000 results with CBUILD
- set-up spec benchmarks in chromebook and now running locally.
- https://blueprints.launchpad.net/gcc-linaro/+spec/better-end-of-loop-counte…
- Initial investigation shows that the code generated is same as
expected.
== Plan ==
- http://cards.linaro.org/browse/TCWG-13 - follow it up
- Get spec2000 results for http://cards.linaro.org/browse/TCWG-14 to
decide on the next step
- Look for improvement in VRP for zero/sign extension
== Progress ==
* Running around JM/lencode bug
- caused by a codegen opt (ICMP fold) that had repercussions only on A9
and A15 code generation
- spent three days trying to reduce the case when the problem fixed itself
miraculously >:(
* Planning for the future
- Agreeing on short-term plans for Q2
* Buildbot
- Working on self-hosting bot
- Moved local buildmaster to hackbox
* Investigating LLVMLinux
- Building Android kernel with LLVM
- Investigating breakage in Debug mode
== Plans ==
* Continue self-hosting bot
* Try running a CBuild benchmark with LLVM
* Start putting together the infrastructure for release 3.3
* Try to extract useful information from perf database
Progress:
* qemu maintenance
** sent arm-devs and target-arm pullreqs now we're in softfreeze
* VIRT-4
** received Arndale board, confirmed it works (took several
hours mostly due to bonkers power switch design)
** VIRT-49
*** making progress; updated card with a list of sub-subtasks
Plans:
* keep pushing on with VIRT-49
* finish config of arndale board, test running KVM on it
* book travel/hotel for Connect Dublin
* office move Fri 26/Mon 29
-- PMM
All,
http://cbuild.validation.linaro.org/helpers/recent was reporting an Internal
Server Error earlier today. After looking at the logs, the cause was that
the gcc-4.8+svn198079 Lava job
(https://validation.linaro.org/lava-server/scheduler/job/52224) decided that
it was an a9 (as opposed to a9hf) job and that it didn't know which version of
Ubuntu it was running on.
This caused the logs to be put in
http://cbuild.validation.linaro.org/build/gcc-4.8+svn198079/logs/armv7l--cb…
The tcwg-web app then fell over because it couldn't parse the
armv7l--cbuild-panda-es06-cortexa9r1 name.
I fixed the issue by manually renaming the build log directory to:
http://cbuild.validation.linaro.org/build/gcc-4.8+svn198079/logs/armv7l-pre…
Once the cron job which scans the builds had run, everything worked again.
Actions:
1. Paul - do you mind taking a look at the build and seeing what went wrong
- my initial cursory glance makes me believe it's the board having heat
issues causing random things to happen.
2. Paul & Matt - Looking at the code (and from something else Michael said
to me last week) I think having hostnames with '-' characters in them will
confuse the cbuild interface. I propose changing cbuild to do a s/-/_/g on
all the hostnames it reads as a workaround. I don't plan on changing actual
hostnames of boards. Paul is this going to cause a problem for you in Lava?
Thanks,
Matt
--
Matthew Gretton-Dann
Toolchain Working Group, Linaro
Hi,
Some time ago I had some problems linking my project libraries for
Android using the Linaro toolchain 4.7.1 which I reported to the
list:
http://lists.linaro.org/pipermail/linaro-toolchain/2012-June/002631.html
I ended up using the 4.6.x version of the compiler because
I could not find a fix and I did not get any hints from
the mailing list.
Now I really need to switch to 4.7 (for better C++11 support)
but I'm pretty much having the same issue with the 4.7.3 version:
E.g.
System/Logging/DroidLogger.cpp.o: requires unsupported dynamic reloc
R_ARM_REL32; recompile with -fPIC
/home/dev/android/android_linaro_toolchain_4.7/bin/../libexec/gcc/arm-linux-androideabi/4.7.3/real-ld:
error:
/home/marius/Development/ToolChains/Android/experimental_ndk/sources/cxx-stl/gnu-libstdc++/4.7.3/libs/armeabi-v7a/libsupc++.a(eh_globals.o):
requires unsupported dynamic reloc R_ARM_REL32; recompile with -fPIC
/home/dev/android/android_linaro_toolchain_4.7/bin/../libexec/gcc/arm-linux-androideabi/4.7.3/real-ld:
error: hidden symbol '__dso_handle' is not defined locally
/home/dev/android/android_linaro_toolchain_4.7/bin/../libexec/gcc/arm-linux-androideabi/4.7.3/real-ld:
error: hidden symbol '__dso_handle' is not defined locally
I'm using it with the Android NDK version r8e.
I have downloaded the prebuilt binaries:
android-toolchain 4.7 (ICS, JB) <http://www.linaro.org/downloads/> 4.7-2013.03
13.03
(Linaro GCC 4.7-2013.03) 4.7.3 20130226 (prerelease)
Does anyone have any hints on how to overcome the above mentioned problem?
--
Marius Cetateanu | Software Engineer
T +32 2 888 42 60
F +32 2 647 48 55
E mce(a)softkinetic.com
YT www.youtube.com/softkinetic
SK Logo <www.softkinetic.com>
Boulevard de la Plaine 15, 1050, B-Brussels, Belgium
Registration No: RPM/RPR Brussels 0811 784 189
Our e-mail communication disclaimers & liability are available at:
www.softkinetic.com/disclaimer.aspx
Hi Matt,
This week I found an error in LLVM that can only be reproduced on ARM
hardware, if the GCC that compiles it specifies -mcpu=cortex-a15. The
error is a segfault on one of the tests compiled by that Clang/LLVM. As you
can see, it's not something trivial that we'd expect people to do easily.
My question is: How do we tackle this type of bug?
One option is to do all the debugging and interface with the original patch
author until we fix the problem. This works well for simple bugs, but in
this case I'm not sure it will work.
Another option was to have a board on Linaro's DMZ with no access to
anything else internally, so that people could log in and debug the problem
in situ.
This board would have to be setup just for the debugging with a random
password given only to the author of the patch and cleaned up right after
the bug is fixed (to avoid external abuse).
We could use the same board for LLVM, GCC, GDB, etc. but it should be easy
to re-flash it to a minimum system, so that we don't spend too much time
setting it up.
Does anyone have a better idea?
cheers,
--renato
Hi,
Feel free to point me at a newer toolchain. Was building the SNU
OpenCL SDK native on my chromebook running ubuntu raring when I hit
the following:
make: Entering directory `/home/tgall/opencl/SNU/src/runtime/build/cpu'
arm-linux-gnueabihf-g++ -fsigned-char -march=armv7-a -mfloat-abi=hard
-mfpu=neon -ftree-vectorize -ftree-vectorizer-verbose=0 -fsigned-char
-fPIC -DDEF_INCLUDE_ARM -g -c -o smoothstep.o
/home/tgall/opencl/SNU/src/runtime/hal/device/cpu/common/smoothstep.c
-I/home/tgall/opencl/SNU/inc
-I/home/tgall/opencl/SNU/src/runtime/hal/device/cpu
-I/home/tgall/opencl/SNU/src/runtime/hal/device/cpu/async
-I/home/tgall/opencl/SNU/src/runtime/hal/device/cpu/atomic
-I/home/tgall/opencl/SNU/src/runtime/hal/device/cpu/common
-I/home/tgall/opencl/SNU/src/runtime/hal/device/cpu/conversion
-I/home/tgall/opencl/SNU/src/runtime/hal/device/cpu/geometric
-I/home/tgall/opencl/SNU/src/runtime/hal/device/cpu/integer
-I/home/tgall/opencl/SNU/src/runtime/hal/device/cpu/math
-I/home/tgall/opencl/SNU/src/runtime/hal/device/cpu/reinterpreting
-I/home/tgall/opencl/SNU/src/runtime/hal/device/cpu/relational
-I/home/tgall/opencl/SNU/src/runtime/hal/device/cpu/vector -O0 -g
In file included from
/home/tgall/opencl/SNU/src/runtime/hal/device/cpu/cl_cpu_ops.h:47:0,
from
/home/tgall/opencl/SNU/src/runtime/hal/device/cpu/common/smoothstep.c:34:
/home/tgall/opencl/SNU/src/runtime/hal/device/cpu/type/cl_ops_floatn.h:
In function 'float2 operator-(float, float2)':
/home/tgall/opencl/SNU/src/runtime/hal/device/cpu/type/cl_ops_floatn.h:114:1:
internal compiler error: output_operand: invalid operand for code 'P'
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-4.7/README.Bugs> for instructions.
Preprocessed source stored into /tmp/cciluYVq.out file, please attach
this to your bugreport.
Traceback (most recent call last):
File "/usr/share/apport/gcc_ice_hook", line 34, in <module>
pr.write(open(apport.fileutils.make_report_path(pr), 'w'))
File "/usr/lib/python2.7/dist-packages/problem_report.py", line 254, in write
self._assert_bin_mode(file)
File "/usr/lib/python2.7/dist-packages/problem_report.py", line 632,
in _assert_bin_mode
assert (type(file) == BytesIO or 'b' in file.mode), 'file stream
must be in binary mode'
AssertionError: file stream must be in binary mode
make: *** [smoothstep.o] Error 1
tgall@miranda:~/opencl/SNU$ arm-linux-gnueabihf-g++ --version
arm-linux-gnueabihf-g++ (Ubuntu/Linaro 4.7.2-23ubuntu2) 4.7.3
I've attached the preprocessed source as well.
FWIW, I don't hit this when building with -O3. In this case I was
compiling for debug.
Thanks!
--
Regards,
Tom
"Where's the kaboom!? There was supposed to be an earth-shattering
kaboom!" Marvin Martian
Tech Lead, Graphics Working Group | Linaro.org │ Open source software
for ARM SoCs
w) tom.gall att linaro.org
h) tom_gall att mac.com
== Progress ==
* Disable-peeling:
- some benchmarking jobs ran, thanks to the new boards in cbuild.
- Spawned other jobs for reference
* Libsanitizer:
- Built native GCC on snowball to understand the isatty() behaviour
compared to qemu.
* Neon intrinsics codegen:
- resumed looking at the vuzp crc example from Steve Capper.
* Neon intrinsics testsuite:
- added fp16 support (works with RVCT, GCC does not support it yet)
* Internal support
== Next ==
* Disable-peeling: analyze results when available.
* Revert-coalesce-vars: idem.
* Libsanitizer: analyze isatty() on board
* Neon intrinsics: continue with vuzp example.
One day off this week.
== Issues ==
* None
== Progress ==
* Linaro GCC 4.8, 4.7 and 4.6 2013.04 released (with CL and MG)
* Boehm-gc AArch64 support backport in GCC:
- support committed.
* Libunwind AArch64 support:
- Resumed ongoing work.
== Plan ==
* Libunwind AArch64 support:
- Fix and submit upstream
== Progress ==
* Started investigating dwarf test suite failures on ARM.
* Created a new comparison between arm native gdb and arm remote gdb.
* More investigation of test cases to figure out causes of test cases that
are not run or unsupported.
* Experimentation with screens on gateway to speed up remote debugging;
didn't work.
* 1:1 with Matt.
* Took a day off on Friday 12th April 2013 for car checkup in workshop.
* Received Invitation letter from Arwen on Friday which means now visa
application only needs hotel booking information.
*** Still No blue-print available to log work in JIRA.
== Plan ==
* Setup arm gdb to debug itself and debug dwarf problems on arm.
* Fill up the arm native vs arm remote comparison sheet.
* Submit Ireland visa application after receiving invitation letter and
hotel booking details.
* Setup screens for remote testing using toolchain cbuild infrastructure.
== Progress ==
* gc sections tests
Completed upstreaming of the gc-sections patches after addressing review comments.
Closed the card associated with the work.
http://cards.linaro.org/browse/TCWG-27
* gprof support work for Aarch64
Read gprof internal documents from sourceware.org.
Working on GCC side to support gprof for Aarch64.
Misc
------
11-4-2013 was a local holiday.
== Plan ==
* Continue gprof support work for Aarch64
* Attend internal team meetings on 16 and 17th.
== Summary ==
- http://cards.linaro.org/browse/TCWG-14
Coremark ARM mode gives about 2% performance improvement with about 1%
code size reduction. Thumb2 mode however has performance regression even
though code size reduces about 0.6%. Performance regression here is like
what we are seeing in EPILOGUE_USES
changes (http://cards.linaro.org/browse/TCWG-13). Spawned spec in CBUILD
to see the impact with spec2000.
- http://cards.linaro.org/browse/TCWG-13
Thumb2 mode performance regression is due to the percentage of time
spent in core_state_transition. Looks like an alignment issue; same asm
is generated for this function with the patch. Investigating it.
== Plan ==
- Plan to resolve http://cards.linaro.org/browse/TCWG-13 this week.
- Get spec2000 results for http://cards.linaro.org/browse/TCWG-14 to
decide on the next step
== Progress ==
* Setting up Chromebook with Ubuntu 13.04.
* Developing patch to integrate new memcpy into glibc with IFUNC.
* Debugging and submitting a patch for linker issue with IFUNC.
== Issues ==
* None.
== Plan ==
* Get newlib memcpy patch accepted.
* Follow up memcpy in bionic.
* Submit memcpy IFUNC patch to glibc list.
* Get binutils tests into cbuild.
--
Will Newton
Toolchain Working Group, Linaro
Progress:
* qemu maintenance
** rebased qemu-linaro again
** preparing for upstream softfreeze on 15th
** review virtio patches and anything else that needs
attention pre-freeze
** scan of buglist; provided analysis of problem for LP:1090038,
closed a few stale bugs
** fixed a bug in an edgecase in fused multiply-accumulate emulation
* VIRT-4 [Guest migration support for KVM]
** VIRT-51
*** patches committed upstream, work item complete
** VIRT-73
*** updated Juan's patches to use VMState for ARM CPU migration,
fixed a few bugs noted along the way, submitted upstream
[work item now just pending review & commit]
Plans:
* qemu maintenance
* VIRT-4
** VIRT-49
-- PMM
The Linaro Toolchain Working Group is pleased to announce the 2013.04
release of Linaro GCC 4.8, Linaro GCC 4.7 and Linaro GCC 4.6.
Linaro GCC 4.8 2013.04 is the first release in the 4.8 series. Based off the
latest GCC 4.8.0+svn197294 release, it includes performance improvements and
bug fixes.
Interesting changes include:
* Our first 4.8 based release
* Updates to GCC 4.8.0+svn197294
* Initial optimized support for Cortex-A53 for arm*-*-* targets
* Improved support for new ARMv8-A instructions for arm*-*-* and
aarch64*-*-* targets.
* Backport of optimizations concerning whether to use Neon for 64-bit
bitops for arm*-*-* targets.
Linaro GCC 4.7 2013.04 is the thirteenth and last development release in the
4.7 series before entering maintenance. Based off the latest GCC 4.7.2+svn197188
release, it includes ARM-focused performance improvements and bug fixes.
Interesting changes include:
* Updates to GCC 4.7.2+svn197188
* Includes arm/aarch64-4.7-branch up to svn revision 196381
* Backport vectorizer cost model
* Turn off 64-bit Bitops in Neon
Linaro GCC 4.6 2013.04 is the 26th release in the 4.6 series. Based
off the latest GCC 4.6.3+svn197511 release, this is the thirteenth
release after entering maintenance and the last regular one.
Interesting changes include:
* Updates to 4.6.3+svn197511
The source tarballs are available from:
https://launchpad.net/gcc-linaro/+milestone/4.8-2013.04
https://launchpad.net/gcc-linaro/+milestone/4.7-2013.04
https://launchpad.net/gcc-linaro/+milestone/4.6-2013.04
Downloads are available from the Linaro GCC page on Launchpad:
https://launchpad.net/gcc-linaro
More information on the features and issues are available from the
release pages:
https://launchpad.net/gcc-linaro/4.8/4.8-2013.04
https://launchpad.net/gcc-linaro/4.7/4.7-2013.04
https://launchpad.net/gcc-linaro/4.6/4.6-2013.04
Mailing list: http://lists.linaro.org/mailman/listinfo/linaro-toolchain
Bugs: https://bugs.launchpad.net/gcc-linaro/
Questions? https://ask.linaro.org/
Interested in commercial support? Inquire at support(a)linaro.org
Hi all,
Currently the binutils job that gets run via cbuild configures with
--enable-gold. I guess this could be useful to ensure the gold build
is not broken, but has the downside of slowing down the build and
causing make check to fail.
I propose we do not enable gold until such a time as we wish to
formally support gold and fix the broken make check.
Does anybody have any objections to doing that?
Thanks,
--
Will Newton
Toolchain Working Group, Linaro
== Progress ==
* Investigated gdb test cases that are failing on
arm-remote gdbserver configuration.
-- Fixed some failures by updating host cross compiler version
-- Fixed some failures by fixing an environment issue where it was not being loaded
properly.
-- All test cases that need to build a shared library and transfer it to
remote target FAIL due to problems which seem like DejaGnu limitations.
* Ran GDB test suite on x86_64 remote gdbserver configuration and compared
performance of same configuration on arm.
* Got most documentation ready for Ireland Visa application, still waiting
on invitation letter and hotel booking details.
*** Still No blue-print available to log work in JIRA.
== Plan ==
* Investigate compiler version specific and general failures on arm remote
gdbserver configuration.
* Start on investigation/fixing of arm specific failures in gdb test suite
results.
* Submit Ireland visa application after receiving invitation letter and
hotel booking details.
* Planned Holiday
-- Planned Day off on Friday 12th April 2013 for car checkup in workshop.
== Summary ==
- benchmarking coremark with VRP based extension elimination
* extension elimination in some cases affecting other optimizations
* With this, improvements are marginal (details below)
== Plan ==
- study crc where extension elimination is resulting in bad code
- Find a solution
==Details==
If the RHS expression value of a gimple assignment statement fits in the
LHS type, truncation is redundant. Zero/sign extensions are redundant
in this case, and the rtl statement can be replaced as follows,
from:
(insn 12 11 0 (set (reg:SI 110 [ D.4128 ])
(zero_extend:SI (subreg:HI (reg:SI 117) 0))) c5.c:8 -1
(nil))
to:
(insn 12 11 0 (set (subreg/s/u:HI (reg:SI 110 [ D.4128 ]) 0)
(subreg:HI (reg:SI 117) 0)) c5.c:8 -1
(nil))
With this change, for the following case:
short unPack( unsigned char c )
{
/* Only want lower four bit nibble */
c = c & (unsigned char)0x0F ;
if( c > 7 ) {
/* Negative nibble */
return( ( short )( c - 16 ) ) ;
}
else
{
/* positive nibble */
return( ( short )c ) ;
}
}
asm without elimination
unPack:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
and r0, r0, #15
cmp r0, #7
subhi r0, r0, #16
uxthhi r0, r0
sxth r0, r0
bx lr
.size
asm with elimination
unPack:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
and r0, r0, #15
cmp r0, #7
subhi r0, r0, #16
sxth r0, r0
bx lr
In some cases, the changed rtl statement is not eliminated by later passes
and is generated as a mov instruction. Worse, it also seems to affect
the other optimization passes, resulting in worse code for crc. The
cause has not been found yet.
== Progress ==
* gc sections tests
Completed adding gc-section test cases.
1. TLS and GOT tests.
http://sourceware.org/ml/binutils/2013-03/msg00273.html
2. PLT tests.
http://sourceware.org/ml/binutils/2013-03/msg00273.html
* Evaluate gprof work.
There is one machine-dependent function hook, "find_call", which is
currently written per architecture (e.g. i386.c, sparc).
Then there are changes in configure.in to add architecture details.
* 1-1 with Matt.
== Plan ==
Continue gprof support work for Aarch64
== Progress ==
* Buildbots
- All ARM build-bots GREEN! Hurray!!! :D
- Improving sort comparison functions makes output reproducible on all
machines
- Fixed fpcmp, which fixes sqlite3 report
- A bad commit broke lots of tests, reverted
- Adding a self-hosting check-all Pandaboard
- In theory, it works, but config is not perfect yet
* Arndale
- Suse image is not stable enough, installed Linaro (325)
- Fathi got GCC bootstrapping with it, should work
- Installed a heat-sink on the board, should reduce problems
- Still got network issues, may be the mac address
- Someone could liaise with Dave Pigott to get a new MAC on the DHCP
* EuroLLVM 2013
- Badges, finishing touches
== Plan ==
* Holidays next week, then
* Check with Galina about Beagle bots
* Finish Panda self-host + test-suite A9 bot
* Gather info and hardware for 3.3 release tests
* Plan for the future!
== Progress ==
* Short week (2 days)
* Tested memcpy on big endian.
* Updated newlib memcpy patch.
* More digging into glibc IFUNC.
== Issues ==
* None.
== Plan ==
* Try and get newlib memcpy patch accepted.
* IFUNC...
--
Will Newton
Toolchain Working Group, Linaro
== Progress ==
* Disable-peeling:
- Still waiting for the results from cbuild. Because of many merges
(pre-release week), there are a lot of jobs in the queue :-(
* Libsanitizer:
- tried to understand why isatty(2) returns true when executing the
testsuite via qemu.
* Neon intrinsics codegen:
- benchmarking in bare-machine shows that GCC is actually ~5% faster
than RVCT on this sample codec.
- to be further discussed internally
* Neon intrinsics testsuite:
- looking at FP16 support.
* Turnoff 64bits ops in Neon:
- backport in 4.8 done by Matt during the release merges.
* Arndale: a few attempts to use it, but it's still unstable.
* Internal support
== Next ==
* Disable-peeling: Analyze bench results when available.
* Revert-coalesce-vars: Analyze bench results when available.
* Libsanitizer: continue isatty(2) study wrt qemu/expect.
* Neon intrinsics: check internally for other codecs