=== Previous ===
libvpx NEON intrinsics investigation [TCWG-429 14/10]
. if interested, you can read the report in card 429:
https://cards.linaro.org/browse/TCWG-429?focusedCommentId=22247
tested division optimisation patches against current trunk and pinged
on list [2/10]
good friday bank holiday [2/10]
=== Next ===
Resume looking at NEON scheduling TCWG-135, which ties into libvpx
Upgrade laptop to Ubuntu 14.04 LTS
== Progress ==
* GDB reverse debugging on aarch64
-- Scoured a64 instruction set details. [1/10]
-- Completed testing and implementation of all basic structures.
[TCWG-398] [1/10]
-- Started decoding aarch64 load store instructions for recording.
[TCWG-401] [2/10]
-- Completed decoding of aarch64 branch instructions. [TCWG-400] [2/10]
-- Scoured the a32 instruction set and researched a32 recording.
[2/10]
* Miscellaneous [2/10]
-- UK visa application process
-- Tools cauldron and TCWG sprint bookings etc.
-- Meetings
== Plan ==
* GDB reverse debugging on aarch64
-- Complete decoding of aarch64 load store instructions. [TCWG-401]
-- Complete decoding of aarch64 exception and system instructions.
[TCWG-400]
== Progress ==
PGO - AArch64 (TCWG-179) (4/10)
* Completed SPEC runs -O3 -mcpu=cortex-a57 in chroot + qemu
arm64-saucy with Linaro release source (4.8) march 2014.
* Built Linaro 4.9 GCC branch based tool chain under chroot + qemu arm64-saucy
* Created a PGO config file for aarch64 as a peak configuration and used --tune=peak.
SPEC warned that FDO cannot be used in base flags and that it will
ignore the flags.
Also, setting basepeak = yes gave issues and ran base runs, while
setting it to no does both base and peak runs.
* Started SPEC2006 runs for PGO under chroot + qemu arm64-saucy
* Perlbench train run failed in qemu.
* Other benchmarks are progressing.
GLIBC Systemtap (2/10)
* Ran glibc make check under qemu. Saw some illegal instructions, and
nptl tests hung in qemu aarch64. Will has completed these tests on
hardware.
Misc (2/10)
* Tested cbuildv2 to build gcc natively under qemu-arm64-chroot-ubuntu-saucy.
git gave issues not able to clone and hangs when downloading
binutils and gdb repos.
* 17th was Public Holiday (2/10)
== Plan ==
* Go through the results for SPEC2006 PGO runs.
* Start functional PGO runs for SPEC2000
* Bug fixing.
== Issues ==
* Toolchain64 disk full
== Progress ==
* Linaro GCC 4.9 2014.04 (3/10)
- Created FSF Linaro 4.9 branch
- Testing FSF 4.9.0 RC
* Launchpad bugs: (3/10)
o LP #1169164 : including signal.h exposes various PSR_MODE #defines
- Two possible ways to fix the issue (to be discussed with maintainers)
* Misc:
o Cbuildv1 baby-sitting (1/10)
o Various meetings (1/10).
== Next ==
- Easter Monday off
- Continue on LP #1169164
- Linaro GCC 4.9 release (when FSF release will be made)
== Week of April 14th ==
- Benchmarked autoprefetcher patches. (TCWG-388, 2/10)
- Continued working on spec2xxx-utils scripts. These are now gaining support for AArch64 and SPEC2006. (TCWG-238, 4/10)
- Various discussions. (2/10)
- Short week due to Good Friday / Easter. (2/10)
== Week of April 21st ==
- Take Tyler B.'s LAVA test definition for SPEC benchmarking and adjust spec2xxx-utils scripts to match LAVA environment.
- Investigate regressions on gap, eon and wupwise from autoprefetcher patches.
-- SPEC2000's gap in particular is sensitive to 1st scheduling pass with register pressure, and, apparently has a 5% performance improvement potential (at -O2 Cortex-A15).
- Various discussions.
- Even shorter week due to Easter Monday and ANZAC Day on Friday. (4/10)
--
Maxim Kuvyrkov
www.linaro.org
== Progress ==
* TCWG-238 (4/10)
- Created scripts and spectools for spec2006 to work with cbuild2
* TCWG-413 Spec2006 (4/10)
- Worked out Spec2006 config and src.alt for v1.1 src we have to work
correctly
- Got it to work natively with Maxim's scripts
- Did a trial run in apm openembedded; Needs ccrypt, tar with xz
compression support and system binutils with -mabi=ilp64 support
* 18th is Public Holiday (2/10)
== Plan ==
Benchmarking and FENV support
== Progress ==
* Attempted to get qemu-aarch64 working, fails on saucy, works on
precise. Did native build, hung during 'make check'. Setup
qemu-aarch64 chroot on all TCWG x86_64 build farm
machines. (4/10)
* Tracked down problem that's kept TCWG Jenkins from
working. (1/10)
* Tried Kugan's benchmarking branch of cbuildv2. (1/10)
* More work on Neon intrinsics tests. (TCWG-322 - 1/10)
* Meetings and Misc (3/10)
- Worked on benchmarking doc with Ryan.
== Plan ==
* Get Jenkins working with matrix builds so we can utilize all LAVA
slaves.
* More experimenting with Kugan's benchmarking branch.
== Issue ==
* None.
== Progress ==
* multilib can not work with multiarch anymore, so change Linaro
crosstool-ng to make 2014.04 release (2/10).
- For 4.8 release, we will revert the change and keep it aligned with
previous releases.
* Linaro 4.9 toolchain binaries pre-releases (Cross & native) for
aarch64 (2/10):
http://cbuild.validation.linaro.org/binaries/4.9-prerelease-2014.04.
* Try to set up native spec2006 tests on Chromebook. But got "out of
memory" when building the tools (1/10).
* Clean up and test for shrink-wrap enhancement (TCWG-133: 2/10)
* Rebase the aarch64 fcsel patch and testing. (1/10)
* ARM internal work (2/10)
== Plans ==
* Send the shrink-wrap related patches for review.
* PING the pending patches.
== Planned leaves ==
* April 21: Team event.
* May 1-3: Labour day holiday.
== This week ==
- Resolved Launchpad bug 1305042 as user error and provided proper asm
coding [TCWG-290] [2/10]
- Investigated infinite loop bug [TCWG-290][6/10]
- Determined the infinite loop is occurring in function
vt_find_locations in the variable tracking pass
- Still debugging to determine cause of infinite loop
- Absent due to illness on April 14th
== Next week ==
Vacation
== Future ==
Getting the following cross compiling for RPi on Debian Wheezy x64 using
the gcc-linaro-arm-linux-gnueabihf-raspbian toolchain within Eclipse
Kepler. Compilation works fine using native Debian x64 GCC and native
GCC compiler on the RPi.
out of dynamic memory in yy_create_buffer()
collect2: error: ld returned 2 exit status
make: * [cwebsocket] Error 1
I'm stumped. Can't find much info on the issue. Any ideas?
== Progress ==
* One bank holiday, half day childcare (3/10)
* Submit two iterations of malloc microbenchmark for glibc (4/10, TCWG-160)
* Submitted a patch for aarch64 ld bug with SystemTap notes (1/10)
* Investigated strcmp implementations for ARM (1/10, TCWG-153)
* Patch review, bugs, support, etc. (1/10)
== Issues ==
* None
== Plan ==
* malloc benchmarking/implementation work
--
Will Newton
Toolchain Working Group, Linaro
== Progress ==
TCWG-156 cortex-strings memset (3/10)
* Got a full set of benchmarks (for my 2 targets)
* Cleaned up code
* Sped up small memsets (for A9, A15 results pending)
glibc performance bug in lowlevellock.c (1/10 - I'll make a card next week)
* Learned to build and test glibc
* Understood the bug and worked out how to fix
Misc
* Meetings/mail (1/10)
* Infrastructure/workflow fiddling (1/10)
* 2 days holiday
== Plan ==
More TCWG-156
Finish off glibc performance bug
Public holidays Friday & Monday
== Week of April 7th ==
- Made a how-to wiki page/script on how to use QEMU and schroot for system emulation (TCWG-179, 2/10).
-- This allows you to get armhf or armel or aarch64 Ubuntu or Debian system within minutes, e.g., for PGO or LTO bootstraps of GCC.
-- https://collaborate.linaro.org/display/TCWG/ARM+ubuntu+chroot+with+QEMU+use…
-- I can move this page into the public wiki if there is enough interest from non-Linaro people.
- Fixed various bugs in instruction scheduling patch set (TCWG-388, 5/10).
- Scripted running SPEC2000 for reproducible benchmarking (TCWG-238, 3/10).
-- The scripts are now at http://git.linaro.org/toolchain/spec2xxx-utils.git
-- Scripts support both local (native) and remote (cross) benchmarking.
== Week of April 14th ==
- Document spec2xxx-utils scripts and see how they fare in LAVA environment. Add support for SPEC2006. (TCWG-238)
- Post autoprefetcher patch set upstream (TCWG-388)
--
Maxim Kuvyrkov
www.linaro.org
== Week of March 31st ==
- Investigated and scripted how to reproducibly run benchmarks (TCWG-238, 7/10)
-- Stop all services on target board
-- Use local disk
-- Stop networking
-- Bind benchmark to a specific CPU. Bind all other processes to a different CPU.
-- Disable frequency scaling.
- Various discussions (3/10)
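
The isolation steps listed above (one CPU for the benchmark, all other processes on different CPUs, frequency scaling off) can be sketched roughly as follows. This is only an illustration and not the actual TCWG-238 scripts: the helper names are hypothetical, the choice of CPU 1 is arbitrary, the sysfs path is the usual Linux cpufreq location and may be absent on a given board, and writing "performance" to the governor is one assumed way of avoiding frequency changes.

```python
import os


def split_cpus(all_cpus, bench_cpu):
    """Return (CPU set for the benchmark, CPU set for everything else)."""
    others = set(all_cpus) - {bench_cpu}
    if bench_cpu not in all_cpus or not others:
        raise ValueError("need the benchmark CPU plus at least one other CPU")
    return {bench_cpu}, others


def pin_benchmark_command(bench_cpu, argv):
    """Prefix a command with taskset so it only runs on bench_cpu."""
    return ["taskset", "-c", str(bench_cpu)] + list(argv)


def pin_current_process(cpus):
    """Restrict the calling process to the given CPUs (Linux-only call)."""
    os.sched_setaffinity(0, set(cpus))


def governor_path(cpu):
    """Assumed sysfs file that selects the frequency governor of one CPU."""
    return "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_governor" % cpu


if __name__ == "__main__":
    # Example board with four CPUs; nothing is executed or written here.
    bench, rest = split_cpus([0, 1, 2, 3], 1)
    print("benchmark CPUs:", sorted(bench), "other CPUs:", sorted(rest))
    print("run:", " ".join(pin_benchmark_command(1, ["./benchmark"])))
    for cpu in sorted(bench | rest):
        print("write 'performance' to", governor_path(cpu))
```

The demo only prints what would be done, so it can be run on a development machine without root access before trying it on a target board.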
--
Maxim Kuvyrkov
www.linaro.org
PGO - AArch64 (TCWG-179) (4/10)
* Installed QEMU user static for aarch64 and use them from
chroot environment in Ubuntu 13.10
* Tried installing chroot + qemu arm64-saucy in another ubuntu 12.04
machine. Issue with binfmts. looks to be corrupted.
* Bootstrap GCC with PGO with --enable-languages=c,c++,fortran
completed under arm64-chroot.
* Built GCC and binutils in native mode under chroot + qemu arm64-saucy .
* SPEC runs -O3 -mcpu=cortex-a57 in chroot + qemu arm64-saucy with
linaro release source (4.8).
Calculix did not finish under chroot + qemu arm64-saucy
GLIBC Systemtap (4/10)
* Built systemtap under chroot + qemu arm64-saucy. Built glibc with
the patch and used --enable-systemtap.
* Able to see setjmp probe in gdb, but longjmp probes not loading.
Tried debugging in gdb. gdb debugging support not available yet under
qemu.
* Discussed my observation with Will Newton.
Bug fix (1/10)
* Not much progress; continued looking at reload dumps.
Misc (1/10)
* Tested cbuildv2 to build gcc natively under qemu-arm64-chroot-ubuntu-saucy.
cbuildv2 reports configure error for gcc and g++multilibs. But they
are not needed for native builds. Sent note to Rob for fixing it.
* Attend 1-1 with Ryan
* Attend 1-1 with Maxim discuss PGO work.
== Plan ==
PGO runs for SPEC2006 on chroot + qemu arm64-saucy
Upstream /testing for systemtap probes in glibc
Reload bug fix.
Leave: 17-Apr local state election.
== Progress ==
* GDB reverse debugging on aarch64 [6/10] [TCWG-398]
-- Setup development tree and debug environment.
-- Verified generic reverse debugging implementation on aarch64.
-- Implemented main function stubs to catch instruction for decoding
and recording.
* Work on gdb testing utility [TCWG-96] [1/10]
-- Some tweaks to scripts and wiki updates.
* Experimented gdb testing/debugging with Aarch64 QEMU + Chroot [3/10]
== Plan ==
* GDB reverse debugging on aarch64 [TCWG-398]
-- Implement main instruction decoding handler and write function
stubs for different types of instructions.
-- Dig deeper into aarch32 for handling it through armv7
implementation if possible.
== Progress ==
TCWG-156 cortex-strings memset (5/10)
* Fixed a couple of bugs in the no-VFP case
* Ran benchmarks, discarded broken benchmarks, ran more benchmarks
* Explored benchmarking scripts
* Took a hard look at the memset tests
* Experimented with ARM-internal cortex-strings benchmark (not useful right now)
Misc
* Meetings (1/10)
* Kicked an A15 target until it worked (1/10)
* 1/2 day holiday
== Plan ==
More TCWG-156
Maxim's glibc starter issue
Holiday Tuesday
Public Holidays Friday & next Monday
1 day off (Child care)
== Issues ==
* Toolchain64 disk full
== Progress ==
* Released Linaro GCC 4.7 and 4.8 2014.04 (5/10)
* Launchpad bugs: (1/10)
o LP #1169164 : including signal.h exposes various PSR_MODE #defines
- rebased and rework patch
* Misc:
o Cbuildv1 baby-sitting (1/10)
o Various meetings (1/10).
== Next ==
- Continue on LP #1169164
- Linaro GCC 4.9 release
== Issue ==
* None
== Progress ==
* 2 days off.
* Investigate lp:1304267 and close it as invalid.
* Investigate shrink-wrap bootstrap issues [4/10, TCWG-133]
- Re-implement the copy propagate part by referring cprop_hardreg pass.
- Other tests are ongoing.
* Document cross-native build on wiki:
https://wiki.linaro.org/WorkingGroups/ToolChain/cross-native
== Plans ==
* Send the shrink-wrap related patches for review.
* PING the pending patches.
== Planned leaves ==
* April 21: Team event.
* May 1-3: Labour day holiday.
== Progress ==
* TCWG-413 Spec2006 (6/10)
- Setup chroot for aarch64
- Created rootfs with 4.8/trunk and spec2006
- booted created rootfs on foundation model with ubuntu kernel
* TCWG-291 CRC (3/10)
- posted vrp patch upstream
- with that seeing expected performance improvement
- analysing crc complete and up-streaming activities pending
* LP1301335 and PR59695 back-porting (1/10)
== Plan ==
Benchmarking and FENV support
Dear Sir,
Thanks for your page. It is really very helpful to us.
We are facing a problem during compiling GCC for our ARMv7-a Cortex-a9.
We are using following option:
1.
../gcc-linaro*/configure --disable-bootstrap --enable-languages=c,c++
--with-mode=thumb --with-arch=armv7-a --with-tune=cortex-a9
--with-float=hard --with-fpu=vpfv3-d16 --prefix=$home/gcc/gcc-linaro
2.
make -j`getconf _NPROCESSORS_ONLN`
After step 2 we get the following error:
checking whether putc_unlocked is declared... yes
checking whether getrlimit is declared... yes
checking whether setrlimit is declared... yes
checking whether getrusage is declared... yes
checking whether ldgetname is declared... no
checking whether times is declared... yes
checking whether sigaltstack is declared... yes
checking whether madvise is declared... yes
checking for struct tms... yes
checking for clock_t... yes
checking for F_SETLKW... yes
checking if mkdir takes one argument... no
Unknown CPU given in --with-arch=armv7-a.
make[1]: *** [configure-gcc] Error 1
make[1]: Leaving directory `/home/anwej/src/build'
make: *** [all] Error 2
Please suggest the solution. where is the problem and what will be our next
steps.
Thanks in advance.
-best regards
Anwej Alam
Ph: +91.995.833.3456
Hi,
The preprocessed file:
http://people.linaro.org/~rikuvoipio/qmltextgenerator.ii.gz
With compile command line:
g++ -save-temps -c -g -O2 -fstack-protector --param=ssp-buffer-size=4
-Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fvisibility=hidden
-fvisibility-inlines-hidden -Wall -W -D_REENTRANT -fPIC -o x.o
qmltextgenerator.ii
Will take apparently forever (north of 1600min) on a debian native gcc. -O
compiles instantly. Some bisecting for gcc optimization flags later,
disabling strict aliasing allows instant build as well:
g++ -save-temps -c -g -O2 -fno-strict-aliasing -fstack-protector
--param=ssp-buffer-size=4 -Wformat -Werror=format-security
-D_FORTIFY_SOURCE=2 -fvisibility=hidden -fvisibility-inlines-hidden -Wall
-W -D_REENTRANT -fPIC -o x.o qmltextgenerator.ii
Happens with:
gcc version 4.8.2 (Debian 4.8.2-19)
and
gcc version 4.9.0 20140405 (experimental) [trunk revision 209146] (Debian
20140405-1)
Linaro binary cross-compilers compile the file fine with -O2.
Do we do native gcc testing, or should it just be submitted to the upstream
bugzilla?
Riku
The Linaro Toolchain Working Group is pleased to announce the 2014.04
release of both Linaro GCC 4.8 and Linaro GCC 4.7.
As announced at Linaro Connect USA 2013 Linaro GCC moved to a pattern of
quarterly stable releases, with engineering releases in the intervening
months. This is the second stable release, and contains no known regressions
compared to the 2014.01 release.
The next stable release of GCC 4.8 will be the 2014.08 release. There will be
no engineering releases of GCC 4.8 until this release, as it enters in
maintenance.
No more releases of GCC 4.7 are planned.
Next month's release - 2014.05 - will be based off GCC 4.9 and be an
engineering build.
Linaro GCC 4.8 2014.04 is the thirteenth and last development release in the
4.8 series before entering maintenance. Based off the latest GCC 4.8.3+svn208968
release, it includes performance improvements and bug fixes.
Interesting changes include:
* Updates to GCC 4.8.3+svn208968
* Cortex-a53 support
* A fix for LP #1292489: Buggy vectorization of dot products
* A fix for LP #1268893: ICE when building kernel raid6 neon code
* A fix for LP #1273511: ICE APCS Frame & optimize-sibling-calls
Linaro GCC 4.7 2014.04 is the twenty third release in the 4.7 series. Based
off the latest GCC 4.7.4+svn209005 release, this is the tenth release after
entering maintenance and the final one.
Interesting changes include:
* Updates to GCC 4.7.4+svn209005
* A fix for LP #1129013: Internal compiler error in push_reload during
bootstrap stage 2
* A fix for LP #1292489: Buggy vectorization of dot products
* A fix for LP #1301335: Compiler segmentation fault while cross-compiling QT5
Webkit
The source tarball is available from:
http://releases.linaro.org/14.04/components/toolchain/gcc-linaro/4.8
http://releases.linaro.org/14.04/components/toolchain/gcc-linaro/4.7
Downloads are available from the Linaro Releases website:
http://www.linaro.org/downloads/
More information on the features and issues are available from the
release page:
https://launchpad.net/gcc-linaro/4.8/4.8-2014.04
https://launchpad.net/gcc-linaro/4.7/4.7-2014.04
Mailing list: http://lists.linaro.org/mailman/listinfo/linaro-toolchain
Bugs: https://bugs.launchpad.net/gcc-linaro/
Questions? https://ask.linaro.org/
Interested in commercial support? Inquire at support(a)linaro.org
I compiled my code with debug symbols on a BeagleBoneBlack using Debian
gcc-4.7. If I use objdump -S on my object file, I see both source lines
and disassembly. On my Ubuntu 13.10 host, using
gcc-linaro-arm-linux-gnueabihf-4.7-2013.04-20130415_linux, I do
arm-linux-gnueabihf-objdump -S on the same object file, I see disassembly
lines, but the source lines are not displayed. I'm attempting to debug my
code with a Lauterbach JTAG debugger, but no source code is available
which makes debugging very difficult.
Is there some compatibility issue here or am I doing something wrong?
Regards,
John
== Progress ==
* Work on gdb testing utility [TCWG-96] [8/10]
-- Support to run native and native-gdbserver test via ssh on remote
machines like arm-linux.
-- Bug fix and test gdb testing utility by running it in various configurations.
-- Update wiki page with testing utility and how to use it.
* Chromebook ubuntu re-install and fix network lag issue [2/10]
== Plan ==
* GDB reverse debugging on aarch64
-- Start implementation of reverse debugging infrastructure [TCWG-398]
-- Add support for running aarch64 gdb test suite in gdb testing utility.
== This week ==
a53 support [CARD-300][3/10]
- aarch64-none-elf target using cpu=cortex-a53 passed validation on
foundation model
- Resolved code review issues (formatting and unnecessary patches)
Backport 202663 - vectorizer bug passed validation and merge review for
4.7 and 4.8 [CARD-300][3/10]
GCC Bugzilla bug 60657 [TCWG-290][2/10]
- began fix by adding new conditions to pattern causing crash
- bug was fixed upstream by Jeffrey Law on April 4th
== Next week ==
- Transition from backports to bug fixing
- Create Wiki page for Aarch64 bug contingency bug fixes, feature and
performance improvements for partners
== Future ==
One week of vacation either the third or fourth week of April.
=== Progress ===
LP1296601 (ICE in push_minipool_fix) [5/10]
* completed a prototype fix
* submitted RFC patch to gcc-patches
* still awaiting review
PR60609 (Error: value of 256 too large for field of 1 bytes) [3/10]
* implemented fix and posted to gcc-patches
* approved, subject to further testing on Thumb-1
libvpx NEON assembler vs intrinsics performance investigation [2/10]
* looking at disassembly, code is not terribly aesthetically pleasing
* in some cases clang looks better
=== Plan ===
write up libvpx investigation
follow up/ping LP1296601
NEON scheduling TCWG-135
TCWG-156 (5/10)
* Hacked v7 memcpy into a memset
* Much fiddling with builds, targets
* Kicked off a benchmark run
Misc
* Meetings (1/10)
* Finding hardware/setting up working environments/figuring out workflows
(4/10)
== Issues ==
* none
== Progress ==
* Launchpad bugs:
o TCWG-422 : ICE in assign_by_spills building linux btrfs module (1/10)
- New failure after first fix reported.
- reduced new testcase.
- Fix committed by Vladimir as rev209038
o Backported "Internal compiler error in push_reload during
bootstrap stage 2" to GCC 4.7 (1/10)
- analysed validation results.
- re-spawned some jobs.
* Backports review: (5/10)
o cortex-a53 support backport:
- We are still not able to validate it on aarch64-linux-gnu target
with a compiler configured to default to cortex-a53, but no regressions
observed in the generic case and on bare_metal (with cortex-a53).
* Misc:
o Cbuildv1 baby-sitting (2/10)
- Toolchain64 disk was full.
o Various meetings (1/10).
== Next ==
- Mainly 4.7 and 4.8 April releases
- TCWG-413 Spec2006 (5/10)
- Analysed 456.hmmer
- In the process of opening performance bug reports
- Started looking at 453.povray
- TCWG-291 CRC (2/10)
- Not seeing performance improvement with redundant "and" instruction gone
- Analysing with perf to see the reason
- LP1301335 (3/10)
- SLP vectorizer ICEs for QT5 Webkit for Linaro 4.7
- Doesn’t occur in trunk/4.8/4.7 FSF
- Patch proposed for merge request which fixes
- I also see some FAIL -> PASS in the regression with this patch
- This patch is only relevant for Linaro 4.7 so we can’t/don’t need to
upstream it (?)
== Plan ==
Continue with Spec2006 and crc
4 day week 31-Oct local holiday
Bug fix (2/10)
* Looking at a register allocation issue with ARMv7 hard float. (3/10)
Tried changing the machine description pattern in the gcc 4.8 branch to match trunk.
The issue does not occur with trunk; the reason is that arm64 moved to lra.
With lra turned off, the bug occurs.
Trying to find out whether it is easy to fix in reload or better to wait for the LRA backport.
PGO - AArch64 (TCWG-179) (3/10)
* Native CPU2006 runs on V8 foundation model.
SPEC runs -O3 -mcpu=cortex-a57. INT benchmark failures seen with mcf
and h264ref.
rest benchmarks running.
* Tried to use ubuntu saucy core image on V8 foundation model and
mount NFS is failing.
* Trying to install QEMU user static for aarch64 and use them from
chroot environment
GLIBC Systemtap (2/10)
* Respun the libc systemtap probe patch for glibc. Will Newton is
testing it on hardware.
Meetings (2/10)
* Attend 1-1 with Ryan discuss 2014 goal planning.
* Attend 1-1 with Maxim discuss PGO work.
== Progress ==
* Kernel (TCWG-417)
- Implementing named register global variables (D3261)
- Helping Milosz and Vinicius (LLVMLinux) to get a kernel ready
- First LLVM-compiled kernel booted on Versatile Express hardware
* Background
- Reviewing patches, etc.
- Apple merged their ARM64 back-end, fiddling bots
- Making the new TableGen docs official
- Jira farming
- Became code owner for the ARM Linux support
- LLVM Foundation announced
- Trying to run SPEC on AArch64
* Time
- CARD-124 6/10
- Others 4/10
== Plan ==
* Holiday for two and a half weeks
* Follow up the named register patch
== Progress ==
* glibc patch review (2/10)
* Helping out with aarch64 glibc setjmp/longjmp Systemtap probes testing (1/10)
* Investigated and submitted patch for gas ARM alignment issue (3/10)
* Committed library and script for malloc logging (1/10, TCWG-423)
* Rebased and tidied up malloc microbenchmark (2/10, TCWG-160)
* Various small binutils and glibc patches (1/10)
== Issues ==
* None
== Plan ==
* Submit patch for glibc malloc microbenchmark
--
Will Newton
Toolchain Working Group, Linaro
Hi all,
I've just filed a bug on glibc I'd love you to take a look at:
https://sourceware.org/bugzilla/show_bug.cgi?id=16796
Here's the description to save clicking:
Hi,
There is a test in glibc (tst-tls5) that tests that
((uintptr_t)pthread_self())%16 is zero. But watch this:
(t-mwhudson)mwhudson@am1:~$ cat btp.c
#include <stdint.h>
#include <stdio.h>
#include <pthread.h>
int
main(int argc, char** argv)
{
uintptr_t p = (uintptr_t)__builtin_thread_pointer();
uintptr_t q = (uintptr_t)pthread_self();
printf("p: %lx %ld\n", p, p%16);
printf("q: %lx %ld\n", q, q%16);
}
(t-mwhudson)mwhudson@am1:~$ gcc -o btp btp.c -lpthread
(t-mwhudson)mwhudson@am1:~$ ulimit -s unlimited
(t-mwhudson)mwhudson@am1:~$ ./btp
p: 2000028d88 8
q: 2000028698 8
(t-mwhudson)mwhudson@am1:~$ ulimit -S -s 8192
(t-mwhudson)mwhudson@am1:~$ ./btp
p: 7f7fd086f0 0
q: 7f7fd08000 0
So something is clearly wrong; maybe it's just that the test is too
strict, but somehow that seems a bit unlikely. FWIW, this doesn't
happen if you don't link with libpthread so maaaaybe it's a bug in
something that ends up in libpthread's .init section?
Cheers,
mwh
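
The remainders in the two runs above can be checked by hand: an address is 16-byte aligned exactly when its last hexadecimal digit is 0. A small sketch of that arithmetic, using the addresses copied from the transcript (the helper name is made up for illustration):

```python
def misalignment(addr, alignment=16):
    """Return addr modulo alignment; 0 means the address is aligned."""
    return addr % alignment


# Addresses printed by ./btp in the two runs of the report.
runs = {
    "ulimit -s unlimited": {"p": 0x2000028d88, "q": 0x2000028698},
    "ulimit -S -s 8192": {"p": 0x7f7fd086f0, "q": 0x7f7fd08000},
}

for label, values in runs.items():
    for name, addr in values.items():
        print("%s: %s = %#x, remainder %d" % (label, name, addr, misalignment(addr)))
```

Both values in the unlimited-stack run end in the hex digit 8, so each is off by 8 bytes, while both values in the 8192 kB run end in 0 and are aligned.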
== Week of March 24th ==
- STREAM regression (TCWG-388, 5/10)
-- Finished prototype patch. The patch adds modeling of ARM L2 auto-prefetcher hardware to GCC scheduler (the model is very simple as auto-prefetcher is very lightly documented). Half of the patch cleans up and improves GCC scheduler, and the other half implements the auto-prefetcher model.
-- While looking into ARM scheduling support noticed that ARM doesn't use multipass lookahead scheduling, which surprised me. Enabled it (multipass scheduling) in my patches.
- Looked into lll_timed_wait Glibc/uClibc bug upstream (1/10)
-- https://sourceware.org/ml/libc-alpha/2014-03/msg00905.html
- Various discussions and reviews (4/10)
== Week of March 30th ==
- STREAM regression (TCWG-388)
-- Benchmark patches on SPEC2k and find/confirm best values for tuning parameters:
--- dfa_lookahead: should normally be issue_rate-1.
--- L2 auto-prefetcher queue depth: new tuning knob.
-- Investigate any performance regressions from the patches.
- lll_timed_wait Glibc/uClibc bug
-- Make sure it is fixed upstream. Possibly backport to Linaro branches.
--
Maxim Kuvyrkov
www.linaro.org
== Issues ==
* none
== Progress ==
* Launchpad bugs:
o TCWG-422 : ICE in assign_by_spills building linux btrfs module (1/10)
- created blueprint for :
https://bugs.launchpad.net/gcc-linaro/+bug/1296676
- Reported upstream as :
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60650
- Reduced testcase
- Fix committed by Vladimir as rev208876
- still ICE when configured for arm-none-linux-gnueabihf
o Backported "Internal compiler error in push_reload during
bootstrap stage 2" to GCC 4.7 (1/10)
- https://bugs.launchpad.net/gcc-linaro/+bug/1129013
- some testsuite regressions observed. will investigate.
* Backports review: (5/10)
o reviewed backport for pr60264 and rev202663
o cortex-a53 support backport:
- Analysed testsuite regression
- 22K Loc patch under review
* LRA on AArch32:
o TCWG-345 : Analyse performance of LRA for ARM. (1/10)
- looked at the perf tool results
* Misc:
o Various meetings (1/10).
o Various support to team members (1/10)
o Cbuildv1 baby-sitting (Calxeda nodes have to be restarted after
each upgrade!)
== Next ==
- continue cortex-a53 review
- continue on backports.
- continue on TCWG-345.
== Progress ==
* Short Week
-- Monday Day Off: Pakistan Day 23rd March public holiday roll over. [2/10]
-- Short leave on Thursday [1/10]
* Work on gdb testing utility [TCWG-96] [7/10]
-- Writing a new gdb testing utility script that automates gdb
testing in various configurations and compares testsuite results.
-- Support for all kinds of native, native-gdbserver and
remote-gdbserver gdb configurations has been added.
-- Interactive and configuration file based user input modes have been added.
-- Checks of the testing configuration (host alive, ssh possible, board file
found) and the ability to build sources using user defined configure flags
have been added.
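
Comparing testsuite results, as mentioned above, can be sketched as below. This is an illustration only and not the TCWG-96 utility itself: it assumes DejaGnu-style summary lines of the form "PASS: name" or "FAIL: name", and the function names are hypothetical.

```python
# Result keywords that DejaGnu writes at the start of a summary line.
STATUSES = {
    "PASS", "FAIL", "XPASS", "XFAIL", "KPASS", "KFAIL",
    "UNRESOLVED", "UNTESTED", "UNSUPPORTED",
}


def parse_sum(text):
    """Map test name to status for every result line in a summary file."""
    results = {}
    for line in text.splitlines():
        status, sep, name = line.partition(": ")
        if sep and status in STATUSES:
            results[name.strip()] = status
    return results


def regressions(baseline, current):
    """List tests that passed in baseline but do not pass in current."""
    found = []
    for name, old in sorted(baseline.items()):
        new = current.get(name, "MISSING")
        if old == "PASS" and new != "PASS":
            found.append((name, old, new))
    return found


if __name__ == "__main__":
    before = "PASS: gdb.base/break.exp: run\nFAIL: gdb.base/step.exp: next\n"
    after = "FAIL: gdb.base/break.exp: run\nPASS: gdb.base/step.exp: next\n"
    for name, old, new in regressions(parse_sum(before), parse_sum(after)):
        print("%s: %s -> %s" % (name, old, new))
```

Splitting only at the first ": " keeps test names that themselves contain a colon intact. A test that disappears from the second run is reported with the placeholder status MISSING.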
== Plan ==
* Work on gdb testing utility [TCWG-96].
-- Add support to run native and native-gdbserver test via ssh on
remote machines like arm-linux.
-- Bug fix and test gdb testing utility by running it in various configurations.
-- Update wiki page with testing utility and how to use it.
== This week ==
a53 support
- Fixed regression found in arm testing and resubmitted build and
merge requests
- aarch64 testing passed with no regressions. Testing with a53
support enabled still required
Merged 202663 (vectorizer bug) into 4.7 and 4.8 branches. Submitted
merge and build requests
== Next week ==
- Test aarch64 with a53 support enabled on qemu64
- Work on bug 60657 - [4.9 Regression] ICE: error: insn does not
satisfy its constraints
== Future ==
== Issues ==
* None
== Progress ==
* Create prebuilt sysroot based on Linaro eglibc 2014.04 release
(https://launchpad.net/linaro-toolchain-binaries/support/01/+download/linaro…)
(1/10).
* Enable shrink-wrap for apcs. Patch was out for community review. (1/10)
* Reinvestigate shrink-wrap enhancement (4/10, TCWG-133)
- There was improvement in ira to split_live_ranges_for_shrink_wrap
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10474). But it still can
not handle the case in 453.povray.
- Investigate to do a simple copy-forward when prepare_shrink_wrap.
* Investigate lp:1296942 (pr60663). Patch is sent out for community
review. (4/10)
* Backporting the fix for pr60264 to Linaro 4.8.
== Plans ==
* Continue on shrink-wrap enhancement.
== Planned leaves ==
* April 7: Qingming holiday.
== Progress ==
- TCWG-413 Spec2006 (7/10)
- Investigated compiler error for 481.wrf with FSF 4.8.2. Issue is
due to aarch64_cm<cmp><mode> pattern (fcmle and fcmlt supports only #0
as third val). This is already fixed in trunk and Linaro 4.8.
- Ran profiling to analyse 4.9 regressions. Started looking into
P7Viterbi, which is one of the functions that performs badly.
- TCWG-291 CRC (3/10)
Came up with a patch for improving vrp for test-case. Some c++ test
cases are failing in regression testing with this patch. Looking into it.
- TCWG-394 / PR60034
Patch committed and card closed.
http://gcc.gnu.org/viewcvs?rev=208949&root=gcc&view=rev
== Plan ==
Continue with Spec2006 and crc
== Progress ==
* Android (no card, 10 minutes. ;)
- Implemented __builtin___clear_cache in Clang
* Kernel (TCWG-417) 6/10
- VLAIS in crypto (last place in kernel)
- __builtin_stack_pointer (new builtin)
- Discussion in GCC list, named registers might be a better option
- Discussion in LLVM list about implementing named registers
- Implementing named register variables...
- Planning on LAVA testing LLVM kernels with LLVMLinux
- __aeabi_memset/cpy/move (both Android and Kernel) will have to be fixed
* Libraries (TCWG-125) 2/10
- libc++abi's unwind routines assume (ARM == SjLj)
- libc++ doesn't work without them
- We'll need to teach it about EHABI (later)
* Background 2/10
- Installed ArchLinux on my laptop, took some time to setup
- Re-org of LLVM Jira Cards (TCWG-417)
- Reviewing some GSoC proposals
== Plan ==
* implement named register variables in LLVM, then Clang
* Continue helping LLVMLinux and Milosz to test the LLVM kernel
* Time allowing, check CBuild2 for an LLVM build
* Two long weeks of holidays...
== Progress ==
* Bugfix aarch64 setcontext patch (2/10, TCWG-410)
* malloc requirements wiki page (2/10, TCWG-414)
* Lots of creating and updating JIRA cards (1/10)
* glibc patch review (1/10)
* Assorted small patches - ld relasz, gnulib obstack, glibc strtod
benchmark (2/10)
* Investigate a couple of issues raised by member services and on lists (2/10)
== Issues ==
* None
== Plan ==
* Resurrect glibc malloc benchtest
* More glibc benchmark infrastructure work
* Check status of glibc ARM port build warnings
--
Will Newton
Toolchain Working Group, Linaro
== Week of March 17th ==
- STREAM regression (TCWG-388, 4/10)
-- Investigated how to prioritize memory reference instructions in GCC scheduler to take full advantage of L2 auto-prefetch hardware in certain ARM cores.
-- Fixed -fdbg-cnt=sched_insn debug counter along the way. It appears to have been broken since GCC 4.7.
- Discussed reg_pressure instruction scheduling with Charlie. (TCWG-135, 1/10)
- Various discussions about instruction scheduling in GCC. (1/10)
- Together with Michael prepared patch list for GCC contingency plan. (3/10)
- Made first-in-series video about tips-and-tricks of GCC development. Your critiques are welcome!
-- Using GCC debug counters (7m34s): https://www.youtube.com/watch?v=IWRYCOkgL04
== Week of March 24th ==
- STREAM regression (TCWG-388)
-- Get a prototype patch.
- Other expected and unexpected tasks that come up.
--
Maxim Kuvyrkov
www.linaro.org
Last week
* one day off [2/10]
* NEON scheduling investigation - TCWG-135 [3/10]
. investigated scheduler register pressure heuristics
. call with Maxim about scheduling algorithm
. still more investigation required
* NEON intrinsics vs assembler libvpx performance difference
investigation [5/10]
. GCC seems to generate poor code for address generation in NEON loads/stores
. maybe some improvements to the intrinsics code would also be possible
Plan:
. write down some conclusions about libvpx performance
. investigate PR60609
. more on TCWG-135
. investigate LP1296676 & LP1296601 ICEs when building the kernel
== Progress ==
Machine descriptions for stack smashing in Aarch64 - TCWG-23 (5/10)
* Completed building QEMU for Aarch64 on Ubuntu 13.10. Ran
regression tests on it.
* Submitted patches for libssp as per Marcus suggestions and got it approved.
PGO support for Aarch64 - TCWG-179 (2/10)
* Installed "CPU2006" tools in foundation model running open embedded image
and started running benchmarks. Plan is to build it with -PGO next.
Bug fix (2/10)
* Working on reproducing and fixing PR60617.
Misc (1/10)
* AMD Internal meeting and work.
== Plan ==
* Bug fix PR60617.
* Build CPU2006 benchmarks with -PGO flag
* Restart PGO bootstrap failure investigations
Hi,
I read in the armv8 architecture reference manual that a number of AArch32 instructions have been obsoleted. Does the current armv7 version of GCC ever generate code containing any of these, without me explicitly writing inline assembly? If it can, how can this be turned off? I would just like to make sure that a C program (without inline assembly) compiled today for armv7 will run in AArch32 mode when armv8 boards come out.
The following are obsoleted in ARMv8:
A32 SWP and SWPB instructions.
Jazelle (only trivial implementations are supported).
VFP short vectors and asynchronous bounces.
Fast Context Switch Extension (FCSE).
Thanks: Magnus
Magnus Karlsson
Software Development Engineering Manager
LSI Corporation
Box 1024, Knarrarnäsgatan 15
SE-164 21 Kista, Sweden
TEL +46 8 594 607 09
FAX +46 8 594 607 10
CELL +46 73 80 444 88
magnus.karlsson(a)lsi.com
== Issues ==
* none
== Progress ==
* LRA on AArch32:
o TCWG-343 : Make LRA the default for the ARM backend (0/10)
- Stop progress on this card, will close it when FSF 4.9 will be released.
o TCWG-345 : Analyse performance of LRA for ARM. (4/10)
- Spec2K figures on Cortex-a15 Analysis.
- re-run benchs in console mode chrubuntu without ASLR + perf tool
* Backports review: (2/10)
o Start to prepare cortex-a53 backports review
* Misc:
o Various meetings and slideware (4/10).
- Linaro and internal ones.
== Next ==
- continue cortex-a53 review
- some backports to do.
- continue on TCWG-345