== Progress ==
* Kernel (TCWG-417)
- Implementing named register global variables (D3261)
- Helping Milosz and Vinicius (LLVMLinux) to get a kernel ready
- First LLVM-compiled kernel booted on Versatile Express hardware
* Background
- Reviewing patches, etc.
- Apple merged their ARM64 back-end, fiddling bots
- Making the new TableGen docs official
- Jira farming
- Became code owner for the ARM Linux support
- LLVM Foundation announced
- Trying to run SPEC on AArch64
* Time
- CARD-124 6/10
- Others 4/10
== Plan ==
* Holiday for two and a half weeks
* Follow up the named register patch
== Progress ==
* glibc patch review (2/10)
* Helping out with aarch64 glibc setjmp/longjmp Systemtap probes testing (1/10)
* Investigated and submitted patch for gas ARM alignment issue (3/10)
* Committed library and script for malloc logging (1/10, TCWG-423)
* Rebased and tidied up malloc microbenchmark (2/10, TCWG-160)
* Various small binutils and glibc patches (1/10)
== Issues ==
* None
== Plan ==
* Submit patch for glibc malloc microbenchmark
--
Will Newton
Toolchain Working Group, Linaro
Hi all,
I've just filed a bug on glibc I'd love you to take a look at:
https://sourceware.org/bugzilla/show_bug.cgi?id=16796
Here's the description to save clicking:
Hi,
There is a test in glibc (tst-tls5) that tests that
((uintptr_t)pthread_self())%16 is zero. But watch this:
(t-mwhudson)mwhudson@am1:~$ cat btp.c
#include <stdint.h>
#include <stdio.h>
#include <pthread.h>
int
main(int argc, char** argv)
{
uintptr_t p = (uintptr_t)__builtin_thread_pointer();
uintptr_t q = (uintptr_t)pthread_self();
printf("p: %lx %ld\n", p, p%16);
printf("q: %lx %ld\n", q, q%16);
}
(t-mwhudson)mwhudson@am1:~$ gcc -o btp btp.c -lpthread
(t-mwhudson)mwhudson@am1:~$ ulimit -s unlimited
(t-mwhudson)mwhudson@am1:~$ ./btp
p: 2000028d88 8
q: 2000028698 8
(t-mwhudson)mwhudson@am1:~$ ulimit -S -s 8192
(t-mwhudson)mwhudson@am1:~$ ./btp
p: 7f7fd086f0 0
q: 7f7fd08000 0
So something is clearly wrong; maybe it's just that the test is too
strict, but somehow that seems a bit unlikely. FWIW, this doesn't
happen if you don't link with libpthread so maaaaybe it's a bug in
something that ends up in libpthread's .init section?
Cheers,
mwh
== Week of March 24th ==
- STREAM regression (TCWG-388, 5/10)
-- Finished prototype patch. The patch adds modeling of ARM L2 auto-prefetcher hardware to GCC scheduler (the model is very simple as auto-prefetcher is very lightly documented). Half of the patch cleans up and improves GCC scheduler, and the other half implements the auto-prefetcher model.
-- While looking into ARM scheduling support noticed that ARM doesn't use multipass lookahead scheduling, which surprised me. Enabled it (multipass scheduling) in my patches.
- Looked into lll_timed_wait Glibc/uClibc bug upstream (1/10)
-- https://sourceware.org/ml/libc-alpha/2014-03/msg00905.html
- Various discussions and reviews (4/10)
== Week of March 30th ==
- STREAM regression (TCWG-388)
-- Benchmark patches on SPEC2k and find/confirm best values for tuning parameters:
--- dfa_lookahead: should normally be issue_rate-1.
--- L2 auto-prefetcher queue depth: new tuning knob.
-- Investigate any performance regressions from the patches.
- lll_timed_wait Glibc/uClibc bug
-- Make sure it is fixed upstream. Possibly backport to Linaro branches.
--
Maxim Kuvyrkov
www.linaro.org
== Issues ==
* none
== Progress ==
* Launchpad bugs:
o TCWG-422 : ICE in assign_by_spills building linux btrfs module (1/10)
- created blueprint for :
https://bugs.launchpad.net/gcc-linaro/+bug/1296676
- Reported upstream as :
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60650
- Reduced testcase
- Fix committed by Vladimir as rev208876
- still ICE when configured for arm-none-linux-gnueabihf
o Backported "Internal compiler error in push_reload during
bootstrap stage 2" to GCC 4.7 (1/10)
- https://bugs.launchpad.net/gcc-linaro/+bug/1129013
- some testsuite regressions observed. will investigate.
* Backports review: (5/10)
o reviewed backport for pr60264 and rev202663
o cortex-a53 support backport:
- Analysed testsuite regression
- 22K Loc patch under review
* LRA on AArch32:
o TCWG-345 : Analyse performance of LRA for ARM. (1/10)
- looked at the perf tool results
* Misc:
o Various meetings (1/10).
o Various support to team members (1/10)
o Cbuildv1 baby-sitting (Calxedas nodes have to be restarted after
each upgrades !)
== Next ==
- continue cortex-a53 review
- continue on backports.
- continue on TCWG-345.
== Progress ==
* Short Week
-- Monday Day Off: Pakistan Day 23rd March public holiday roll over. [2/10]
-- Short leave on Thursday [1/10]
* Work on gdb testing utility [TCWG-96] [7/10]
-- Writing a new gdb testing utility script that automates gdb
testing in various configurations and compares testsuite results.
-- Support for all kind of native, native-gdbserver and
remote-gdbserver gdb configurations has been added.
-- Interactive and configuration file based user input mode has been added.
-- Testing configurations host alive, ssh possible, board file found
and ability to build sources using user defined configure flags has
been added.
== Plan ==
* Work on gdb testing utility [TCWG-96].
-- Add support to run native and native-gdbserver test via ssh on
remote machines like arm-linux.
-- Bug fix and test gdb testing utility by running it various configurations.
-- Update wiki page with testing utility and how to use it.
== This week ==
a53 support
- Fixed regression found in arm testing and resubmitted build and
merge requests
- arrch64 testing passed with no regressions. Testing with a53
support enabled still required
Merged 202663 (vectorizer bug) into 4.7 and 4.8 branches. Submitted
merge and build requests
== Next week ==
- Test aarch64 with a53 support enabled on qemu64
- Work on bug 60657 - [4.9 Regresssion] ICE: error: insn does not
satisfy its constraints
== Future ==
== Issues ==
* None
== Progress ==
* Create prebuilt sysroot based on Linaro eglibc 2014.04 release
(https://launchpad.net/linaro-toolchain-binaries/support/01/+download/linaro…)
(1/10).
* Enable shrink-wrap for apcs. Patch was out for community review. (1/10)
* Reinvestigate shrink-wrap enhancement (4/10, TCWG-133)
- There was improvement in ira to split_live_ranges_for_shrink_wrap
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10474). But it still can
not handle the case in 453.povray.
- Investigate to do a simple copy-forward when prepare_shrink_wrap.
* Investigate lp:1296942 (pr60663). Patch is sent out for community
review. (4/10)
* Backporting the fix for pr60264 to Linaro 4..8.
== Plans ==
* Continue on shrink-wrap enhancement.
== Planned leaves ==
* April 7: Qingming holiday.
== Progress ==
- TCWG-413 Spec2006 (7/10)
- Investigated compiler error for 481.wrf with FSF 4.8.2. Issue is
due to aarch64_cm<cmp><mode> pattern (fcmle and fcmlt supports only #0
as third val). This is already fixed in trunk and Linaro 4.8.
- Ran profiling to analyse 4.9 regressions. Started looking into
P7Vitterbui which is one of the functions that performs badly.
- TCWG-291 CRC (3/10)
Came up with a patch for improving vrp for test-case. Some c++ test
cases are failing in regression testing with this patch. Looking into it.
- TCWG-394 / PR60034
Patch committed and card closed.
http://gcc.gnu.org/viewcvs?rev=208949&root=gcc&view=rev
== Plan ==
Continue with Spec2006 and crc
== Progress ==
* Android (no card, 10 minutes. ;)
- Implemented __builtin___clear_cache in Clang
* Kernel (TCWG-417) 6/10
- VLAIS in crypto (last place in kernel)
- __builtin__stack_pointer (new builtin)
- Discussion in GCC list, named registers might be a better option
- Discussion in LLVM list about implementing named registers
- Implementing named register variables...
- Planning on LAVA testing LLVM kernels with LLVMLinux
- __aeabi_memset/cpy/move (both Android and Kernel) will have to be fixed
* Libraries (TCWG-125) 2/10
- libc++abi's unwind routines assume (ARM == SjLj)
- libc++ doesn't work without them
- We'll need to teach it about EHABI (later)
* Background 2/10
- Installed ArchLinux on my laptop, took some time to setup
- Re-org of LLVM Jira Cards (TCWG-417)
- Reviewing some GSoC proposals
== Plan ==
* implement named register variables in LLVM, then Clang
* Continue helping LLVMLinux and Milosz to test the LLVM kernel
* Time allowing, check CBuild2 for an LLVM build
* Two long weeks of holidays...
== Progress ==
* Bugfix aarch64 setcontext patch (2/10, TCWG-410)
* malloc requirements wiki page (2/10, TCWG-414)
* Lots of creating and updating and updating JIRA cards (1/10)
* glibc patch review (1/10)
* Assorted small patches - ld relasz, gnulib obstack, glibc strtod
benchmark (2/10)
* Investigate a couple of issues raised by member services and on lists (2/10)
== Issues ==
* None
== Plan ==
* Resurrect glibc malloc benchtest
* More glibc benchmark infrastructure work
* Check status of glibc ARM port build warnings
--
Will Newton
Toolchain Working Group, Linaro
== Week of March 17th ==
- STREAM regression (TCWG-388, 4/10)
-- Investigated how to prioritize memory references instructions in GCC scheduler to take full advantage of L2 autopretch hardware in certain ARM cores.
-- Fixed -fdbg-cnt=sched_insn debug counter along the way. It appear to have been broken since GCC 4.7.
- Discussed reg_pressure instruction scheduling with Charlie. (TCWG-135, 1/10)
- Various discussions about instruction scheduling in GCC. (1/10)
- Together with Michael prepared patch list for GCC contingency plan. (3/10)
- Made first-in-series video about tips-and-trick of GCC development. Your critiques are welcome!
-- Using GCC debug counters (7m34s): https://www.youtube.com/watch?v=IWRYCOkgL04
== Week of March 24th ==
- STREAM regression (TCWG-388)
-- Get a prototype patch.
- Other expected and unexpected tasks that come up.
--
Maxim Kuvyrkov
www.linaro.org
Last week
* one day off [2/10]
* NEON scheduling investigation - TCWG-135 [3/10]
. investigated scheduler register pressure heuristics
. call with Maxim about scheduling algorithm
. still more investigation required
* NEON intrinsics vs assembler libvpx performance difference
investigation [5/10]
. GCC seems to generate poor code for address generation in NEON loads/stores
. maybe some improvements to the intrinsics code would also be possible
Plan:
. write down some conclusions about libvpx performance
. investigate PR60609
. more on TCWG-135
. investigate LP1296676 & LP1296601 ICEs when building the kernel
== Progress ==
Machine descriptions for stack smashing in Aarch64 - TCWG-23 (5/10)
* Completed building QEMU for Aarch64 on Ubuntu 13.10. Ran
regression tests on it.
* Submitted patches for libssp as per Marcus suggestions and got it approved.
PGO support for Aarch64 -TCWG- 179 (2/10)
* Installed "CPU2006" tools in foundation model running open embedded image
and started running benchmarks. Plan is to build it with -PGO next.
Bug fix (2/10)
* Working on reproducing and fixing PR60617.
Misc (1/10)
* AMD Internal meeting and work.
== Plan ==
* Bug fix PR60617.
* Build CPU006 benchmarks with -PGO flag
* Restart PGO bootstrap failure investigations
Hi,
I read in the armv8 architecture reference manual that a number of AArch32 instructions have been obsoleted. Do the current armv7 version of GCC ever generate code containing any of these, without me explicitly writing inline assembly? If it can, how can this be turned off? Just would like to make sure that a C-program (without inline assembly) compiled today for armv7 will run in AArch32 mode when armv8 boards come out.
The following are obsoleted in ARMv8:
A32 SWP and SWPB instructions.
Jazelle (only trivial implementations are supported).
VFP short vectors and asynchronous bounces.
Fast Context Switch Extension (FCSE).
Thanks: Magnus
Magnus Karlsson
Software Development Engineering Manager
LSI Corporation
Box 1024, Knarrarnäsgatan 15
SE-164 21 Kista, Sweden
TEL +46 8 594 607 09
FAX +46 8 594 607 10
CELL +46 73 80 444 88
magnus.karlsson(a)lsi.com
== Issues ==
* none
== Progress ==
* LRA on AArch32:
o TCWG-343 : Make LRA the default for the ARM backend (0/10)
- Stop progress on this card, will close it when FSF 4.9 will be released.
o TCWG-345 : Analyse performance of LRA for ARM. (4/10)
- Spec2K figures on Cortex-a15 Analysis.
- re-run benchs in console mode chrubuntu without ASLR + perf tool
* Backports review: (2/10)
o Start to prepare cortex-a53 backports review
* Misc:
o Various meetings and slideware (4/10).
- Linaro and internal ones.
== Next ==
- continue cortex-a53 review
- some backports to do.
- continue on TCWG-345
== Progress ==
* Worked on getting eglibc git mirror back up and running
* Released eglibc 2.19 2014.04
* Respin of glibc aarch64 setcontext patches
* Respin of ld aarch64 RELASZ patch
* Tidied up, rebased and committed ARM ld pointer equality patch
* Work on resurrecting glibc benchmark result graphing script
* Holiday on Friday
== Issues ==
* Couple of hours power outage on Monday
== Plan ==
* Get outstanding glibc and binutils patches committed
* Reorganise JIRA cards for malloc work
* glibc benchmarking
--
Will Newton
Toolchain Working Group, Linaro
== Progress ==
* Add support for fork/vfork/exec events/catchpoints in remote
gdbserver [TCWG-263] [4/10]
-- Debugged catchpoint code and reviewed gdbserver implementation for events.
-- Still require a lot of code understanding before actual
implementation can start.
* Wrote a script that takes two gdb build trees and configuration
arguments to test them in native, remote-native and remote-target
configurations [TCWG-96] [3/10]
* Laptop OS re-install [1/10]
* GDB open cards issues review [1/10]
* Sick half day off on Friday [1/10]
== Plan ==
* Complete work on TCWG-96.
-- Write a script that tests gdb in remote-native configuration using ssh.
-- Integrate all testing scripts and test them.
-- Update wiki page with updated scripts and how to use them.
* Monday Day Off: Pakistan Day 23rd March public holiday roll over.
== Progress ==
CARD-1210 - optimizing prologue/epilogue With -fomit-frame-pointer (4/10)
- regression tested the patch and fixed issues
- Dropping the patch and closing the card as it has been worked on at arm.
TCWG-15 - zero/sign extension elimination for crc (3/10)
- Looked at crc and looked at the other optimizations listed in card 440.
- Improved the local patch to remove the missing optimization
TCWG-413 - Running spec2006 with aarch64 (3/10)
- Built and set-up spec2006.
- Ran into to some compile time and run time errors
== Plan ==
Continue with zero/sign extension elimination
Start looking at literal pool merging
== This week ==
Merged all backports related to a53 support and successfully built arm
and aarch64 cross compilers:
202333 - [AArch64, ARM] Introduce "mrs" type attribute.
202334 - [AArch64] Use neon_<ldm,stm>_2 where appropriate as "type".
202448 - [AArch64] Prevent generic pipeline description from dominating
other pipeline descriptions.
202560 - set "type" attribute properly in arm_cmpsi_insn, cleanup
203241 - Cortex-a53 use Cortex tuning
203611 – 203621 – Neon types Parts 1 to 10
204575 - [ARM, AArch64] Make aarch-common.c files more robust.
205050 - [ARM] Add missing type attribute to zero_extend on arm
204782 - [AArch64] [-mtune cleanup 2/5] Tune for Cortex-A53 by default.
204784 - [AArch64] [-mtune cleanup 4/5] Remove "example-1", "example-2"
tuning options.
204852 - [AArch64] Remove simd_type
205014 - [AArch64] Remove v8type attribute.
205105 - [AArch64] Remove "mode", "mode2" attributes
204783 - [AArch64] [-mtune cleanup 3/5] [Temporary] When asked to tune
for Cortex-A57, tune for Cortex-A15
== Next week ==
Test a53 support
Backport vectorization bug
== Future ==
== Issues ==
* None
== Progress ==
* Identify a missing instruction pattern which blocks fcsel
optimization. A patch was sent out for community review. [2/10,
TCWG-309]
* Identify a ICE when handling dwarf-info in ARM backend when testing
shrink-wrap. A fix was committed to trunk. [2/10]
* Rebase and test the shrink-wrap codes for APCS_FRAME. [5/10]
* Prepare Linaro toolchain 2014.03 binaries release [1/10].
* Investigate CRC improvement chances.
== Plans ==
* Create a prebuilt sysroot based on Linaro eglibc-2014.04.
* Continue on shrink-wrap.
== Progress ==
* AArch64 (TCWG-387)
- Implemented simple C sin/cos functions for sphereflake
- AArch64 builds, passes check-all and test-suite
- Closed Jira task
* IAS (TCWG-377)
- Updated release notes
- IAS build, passes check-all, test-suite and compiles other large codebases
- Closed Jira task
* EHABI (TCWG-124)
- Updated release notes
- EHABI builds, passes check-all, test-suite
- Waiting for last patch to go to close the Jira task
* Compiler-RT (TCWG-125)
- Re-testing with current trunk+RT, all pass
- Next step, libunwind (not ready yet)
* Background
- Patch review, etc
- Studying TableGen, writing some docs (http://llvm.org/docs/TableGen/)
- Proposing changes to debug/EH unwind function attributes in IR
- Changing Zorg to not submit LNT reports by default (avoids bot breakage)
- Planning LLVM+Linux involvement as next big task
* Time
- CARD-862 8/10
- Others 2/10
== Plan ==
* Continue pushing __builtin___clear_cache and -fno-unwind-tables upstream
* Implement the Clang part of __builtin___clear_cache and update the docs
* Try to add LLVM/Clang to cbuildv2
* Start merging libunwind from libc++ to Compiler-RT and test it on ARM
* Get the kernel task started with the LLVMLinux guys
Hello,
the attached program changes the output from "true" to "false" when the
-Bsymbolic / -Bsymbolic-functions options are passed to GCC. This
happens on ARM -- on x86-64 output is always "true".
The program involves a comparison, within a shared library, of a PMF
defined inside the shared library itself with the same PMF passed by the
application.
At this stage I still don't know if it's a GCC problem, a ld.so problem,
ABI constraints, undefined behaviour, or whatnot; but any help is
appreciated.
Compile with:
> g++ -fPIC -shared -Wall -o libshared.so -Wl,-Bsymbolic shared.cpp
> g++ -fPIE -Wall -o main main.cpp -L. -lshared
(The long story is that Qt 5 is taking PMFs in its public API, and the
comparison failing inside of Qt shared libraries is breaking code on
ARM, as -Bsymbolic is set by default there.)
Thanks,
--
Giuseppe D'Angelo | giuseppe.dangelo(a)kdab.com | Software Engineer
KDAB (UK) Ltd., a KDAB Group company
Tel. UK +44-1738-450410, Sweden (HQ) +46-563-540090
KDAB - Qt Experts - Platform-independent software solutions