== Progress ==
o Linaro GCC validation (8/10)
* Reviewed, validated and committed more backports
* New stability issues after executor number increased
* Started to script branch merge
* Developed a new tool to avoid wasting time in gerrit/jenkins/logs
navigation
o Upstream GCC (1/10)
* Looked at and updated some bugzillas
o Misc (1/10)
* Various meetings
== Plan ==
o Continue backports/validation/branch merge
== Progress ==
* Widening pass (TCWG-547) – 6/10
- Looked at “Error: unaligned opcodes detected in executable segment”
* Spent a lot of time trying to understand the root cause.
* Got some suggestions from Jim and am looking into them.
- Posted some of the important patches for review.
* https://gcc.gnu.org/ml/gcc-patches/2015-09/msg00399.html
* https://bugs.linaro.org/show_bug.cgi?id=1318 (2/10)
- Tried reproducing with the source provided without success.
- Could build and reproduce with emacs-24.4 release.
- Trunk GCC version 6.0.0 20150902 works.
- GCC version 4.9.3 20150209 (Linaro GCC 4.9-2015.02) fails.
* Misc - 2/10
- gcc-patches, gcc-bugs list
== Plan ==
* Continue with widening pass
* Fix Bug1318
Holiday [4/10]
Multinode wrapper - TCWG-350 [2/10]
* Merged to benchmarking branch, roughly documented
* Tested & added job definition templates
Setting up VPN [2/10]
* Much struggling with mutt, pgp, VM vs real system, Mac vs Linux
* Still not working, but wrote up what I've learned on the Collaborate page
Misc [2/10]
* ARM admin, meetings, mail, etc etc
=Plan=
Last pass through benchmarking presentation
Finish multinode/updating benchmarking docs
Back to Juno noise control experiments
Back to Jenkins - get it to drive multinode
(Reporting with new Jira numbers)
Holiday [6/10]
Investigate effectiveness of noise control - TCWG-358 [2/10]
* Learned to build OE filesystems
* Got Juno running with more or less provenance-tracked firmware,
kernel, filesystem
Misc - [2/10]
* Pushed through some updates to benchmark sources
* Struggled with Collaborate permissions to share source/results handling rules
* Remaining multinode tests passed
== This week ==
* TCWG-832 - Exploit vector multiply by scalar instructions when multiple
scalars are used as coefficients in a loop (4/10)
- Continued investigation.
* TCWG-833 - Exploit Wide Add operations when appropriate (4/10)
- Reworked Aarch64 patch to avoid redundant moves
- Sent patch upstream for review
- Debugging Aarch64 tree-dump regression suite failures
- Bugzilla 57195 (mode iterator bug) blocked compiling new pattern (1/10)
- Separated patch from Bugzilla 67321 patch and sent upstream
- Pinged upstream for comments
* Misc (1/10)
- Conference calls
== Next week ==
- USA Labor day holiday, September 7th
- Additional investigation into TCWG-832
== This Week ==
* TCWG-120 (8/10)
- Resolved df issue in my patch, sent to tcwg list for review,
- VRP makes the issue latent at -O2. Reproducible with -fno-tree-vrp
- Taking another approach: run a "specialized" combine pass before the
combine pass that folds arm_andsi3_insn/arm_cmpsi_insn to
zeroextractsi_compare0_scratch or andsi3_compare0_scratch without
relying on combine.
Patch: http://pastebin.com/gLVg7pbN
Asm diff at -O1: http://pastebin.com/yXBHHkhM
Asm diff at -O2 -fno-tree-vrp: http://pastebin.com/EKj6hXkt
* Misc (2/10)
- Meetings
- Had a look at vect test-cases failing due to LTO
== Issues ==
- No access to Juno
- Can't login to IRC via ZNC (can connect directly).
== Next Week ==
- Continue with TCWG-120
- Start looking at TCWG-80
== Progress ==
* Holidays (4/10)
* Buildbots (2/10)
- Several breakages, Clang alignment issue still breaking
self-hosted bots...
* Maintenance (2/10)
- Backtracks on the TargetParser, code heavily modified,
discussions ensued.
- Helping Vinicius with __aeabi_memcpy in the kernel
* Releases (0/10)
- Release 3.7.0 final validated / uploaded
- Waiting for final switch
* Background (2/10)
- Code review, meetings, discussions, etc.
- Backlog of emails, reviews
== Plan ==
Look at libc++ / RT / unwind again to see if I can reduce the errors in
both AArch64 and ARM to zero.
== Issues ==
The power cuts and their effect on my buildbots cost me 3 days of my
holidays to fix. Other buildbot breakages cost me another 2, and I
couldn't wait until I was back, or it would have wasted an entire week
(as has happened before).
In a nutshell, I can't have holidays. Yay!
# Progress #
* Holiday on Monday [2/10]
* TCWG-857, [4/10]. All the multi-arch work is done
(I hope) but patches can't be sent out until the kernel patches
are pushed upstream.
** Collect some arguments from kernel folks to defend my change
"32-bit CPSR to 64-bit PSTATE". Patch is posted out.
** Finish the patch to convert siginfo_t between 32-bit debuggee
and 64-bit debugger. Done. Will post it next week.
** Get right TLS base in multi-arch debugging. Patch is done.
* Happened to notice that we can improve GDB performance in some cases by
avoiding sending some packets. Patch is done, but I need to collect
some performance data. [2/10]
* Misc [1/10]
** Close some tickets, TCWG-567, TCWG-876, as they are done.
* TCWG-757 [1/10], many patch reviews.
# Plan #
* Continue to upstream multi-arch patches.
* Upgrade my juno board kernel to git master to test that the kernel
patches for multi-arch debugging work properly.
* Collect some GDB performance data for my patch.
--
Yao
Hi-
It seems that there is some discrepancy between Linaro GCC 4.8 2015.06 and Linaro GCC 4.8 2014.11 with regards to precompiled headers.
It appears that 2015.06 doesn't even attempt to open the precompiled headers according to strace. I have been looking on gcc mailing list but have not been able to find any fixes related to precompiled headers.
Does anyone have any pointers as to what I should be looking at to get to the bottom of this issue?
This simple program compiles without any problems on 2014.11, but fails on 2015.06 because 2015.06 doesn't even attempt to read tst.h.gch created by "make header"
dragans@tst:~$ cat Makefile
PROJ=tst
CC=arm-linux-gcc
LD=arm-linux-ld
all: header $(PROJ)
header: $(PROJ)._
	cp $(PROJ)._ $(PROJ).h
	$(CC) -x c-header -c $(PROJ).h
$(PROJ):
	$(RM) $(PROJ).h
	$(CC) $(PROJ).c -o $(PROJ)
clean:
	$(RM) $(PROJ).h.gch $(PROJ).h $(PROJ)
dragans@tst:~$ cat tst._
#include <stdio.h>
dragans@tst:~$ cat tst.c
#include "tst.h"
int main(int argc, char **argv)
{
    char *s = "Test";
    printf("%s\n", s);
    return 0;
}
== Progress ==
LLDB development
-- Testing and bug fixing lldb on hikey AArch64 [TCWG-886] [7/10]
-- Improve error handling of AArch64 watchpoint code. Submitted,
got reviewed and committed http://reviews.llvm.org/D12328
-- Looked into ways to improve watchpoint installation lag.
-- Looked into issues pertaining to unaligned watchpoint installation.
-- Committed upstream arm hardware breakpoint and watchpoint support
[TCWG-770] [TCWG-794] [1/10]
-- Ran lldb testsuite in various combinations to figure out Arm and
AArch64 status [1/10]
-- Testing on chromebook with precise chroot works.
Miscellaneous [1/10]
-- Meetings, emails, discussions etc.
== Plan ==
LLDB development
-- Try to find ways to debug lldb-server platform forked gdbserver instance.
-- Test, debug and fix testsuite failures on Arm and AArch64.
== Progress ==
LLDB development
-- Testing and bug fixing lldb on hikey AArch64 [TCWG-886] [2/10]
-- Looked into watchpoint tests which are still failing after
committing watchpoint support and bug fixes.
-- Modifications to arm hardware breakpoint and watchpoint support
[TCWG-770] [TCWG-794] [7/10]
-- Resubmitted the pending patch http://reviews.llvm.org/D9703
-- Testing on chromebook with precise chroot works.
Miscellaneous [1/10]
-- Meetings, emails, discussions etc.
-- Looking into slow data transfer issues on chromebook and hikey board
== Plan ==
LLDB development
-- Get patches reviewed and commit them.
-- Continue looking into tests which are still failing on Arm and AArch64
-- Run test comparison between linux x86, android arm and linux arm
o 3 days off (6/10)
== Progress ==
o Linaro GCC validation (3/10)
* Reviewed, validated and committed on-going backports
* backported more revisions
o Misc (1/10)
* Various meetings
== Plan ==
o Continue backports/validation
== Progress ==
* Annual Leave (2/10)
* Widening pass (TCWG-547) - 6/10
- Fixed all execution test failures.
- bootstrap failure due to “Drop copy-rename” is still not resolved.
Found a workaround.
- Sorted debug_stmt handling
* TACT 1/10
- Started looking to cross execution set-up
* Misc - 1/10
- gcc-patches, gcc-bugs list
== Plan ==
* Continue with widening pass
== This Week ==
* TCWG-777 (3/10)
- Fixed ICE with the patch - toolchain builds with the patch.
- Sent patch to tcwg list for review
- Investigating why combine fails to combine
arm_andsi3_insn/arm_cmpsi_insn into zeroextractsi_compare0_scratch
when the pass uses ud-chains to find def but works
when def is found using ad-hoc way.
* TCWG-871 (4/10)
- Getting familiar with firefox build system
- LTO build lto/non-lto on x86 and arm (doc)
- Figuring out how to resolve "plugin needed to handle lto object" error.
tried the following:
binutils configured with: --enable-plugin, --enable-lto
gcc configured with: --enable-lto --with-ld-plugin=<just built ld>
Doesn't appear to work.
* TCWG-835 (1/10)
- Build failure with spec (PR67399)
* Misc (2/10)
- Meetings
- Looked at rtl dataflow (df.h and df*.c) and ree.c
== Next Week ==
Continue with TCWG-777, TCWG-835, TCWG-871
== This week ==
* TCWG-832 - Exploit vector multiply by scalar instructions when multiple
scalars are used as coefficients in a loop (2/10)
- Initial investigation.
* TCWG-833 - Exploit Wide Add operations when appropriate (4/10)
- Ramana is reviewing Aarch32 patch
- Recoded Aarch64 support to use vect_select
- Debugging Aarch64 lto, tree-dump regression suite failures
* TCWG-834 - Use non-unit stride loads by preference when applicable (1/10)
- TCWG 834 is Bugzilla 67323 upstream; Richard Biener took ownership
as a vectorizer failure
- My plan was to write a test case that failed until fixed, but
Richard indicated this is not standard practice
- Bugzilla 57195 (mode iterator bug) blocked compiling new pattern (1/10)
- Separated patch from Bugzilla 67321 patch and sent upstream
- Pinged upstream for comments
- Bugzilla 67320 - Incorrect standard names for wide addition (1/10)
- committed upstream in trunk
* Misc (1/10)
- Conference calls
== Next week ==
- Resolve Aarch64 TCWG-833 patch, validate and upstream
- Additional investigation into TCWG-832
# Progress #
* TCWG-857 [3/10]
With one kernel fix, HW breakpoint works for unaligned address
(2-byte aligned).
Debug linux kernel with KGDB. KGDB exposes an existing GDB bug; I
fixed it, but need some time to think about how to submit it upstream.
* TCWG-567, arm watchpoint fixes [4/10]. All failures are fixed, but need
another round of testing to confirm. ARM HW watchpoint doesn't work on
4.0.0 kernel, but I haven't investigated it.
* TCWG-757 [2/10], some patch reviews.
* Misc, meeting, [1/10]
# Plan #
* TCWG-857.
* Upstream the leftover of aarch64 multi-arch patches.
--
Yao
== Progress ==
o Linaro GCC validation (9/10)
* Analysed x86_64 fstack-protector and sse2 issue
* This is due to "ulimit -v" usage which breaks asan testing
and gcc testsuite caching mechanism.
* Validation is now stable on Hetzner/Austin hardware
* Documented summer validation issues
* Validated and committed on-going backports
o Upstream GCC (0/10)
* Armeb OOM fix committed into gcc-5
o Misc (1/10)
* Various meetings
== Plan ==
o 3 days off
o Continue backports/validation
== Progress ==
* Widening pass (TCWG-547) - 8/10
- Fixed all but one execution test failure.
- aarch64 and x86 are clean
- arm has one but this looks like a latent issue (in expand); looking
into it
- Latest trunk with aarch64 miscompiles stage2 fwprop (-fno-forwprop
works).
This happens with the commit 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f
[PR64164] Drop copyrename, use coalescible partition as base when
optimizing
- Tried forcing the same promote_mode for x86_64 and can reproduce it
with x86_64 also.
- Looking into it to see if this is an error with the commit
* Misc - 2/10
- gcc-patches, gcc-bugs list
== Plan ==
- Continue with widening pass
Upcoming Absences
* Away from this Wednesday until next Tuesday, inclusive
Benchmark infrastructure - TCWG-360 [1/10]
* Finished filling out cards/bugs for benchmarking work
* Which makes the rest of this report more fine-grained
Multinode wrapper - TCWG-888 [2/10]
* Completed oversubscription workaround
* Herded most tests through
* Uncovered one significant bug, seems fixed
Centralized source/results storage - TCWG-722 [1/10]
* Wrote draft of source/results handling rules
* Confirmed that old repos are subsets of new repos
Noise control experiments - TCWG-897 [2/10]
* Learned more about firmware, openembedded
* Juno running a minimal filesystem, needs some tidying up
Misc [4/10]
* Including a further 1/10 of ARM management
=Plan=
* Herd remaining multinode tests through
** Depends on LAVA lab coming back from power cut
** And queues being short enough
* Progress (finish?) Juno bring-up
* Share source/results rules draft
== This week ==
* TCWG-833 - Exploit Wide Add operations when appropriate (8/10)
- Reworked patch to use vect_select instead of unspec as requested
by Ramana
- Bugzilla 57195 (mode iterator bug) blocked compiling new pattern
- Created patch which successfully bootstrapped
- Created Bugzilla 67320 - Incorrect standard names for wide
addition (documentation bug)
- Successfully regression tested Aarch32 changes
- Sent Aarch32 patch upstream for review
- Reworking patch for Aarch64 to use vect_select as well
- Created Bugzilla 67321 - [ARM] Exploit wide adds when appropriate
- Created Bugzilla 67322 - [Aarch64] Exploit wide adds when appropriate
* TCWG-834 - Use non-unit stride loads by preference when applicable (1/10)
- Created Bugzilla 67323 to track until fixed in GCC 6 by Richard Biener
- Began writing test case that fails
* Misc (1/10)
- Conference calls
== Next week ==
- Respond to Aarch32 TCWG-833 upstream requests and hopefully check-in patch
- Complete Aarch64 TCWG-833 patch, validate and upstream
- Finish testcase for TCWG-834 and submit upstream
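For readers unfamiliar with TCWG-833, the following is a minimal sketch of one loop shape where a wide add could apply: a narrow element is widened and added to a wider accumulator. The function name and the choice of a 16-bit to 32-bit reduction are assumptions for illustration; the report does not show the actual patch or its test cases.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel, not taken from the TCWG-833 patch or its tests.
   Each 16-bit element is sign-extended to 32 bits and added to a
   32-bit accumulator, so the sum does not wrap at 16 bits. */
int32_t widening_sum(const int16_t *restrict x, size_t n)
{
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum;
}
```

Compiling a function like this with vectorization enabled and inspecting the assembly is one way to see whether wide add instructions are selected; whether they are depends on the compiler version and target options.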
== Progress ==
* Maintenance (CARD-1833 2/10)
- Fixing libc++abi build on AArch64
- Trying to remove a hack in ARMTargetInfo about default CPUs
- Bisecting PR24292
- Working with ARM to fix it, backport
* Buildbots (CARD-1823 8/10)
- Working with Adhemerval on VMA 42bits sanitizer
- Setting up a libc++ "full" build on the second stage, since
gcc can't build it yet. 171 tests fail.
- Ubuntu Vivid (39-bits VA / 4k pages) with Uboot is
the winner of stability, speed and easiness.
- Power outage meant I had to unbreak *a lot* of broken stuff
* Background (2/10)
- Code review, meetings, discussions, etc.
- Some Android/TSAN/AArch64 shenanigans
== Plan ==
Holidays all week
== Issues ==
Yes, it does add up to 12/10, since I had to work most of Saturday and
some of Sunday to fix the mess that the power cuts made on our lab.
If we had generators, that wouldn't have happened.
== This Week ==
* TCWG-835 (2/10)
- Testing on SPEC CPU2006
- Found ICE while testing on spec unrelated to my patch, created
reduced test-case.
* TCWG-777 (2/10)
- Working on RTL pass - enhanced it so it only replicates insn when required.
- Prototype patch works for the test-case but ICEs during toolchain build.
- Trunk generates expected output for -O2, git-bisect shows from r226516
* TCWG-871 (1/10)
- Built for x86_64 non LTO
- Trying to cross compile for ARM
* Sick Leave (2/10)
* Public Holiday (2/10)
* Misc (1/10)
- Meetings
== Next Week ==
- Continue with TCWG-777, TCWG-835, TCWG-871
# Progress #
* TCWG-806, aarch64 remote debugging multi-arch
support. Patches are pushed in. Done. [2/10]
* TCWG-857, HW breakpoint/watchpoint in multi-arch. [3/10]
Both GDB and the linux
kernel need some fixes. Fixed one bug in GDB. Read ARMv8 manual about
byte address select for thumb code (2-byte aligned instruction) and
the implementation in kernel.
* TCWG-757 some patch reviews, [1/10]
* Misc [4/10]
** ARM new starter training on Wed. afternoon and Thu.
** Read early-debug https://gcc.gnu.org/wiki/early-debug
# Plan #
* TCWG-857.
* Update the state of some old linaro tickets on GDB.
--
Yao
Hi,
I compiled the Linaro cross toolchain for an ARMv7 SoC and am using buildroot to create the images. When I use the precompiled toolchain downloaded from the Linaro releases, I do not face any issues. But with the toolchain I compiled, multiple packages fail with errors similar to the one below. I tried building the cross toolchain on both Red Hat and Ubuntu machines. Comparing the output of gcc -v, the compiled toolchain is configured with "--host=x86_64-build_unknown-linux-gnu" and the precompiled one with "--host=i686-build_pc-linux-gnu". I tried with "crosstool-ng-linaro-1.13.1-4.9-2014.08.tar.bz2" and "crosstool-ng-linaro-1.13.1-4.9-2014.09.tar.bz2".
configure:25287: checking for MD5 in -lcrypto
configure:25312: /projects/broadcom-linux/dhananjay/nas/iproc/buildroot/output/host/usr/bin/arm-linux-gnueabihf-gcc -std=gnu99 -o conftest -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -pipe -Os -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 conftest.c -lcrypto >&5
/projects/broadcom-linux/dhananjay/toolchain/tool9/toolchain/lib/ct-ng-linaro-1.13.1-4.9-2014.08/install/lib/gcc/arm-linux-gnueabihf/4.9.2/../../../../arm-linux-gnueabihf/bin/ld: warning: libdl.so.2, needed by /projects/broadcom-linux/dhananjay/nas/iproc/buildroot/output/host/usr/arm-buildroot-linux-gnueabihf/sysroot/usr/lib/libcrypto.so, not found (try using -rpath or -rpath-link)
-bash-4.1$ uname -a
Linux lab-rtp-gw-05 2.6.32-358.23.2.el6.x86_64 #1 SMP Sat Sep 14 05:32:37 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
Can somebody help me here?
Thanks
Dhananjay
Hi Linaro Toolchain Group,
I am trying to compile a cross aarch64-linux-gnu toolchain using a local
directory (tar file) for gcc.
I am using file:/// as mentioned in the https://wiki.linaro.org/ABE.
../abe/abe.sh --target aarch64-linux-gnu --build all --release 20150819 --tarbin gcc=file:///home/vpathak/arm/toolchain/build/snapshots/gcc-2015.11-5.tar.bz2
But I get following error:
ERROR (#146): get_URL (not supported for .tar.* files.)
ERROR (#533): get_source
(file:///home/vpathak/arm/toolchain/build/snapshots/gcc-2015.11-5.tar.bz2
not a valid sources.conf identifier.)
TRACE(#190): checkout ()
ERROR (#193): checkout (No URL given!)
ERROR (#161): checkout_all (Failed checkout out of gcc.)
Am I missing something ? How can we use a local directory (e.g. gcc
source code) for building toolchain using abe ?
Please help.
Thanks.
--
with regards,
Virendra Kumar Pathak
== Progress ==
LLDB development
-- Testing and bug fixing lldb on hikey AArch64 [TCWG-886] [7/10]
-- Submitted following fixes and got them reviewed
-- http://reviews.llvm.org/D11899
-- http://reviews.llvm.org/D11902
-- http://reviews.llvm.org/D11987
Miscellaneous [3/10]
-- Meetings, emails, discussions etc.
-- 14th August Public Holiday
== Plan ==
LLDB development
-- Further progress on fixing test failures on hikey AArch64.
-- Testsuite status compilation for armel and armhf
== Progress ==
* Short week due to GNU Cauldron Travel
* Widening pass (TCWG-547) - 3/10
- Looked at the false positive warning for uninitialized variable
- Have a reduced test-case and looking at ways to fix this
- Split the patch for easy review
- Fixed test-cases and going through all the test-case failures
* Misc - 2/10
- gcc-patches, gcc-bugs list
- Looked at gcc vectorization
== Plan ==
- Widening pass
== Progress ==
o Linaro GCC validation (4/10)
* Validation issues investigation
* Austin's APM board seems more reliable
* New validation instability on x86_64 discovered:
- Analysis on-going but it is related to fstack-protector
* Waiting for VPN access
o Upstream GCC (4/10)
* Reworked OOM issue on armeb-linux-gnu target fix
* Patch committed on trunk
* Backport on gcc-5 validated and about to be committed
o Misc (2/10)
* Various meetings
* Team member support
== Plan ==
o Continue on validation
o Redo FSF branch merge after big-endian fix committed.
Benchmark infrastructure - TCWG-360 [6/10]
* Fixed remaining known critical issues in multinode
** Testing stymied by LAVA oversubscription
** Attempted to implement some workarounds for oversubscription
* Some discussion/investigation on generating filesystems for benchmarking
* Started brain-dumping state of infrastructure into cards.linaro.org
Bringing up Juno for noise control experiments [1/10]
* Some struggles with firmware/bootloader configuration
Misc [3/10]
Featuring some ARM management duties
= Plan =
* Get benchmarking tests through, if LAVA permits
* Fix any critical issues discovered
* Finish brain-dump into cards.l.o
* Bring up Juno for noise control experiments
* Some more ARM management
== This week ==
* TCWG-140 - Transform end of loop conditions to min_expr (4/10)
- Wrote initial test case as requested by maintainers
- Need to determine how to not run test case on targets with no
MIN_EXPR/MAX_EXPR
* TCWG-833 - Exploit Wide Add operations when appropriate (4/10)
- Fixed internal error by using unspec for vector sign/zero-extend
- Fixed multiple issues with test cases
- Investigating why gcc.dg/vect/slp-reduc-3.c with lto is regressing
for aarch32 and aarch64
- Investigating two additional regression failures on aarch64
* TCWG-834 - Use non-unit stride loads by preference when applicable (1/10)
- Spoke with Richard Biener at GNU cauldron.
- He indicated this was an interaction problem, with the
vectorizer not attempting multiple strategies
- He indicated that he planned to fix this for GCC 6 and that I should
not attempt a fix
- My plan is to write a test case that fails until the issue is
addressed in GCC 6
* Misc (1/10)
- Conference calls
== Next week ==
- Resolve testsuite regressions for TCWG-833
- Create bugzilla reports for TCWG-833 and TCWG-834
- Write testcase for TCWG-834 and submit upstream
- Investigate TCWG-832 as time permits
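As background for TCWG-834, here is a minimal sketch of a loop with non-unit stride accesses, the kind of pattern a structure load such as vld3 is designed for. The interleaved three-channel layout and the function name are assumptions chosen for illustration; the report does not show the loop under investigation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel: the input holds interleaved triples
   (for example R, G, B bytes), so each channel is read with stride 3. */
void sum_channels(const uint8_t *restrict rgb, uint16_t *restrict out,
                  size_t pixels)
{
    for (size_t i = 0; i < pixels; i++) {
        uint16_t r = rgb[3 * i];
        uint16_t g = rgb[3 * i + 1];
        uint16_t b = rgb[3 * i + 2];
        out[i] = (uint16_t)(r + g + b);
    }
}
```

A vectorizer can handle this either with a de-interleaving load or with unit-stride loads followed by permutes; which strategy is chosen is the decision the report refers to.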
== This Week ==
* TCWG-777 (4/10)
- Working on RTL pass, prototype patch works for the test-case.
- After updating trunk to r226907, it appears combine can fold
arm_andsi3_insn/arm_cmpsi_insn
into zeroextractsi_compare0_scratch at -O2 but not at -O1.
* TCWG-835 (2/10)
- Testing patch with Spec 2006
- Found ICE unrelated to my patch
* Misc (4/10)
- Travel to home from GNU Tools Cauldron 2015, Prague.
== Next Week ==
- Continue working on TCWG-777
- Continue testing TCWG-835 patch with SPEC 2006
== Progress ==
* Buildbots (CARD-1823 6/10)
- Setting up new APMs
- Running around for a decent UEFI, Kernel, Distro
- Testing different builds, configurations
- Managed to get one APM doing the test-suite *only*! sigh...
- http://buildmaster.tcwglab.linaro.org/builders/clang-cmake-aarch64-lnt
* Background (4/10)
- Code review, meetings, discussions, etc.
- Doing more reviews than usual due to buildbot work being erratic
- Laptop is playing stupid again
== Plan ==
* Moar bots
* Moar code review
* Maybe some vodka, to wash it all up
== Progress ==
LLDB development
-- Testing and bug fixing lldb on hikey AArch64 [TCWG-886] [9/10]
-- Debugging of watchpoint test failures and problems with multiple
watchpoints.
-- Clean-up of AArch64 watchpoint code and fixing watchpoint cache bugs.
Miscellaneous [1/10]
-- Meetings, emails, discussions etc.
== Plan ==
LLDB development
-- Submit and get review on bug fixes for lldb on hikey AArch64 code changes.
-- Investigate further lldb test failures on hikey AArch64.
Miscellaneous
-- Friday 14th August, Independence day public holiday in Pakistan.
== Progress ==
o Linaro GCC validation (4/10)
* Look at the various validation issues we have
* Dug into the logs, restarted jobs, asked to restart
builders/testers. etc ...
o Upstream GCC (5/10)
* Investigate OOM issue on armeb-linux-gnu target
* Found the issue and raised an upstream bugzilla:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67127
* Submitted a fix for it
o Misc (1/10)
* Various meetings
== Plan ==
o Continue validation babysitting
== Progress ==
LLDB development
-- Setup lldb test debug environment for testing failing tests on
chromebook. [1/10]
-- Debug various lldb hang scenarios on chromebook for armhf support
[TCWG-855] [4/10]
-- Figure out alternate fix for http://reviews.llvm.org/D11129.
[TCWG-855] [3/10]
Miscellaneous [2/10]
-- Fix chromebook malfunction due to battery issues
-- Meetings, emails, discussions etc.
== Plan ==
LLDB development
-- Debug lldb test cases on armhf to figure out problems in
individual test-cases
-- Create an updated lldb development project plan on JIRA.
-- Update collaborate with steps on LLDB development process.
== Progress ==
* Widening pass (TCWG-547) - 8/10
- Handled review comments.
- Improved CONVERT_EXPR handling.
- Rebased and retested - found some failures.
- This was due to VRP reusing the range info computed in VRP1 after
type promotion is applied. Invalidating the range info when type is
promoted.
- ARM with type promotion now improves (-O2) by about 2.8% with one tiny
benchmark (where it used to regress).
- Still some test cases failures (not execution failures but scanning
certain patterns in the dump/asm).
* Misc - 2/10
- Connect slides.
- gcc-patches, gcc-bugs list
- Meetings
== Plan ==
- Widening pass
== This week ==
* TCWG-146 - Detect smin/umin idiom (1/10)
- Incorporated final feedback and submitted code in SVN
* TCWG-140 - Transform end of loop conditions to min_expr (1/10)
- Misc. discussion upstream about where the optimization
should be performed
- Upstream maintainers asked me to create gimple ir test case
* TCWG-833 - Exploit Wide Add operations when appropriate (1/10)
- Unable to validate yet
* TCWG-834 - Use non-unit stride loads by preference when applicable (5/10)
- Further Aarch32 investigation to determine where the decision to forgo
vld3 is being made
* Linaro connect preparation (1/10)
* Misc (1/10)
- Conference calls
== Next week ==
- Validate patches for TCWG-833 and submit upstream
- Further TCWG-834 investigation
- GNU Cauldron conference and travel
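One reading of the TCWG-140 title ("Transform end of loop conditions to min_expr") is that a loop guarded by two upper bounds can be rewritten to test a single precomputed minimum. The sketch below shows the two equivalent forms at source level; the function names are hypothetical, and the actual transformation in the patch works on the compiler's intermediate representation, which the report does not show.

```c
#include <stddef.h>

/* Original form: the exit test checks both bounds on every iteration. */
size_t copy_two_bounds(int *dst, const int *src, size_t n, size_t m)
{
    size_t i;
    for (i = 0; i < n && i < m; i++)
        dst[i] = src[i];
    return i;
}

/* Rewritten form: one comparison against min(n, m), computed once. */
size_t copy_min_bound(int *dst, const int *src, size_t n, size_t m)
{
    size_t limit = n < m ? n : m;
    size_t i;
    for (i = 0; i < limit; i++)
        dst[i] = src[i];
    return i;
}
```

Both functions copy the same elements and return the same count for any n and m; the second has a single loop exit condition, which is generally easier for later loop optimizations to analyse.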
== This Week ==
* TCWG-835 (4/10)
- Validated and submitted patch upstream for review.
- Made changes to patches according to upstream reviews.
- Microbenchmarks: http://pastebin.com/tDnHZuG5
* TCWG-777 (5/10)
- Investigating different ICEs caused by my gimple remove-temps pass
- Looked at expansion of GIMPLE_COND
- Trying to write rtl version of remove-temps pass
* Misc (1/10)
- Conference calls
== Next Week ==
- Continue with TCWG-777
- GNU Tools Cauldron 2015
== Progress ==
* Performance (CARD-1832 1/10)
- Checking differences of PostRAListSched on OOO ARM cores
- Not many changes, ignoring for now
* Maintenance (CARD-1833 4/10)
- Building libc++/abi/unwind in LLVM/Clang tree
- Getting -Wa,-mfpu patches in, last important Clang driver ARM bug
- Some patches to get libunwind and libc++ to compile in-tree on ARM
- Fixed native sub-features detection (http://llvm.org/PR12794)
* Background (5/10)
- Code review, meetings, discussions, etc.
- Long discussions about TargetTuple/TargetParser/Triple
- Lots of patch reviews this week (I mean, *A LOT*)
- Moving some machines around, checking for Chromebook batteries
- Setting up cross-builder using multiarch / QEMU
- Some future planning
== Plan ==
* Look for some more performance issues in 3.7
* Try to hook up the cross-builder
* Investigate libc++ check-all failures
Benchmark infrastructure - TCWG-360 [8/10]
* Testing found many problems in multinode
* Iterating to solutions
Misc [2/10]
=Plan=
Holiday next week.
Then back to fixing multinode, incorporating into jenkins, noise
control experiments
Hi Linaro Toolchain Group,
I am building a native toolchain for aarch64 with below configurations:
--build=x86_64-unknown-linux-gnu --host=aarch64-linux-gnu
--target=aarch64-linux-gnu.
In copy_gcc_libs_to_sysroot(), which copies libgcc.a to the sysroot, the current
implementation tries to find the absolute path of libgcc.a as below:
libgcc="`${local_builds}/destdir/${host}/bin/${target}-gcc
-print-file-name=${libgcc}`
But the above line (i.e. gcc -print-file-name) cannot execute on x86_64, as
the toolchain is a native toolchain for aarch64-linux-gnu. Thus an infinite
loop is created in the copy command, i.e. copying directory x into x.
However, when I hard-coded the libgcc.a path on my machine (as below),
everything went fine.
libgcc="/home/vpathak/arm/toolchain/build_abe_new/builds/destdir/aarch64-linux-gnu/lib/gcc/aarch64-linux-gnu/5.1.1/libgcc.a"
I think this is a bug in ABE build infrastructure.
Thanks.
--
with regards,
Virendra Kumar Pathak
* TCWG-806, aarch64 remote debugging multi-arch support. [4/10]
Patches are done. Need to test them and polish them.
Fix various multi-arch issues when --wrapper is used in GDBserver.
Patches are pushed in to mainline.
Could you describe this activity in more detail?
Is the goal here to support mixed aarch32/aarch64 in the same GDB binary
and detect the change at runtime?
Thanks.
-Duane
== Progress ==
* Factor conversion out of COND_EXPR - TCWG-849 (5/10)
- Iterated through the review and more testing
* Looked at widening pass and the test-case from Wilco (1/10)
* Misc (2/10)
- Connect slides.
- gcc-patches, gcc-bugs list
- Meetings
* Sick (2/10)
== Plan ==
- GCC Bugs
- Widening pass
- Linaro bug 1318
== This week ==
* TCWG-146 - Detect smin/umin idiom (1/10)
- Made change recommended upstream and resubmitted
* TCWG-140 - Transform end of loop conditions to min_expr (1/10)
- Validated and submitted upstream
* TCWG-833 - Exploit Wide Add operations when appropriate (5/10)
- Added early clobber and forced operand 0 and operand 2 to match
- Finished Aarch32 by using mode iterators
- Developed patch for Aarch64
- Wide add instructions are now emitted for both Aarch32 and Aarch64
* TCWG-834 - Use non-unit stride loads by preference when applicable (2/10)
- Further Aarch32 investigation
* Misc (1/10)
- Conference calls
== Next week ==
- Validate patches for TCWG-833 and submit upstream
- Further TCWG-834 investigation
- Linaro connect presentation preparation
* TCWG-835 (6/10)
- Looked at Newton-Raphson method
- Need to write new md pattern that matches sdiv_optab for modes == v2sf, v4sf
- First attempt for patch: http://pastebin.com/NKy8WdWC
* TCWG-830 (2/10)
- Ran Charles's benchmarks on ARM and AArch64.
- Investigating testsuite fallout for ARM patch.
- Still blocked by permissions to do benchmarking
* Misc (2/10)
- Conference Calls
- US visa collection
== Next Week ==
- Continue with TCWG-830, TCWG-835, TCWG-777
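The Newton-Raphson method mentioned under TCWG-835 above refines an estimate x of 1/d with the step x' = x * (2 - d * x), and the quotient n / d is then n * x. The scalar sketch below illustrates the arithmetic only; the function names, the caller-supplied starting estimate and the step count are assumptions, and it is not the md pattern or the patch linked in the report. On NEON the starting estimate would typically come from a reciprocal estimate instruction rather than from the caller.

```c
/* One Newton-Raphson refinement of x as an approximation of 1/d:
   x' = x * (2 - d * x). */
float nr_recip_step(float d, float x)
{
    return x * (2.0f - d * x);
}

/* Approximate n / d starting from a rough estimate x0 of 1/d.
   x0 and steps are illustrative parameters, not values from the patch. */
float nr_divide(float n, float d, float x0, int steps)
{
    float x = x0;
    for (int i = 0; i < steps; i++)
        x = nr_recip_step(d, x);
    return n * x;
}
```

Each step roughly doubles the number of correct bits as long as the starting estimate is close enough to 1/d, which is why a small fixed number of steps is usually enough for single precision.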
Hi Linaro Toolchain Group,
I am trying to learn the 'decoding decision tree' for aarch64 in binutils
by trying to add a new assembly instruction 'addvp'.
For example: addvp x0, x0, 9
For this, I added an entry in struct aarch64_opcode aarch64_opcode_table[]
(file opcodes/aarch64-tbl.h) as below:
{"addvp", 0x01000000, 0x7f000000, addsub_imm, 0, CORE, OP3 (Rd_SP, Rn_SP,
AIMM), QL_R2NIL, F_SF},
The ARM manual says bits 27 & 28 are unallocated. Thus for addvp, I am
giving opcode 01000000 (with bits 27 & 28 as 0).
With this, generating the object file from the assembly file is successful (test.s
--> test.o); but while disassembling using objdump, it reports an undefined
instruction.
From objdump log:
81002400 .inst 0x81002400 ; undefined
(but the instruction was generated correctly, i.e. 81002400 !!!).
I know that since addvp is a hack instruction, it won't execute on a cpu. But
disassembly should still succeed.
1. Please help me in knowing what I am doing wrong here ? What else I
should do to add a new instruction in binutils ?
2. I also saw some printf calls in opcodes/aarch64-gen.c which I guess create
the decoding tree (initialize_decoder_tree()). How do I print them? I set
debug = 1 but still nothing is printed.
3. There are some auto-generated files
like aarch64-asm-2.c, aarch64-dis-2.c. How to re-generate them ?
Thanks.
--
with regards,
Virendra Kumar Pathak
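As a sanity check on the addvp question above, the opcode and mask from the posted table entry can be tested against the emitted word 0x81002400 with the usual (insn & mask) == opcode comparison. The struct below is a hypothetical, cut-down stand-in for one row of aarch64_opcode_table[]; the real struct aarch64_opcode has more fields, and this sketch does not call any binutils code.

```c
#include <stdint.h>

/* Hypothetical stand-in for one opcode table row. */
struct opcode_entry {
    const char *name;
    uint32_t opcode;  /* values of the fixed bits of the encoding */
    uint32_t mask;    /* which bits of the instruction word are fixed */
};

/* Values copied from the table entry quoted in the post. */
static const struct opcode_entry addvp_entry = {
    "addvp", 0x01000000u, 0x7f000000u
};

/* An entry matches when every fixed bit of the word has the expected value. */
int entry_matches(const struct opcode_entry *e, uint32_t insn)
{
    return (insn & e->mask) == e->opcode;
}
```

With these values entry_matches(&addvp_entry, 0x81002400u) is true, so the table entry itself is consistent with the emitted word. That suggests the "undefined" result comes from somewhere other than the opcode and mask, possibly the auto-generated decoder files the post mentions, though nothing in the post confirms this.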
== Progress ==
* Maintenance (CARD-1833 5/10)
- Building libc++/abi/unwind in LLVM/Clang tree
- Fixing some build errors (D11486)
- Addressing comments to submissions from last week
- Committing approved ones
- Re-working the others
* Releases (CARD-1431 1/10)
- Building 3.7.0-RC1 on ARM and AArch64, uploading
* Benchmarks (CARD-716 2/10)
- Running LNT, SPEC and EEMBC on ARM and AArch64 for 3.7.0
* Background (2/10)
- Code review, meetings, discussions, etc.
- Upgraded APM to Debian, kernel 3.16
- Perf still segfaults. :(
== Plan ==
* Finish open reviews
* Continue getting libc++ to build and pass the tests in tree
* Look at some of the performance regressions in 3.7
# Progress #
* TCWG-806, aarch64 remote debugging multi-arch support. [4/10]
Patches are done. Need to test them and polish them.
Fix various multi-arch issues when --wrapper is used in GDBserver.
Patches are pushed in to mainline.
* TCWG-876 [1/10]
Re-run GDB testsuite with incoming Linaro toolchain release.
Everything looks OK.
* TCWG-860, aarch64 fast tracepoint. [1/10]
Polish the patches, and ready for submission.
* TCWG-757, upstream patch review. [2/10].
* Misc, meeting. [2/10]
# Plan #
* TCWG-806, test patches on different targets, polish patches
and post them for review.
# Absence #
06th Aug - 10th Aug, GNU Tools Cauldron.
11th Aug - 14th Aug, Holiday.
--
Yao
Benchmark infrastructure - TCWG-360 [6/10]
* Some user support/bugfixing/bugraising
* Multinode job more or less working (not fully tested)
* Additional restructuring got rid of some more complexity
** Though if my simplifying assumption doesn't hold, I'll have to put it back
Benchmarking 101 presentation [2/10]
* Ran through slides with Ryan & Maxim
* Removed many slides
* Collected up and categorized the removed slides
** Probably will go into future presentation(s)
Misc [2/10]
=Plan=
* Tweak multinode a little more
* Integrate multinode into Jenkins
** To the extent that I'm comfortable with the security
* Read a bit about some benchmarks that aren't SPEC
* Start noise control experiments (may inform presentation)
=Week After Next=
Holiday
* One day off - Bastille day (2/10)
== Progress ==
o Upstream GCC (3/10)
* Finalized and committed fix in trunk for Linaro bug #416
o Linaro GCC release (4/10)
* Reviewed and did more patches for tcwg-release script
* Still investigating validation issues.
* Prepared FSF branch merge into Linaro GCC 5 branch
o Misc (1/10)
* Various meetings
== Plan ==
- Summer Holidays (2 weeks)
== Progress ==
* Add REG_EQUAL note for arm_emit_movpair (1/10)
- Patch2 ok to commit.
- Ran complete validation.
- Found an issue and posted a patch to fix
* Factor conversion out of COND_EXPR - TCWG-849 (6/10)
- Found a performance regression in tree-ssa-reassoc
- Looked at the tree-ssa-reassoc code to see possible fixes
- Posted an RFC patch
* PR66865
- Wine segfaults from gcc in trunk (r225757)
- Reproduced it, but it turned out not to be caused by my commit
- Fixed by other PR
* Misc (2/10)
- Looked at interaction between gcc optimization passes
- gcc-patches, gcc-bugs list
- Meetings
== Plan ==
- GCC Bugs
- TACT driven optimization exploration for gcc
- Linaro bug 1318
Benchmark infrastructure - TCWG-360 [5/10]
* Worked through my Jenkins issues with Fathi, raised some tickets at him
* Converting LAVA end into multinode job
** Having some trouble with multinode API
Benchmarking 101 presentation [3/10]
* 1/2 day of discussions/reading, full day of redrafting
* Looked for Michael Hope's similar 2012 presentation
** Found slides, not video
=Plan=
* Complete multinode job
* Integrate into Jenkins to the extent that I'm comfortable
* Complete 'shareable' draft of benchmarking-101
** And see if I have enough left over for -102, maybe -103
== This week ==
* TCWG-140 - Transform end of loop conditions to min_expr (1/10)
- Blocked waiting on validation
* TCWG-833 - Exploit Wide Add operations when appropriate (7/10)
- Developed patch to handle signed and unsigned cases for Aarch32
- Investigation and debugging into support for Aarch64
* TCWG-834 - Use non-unit stride loads by preference when applicable (1/10)
- Initial Aarch32 investigation
* Misc (1/10)
- Conference calls
== Next week ==
- Validate Aarch32 patch for TCWG-833
- Develop Aarch64 patch for TCWG-833
- Validate TCWG-140
- Make recommended fixes to TCWG-146 and resubmit upstream
* TCWG-777 (3/10)
- O2 workaround: -fno-tree-pre -fno-tree-fre -fno-tree-dominator-opts
-fno-gcse -fno-peephole2
- Observing rtl dumps for gcse, combine, peephole2 with different
options and optimization levels.
- Continued investigating ICE during gcc build with my pass applied.
- Sent mail to tcwg, for further suggestions
* TCWG-830 (2/10)
- Verified the behavior for aarch64, and extended the patch for aarch64
along the same lines.
- Running Charles's microbenchmarks on r1-a7
- Benchmarking setup with Bernie. Blocked by permissions, sent a mail
to lava-lab for granting requisite permissions
* TCWG-835 (2/10)
- observing vector and asm dumps
* Misc (3/10)
- Travel to Mumbai for US Visa Interview
- Conference Calls
== Next Week ==
- Continue with TCWG-777, TCWG-835, TCWG-830
I'm trying to build the toolchain as win32 executable on Ubuntu with ABE.
I'm pretty new with ABE. I followed the FAQ
https://wiki.linaro.org/WorkingGroups/ToolChain/FAQ and Rob's post. Also
checked the MakeRelease.job and slave.sh. I have all packages listed in the
slave.sh installed. So I assume I have all dependencies ready for the build.
Here is what I have done:
Create _build subfolder beside abe
CD to _build and run: ../abe/configure --with-fileserver=148.251.136.42
--with-remote-snapshots=/snapshots-ref
First build this: ../abe/abe.sh --target aarch64-none-elf -build all
It installed the toolchain to
_build/builds/destdir/x86_64-unknown-gnu-linux. I added the bin under it to
my PATH
Then do 2nd round build: ../abe/abe.sh -host i686-w64-mingw32 --target
aarch64-none-elf -build all
However, I'm getting a configure error while it is building libiberty:
configure:5946: checking for library containing strerror
configure:5978: error: Link tests are not allowed after GCC_NO_EXECUTABLES.
My understanding is that the linker cannot find glibc or eglibc.
What have I missed?
Is there anywhere I can find detailed step-by-step instructions for building the
Linaro toolchain to run on a Windows host?
Sincerely,
Qyq
== Progress ==
* Maintenance (CARD-1833 4/10)
- Clang driver:
- Passing -Wa,-mfpu and friends to assembler (D11147, D11148)
- Passing -I to assembler (D11185)
- Don't include libgcc/asm if using libunwind/libc++abi (D11153)
- Asm warnings:
- Trying again to look for a way to disable asm warnings from clang (D11216)
* Benchmarks (CARD-716 3/10)
- Benchmarking shrink-wrapping in AArch64
- Setting up LNT Benchmarks on A32/A64
- Scripts to collate / compare LNT results on the fly
- Benchmarking LNT on 3.5.2 and 3.6.2 on ARM and AArch64
- Multisampling, perf, and all goodness
- Getting ready to compare with 3.7.0 to come
* Releases (CARD-1431 1/10)
- Spinning release 3.7.0
- Many changes, CMake builds, etc.
- Fixing the test-release.sh script (D11326)
* Background (2/10)
- Code review, meetings, discussions, etc.
- Upgrading APM to Debian 3.16
== Plan ==
* Upstreaming pending reviews
* Continue release 3.7.0, benchmark it
* Start looking at the effects of the stride vectorizer on ARM/AArch64
# Progress #
* TCWG-806, aarch64 remote debugging multi-arch support. [6/10]
Some code refactor and fix various multi-arch issues when --wrapper
is used in GDBserver. Patches are being tested.
* TCWG-757, Patches review. [2/10]
* Misc, meeting, [2/10]
# Plan #
* TCWG-806, aarch64 remote debugging multi-arch support.
# Absence #
* 06th Aug - 10th Aug, GNU Tools Cauldron.
* 11th Aug - 14th Aug, Holiday.
--
Yao
1 day off (2/10)
== Progress ==
* backports/release/infra (1/10)
- reviews
* GCC (3/10)
- posted patch to fix vget_lane on armeb
- investigating AdvSIMD failures on aarch64_be.
Having a way to debug target code would help (qemu does not seem
to support aarch64_be yet, and I use the foundation model in bare
metal mode)
* Misc (4/10)
- meetings, conf-calls, emails
== Next ==
Holidays until Aug 3rd.
Benchmark infrastructure - TCWG-360 [5/10]
* More thinking/prototyping sufficiently-secure Jenkins benchmarking
* Converting LAVA end into multinode job
Benchmarking presentation [2/10]
* A couple of helpful discussions
* Read a couple of helpful docs
Misc [3/10]
=Plan=
* Complete multinode job
* Settle on a plan for Jenkins
* Redraft presentation
== This Week ==
* TCWG-777 (4/10)
- Resolved ICE caused by pass during gcc build but hit another ICE:
http://pastebin.com/RUAY6scB
- Current pass state: http://pastebin.com/AGXnSkrZ
- For test-case:
void f(int flags)
{
  void foo(void);
  if (flags & 1)
    foo();
}
- temporaries don't exist for -O1
- for -O2 temps introduced by peephole2 due to define_peephole2
pattern in thumb2.md:1540
http://pastebin.com/3rEF8Te4
So this intentionally transforms rtx from
zeroextractsi_compare0_scratch to rtx from shiftsi3_compare0_scratch.
Why is it beneficial to do this transform ?
- Looking into combine pass
- The above test-case works with -marm at -O2.
* TCWG-830 (3/10)
- trying to understand vect dump
- untested patch: http://pastebin.com/K4UX5iYz
* Misc (2/10)
- Started looking at TCWG-835, loop vectorized on x86 but not arm
- Committed fix to segfault on -dx
- Conference calls
== Next Week ==
- Continue with TCWG-777, TCWG-830, TCWG-835
- Travel to Mumbai on 14th July (Tuesday) for US Visa Interview with
US Consulate.
Hospital and physio (2/10)
== Progress ==
o Upstream GCC (2/10)
* More work on ongoing patches
o Linaro GCC release (3/10)
* Reviewed patches for tcwg-release script
* Looked at validation issues and redid backports for 5.1
o Misc (3/10)
* Various meetings
* Upstream libunwind support
== Plan ==
- Continue ongoing tasks
== Progress ==
LLDB development
-- Debugging problems with process launch and debugserver crash on remote
connection. [TCWG-855] [6/10]
-- Caught the notorious issue mentioned above; the fix can be found here:
http://reviews.llvm.org/D11129.
-- Figured out arm lldb and lldb-server host builds on chromebook; will put
the steps on the Collaborate LLDB page soon. [1/10]
Miscellaneous [3/10]
-- Travel to Islamabad for Czech Republic visa
-- Meetings, emails, discussions etc.
== Plan ==
LLDB development
-- Follow up review process for process launch bug fix.
-- Run testsuite for armhf and figure out issues to fix.
-- Submit patches to fix build on older version of gcc.
Eid Holidays 17th to 21st July 2015
== This week ==
* TCWG-146 - Detect smin/umin idiom (0/10)
- Waiting for upstream approval/review
* TCWG-140 - Transform end of loop conditions to min_expr (1/10)
- Blocked waiting on validation
* TCWG-833 - Exploit Wide Add operations when appropriate (8/10)
- More detailed investigation
- Working theory is to develop wide add rtl patterns that
incorporate vec_unpack to widen 16-bit to 32-bit
* Misc (1/10)
- Conference calls
- Conference call with Charles and Prathamesh to discuss
autovectorization progress
== Next week ==
- Validate patch for TCWG-140
- Develop patch for TCWG-833
- Validate TCWG-833 if successful patch is developed
- Investigate Aarch64 implementation
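The kind of loop TCWG-833 is concerned with can be illustrated with the minimal C example below, in which 16-bit elements are widened and accumulated into a 32-bit sum. This is an illustrative test case written for this note, not one taken from the ticket. NEON has widening adds for this shape (VADDW on AArch32, UADDW/SADDW on AArch64), but whether a given GCC version actually emits them depends on the target and options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: unsigned 16-bit elements accumulated into a
   32-bit sum.  Each element is widened from 16 to 32 bits before
   the add, which is the operation a wide add performs per lane.  */
static uint32_t sum_u16(const uint16_t *a, size_t n)
{
    assert(a != NULL || n == 0);
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Signed variant: elements are sign-extended before the add.  */
static int32_t sum_s16(const int16_t *a, size_t n)
{
    assert(a != NULL || n == 0);
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```

Compiling such a function with vectorization enabled and a vectorizer dump option (for example -fdump-tree-vect-details) is one way to see which patterns were chosen.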
== Progress ==
* Add REG_EQUAL note for arm_emit_movpair (1/10)
- committed patch1 after testing again
* Factor conversion out of COND_EXPR - TCWG-849 (5/10)
- Gone through couple of iterations and committed the patch
- There are still some improvements needed as follow-up patches
* TACT -TCWG-851 (2/10)
- Started looking into spec2k
* Misc (2/10)
- Looked at interaction between gcc optimization passes
- gcc-patches, gcc-bugs list
- Meetings
== Plan ==
- GCC Bugs
- TACT driven optimization exploration for gcc
- Linaro bug 1318
== Progress ==
* Releases (CARD-1431 1/10)
- Released 3.6.2-final
* Maintenance (CARD-1833 5/10)
- Reducing runtime of some benchmarks in LLVM's
test-suite by getting rid of millions of useless
fprintf calls.
- Working on https://llvm.org/PR20700 some more
* Background (4/10)
- Code review, meetings, discussions, etc.
- Long TargetTuple review (D10969) / discussions
- Replacing broken buildbot USB disks (need to buy more)
- Bisecting self-hosting bot breakage
- Testing patches for ARM
- Jira farming
== Plan ==
* continue PR20700
* continue review/discussion of TargetTuple
* look again at PR20757
* maybe look at PR21000
* Off Monday (2/10)
== Progress ==
* published linaro-4.8 and 4.9 2015.06 releases
* linaro-5.1-2015.07 (1/10)
- backport reviews
- updated my helper script for reviews to cope with the git-only branches
* upstream (1/10)
- started looking at vget_lane Neon intrinsic failure on armeb
* infra/release/backports (2/10)
- reviews
* Misc (4/10)
- meetings, conf-calls, emails
== Next ==
* Off Tuesday
* backports, release, validation: update doc
* backports, reviews
* upstream work
== Later ==
* Off July 18th-Aug 3rd
Hi Linaro Toolchain Group,
I am comparing execution time (run time) of sin() trigonometric function
between following glibc (including libm) libraries for aarch64 (juno cortex
a57) :
Linaro glibc 2.19, Linaro eglibc 2.19, eglibc 2.19 (from
http://www.eglibc.org/) and Linaro glibc 2.21.
My observation for execution time of sin():
with Linaro glibc 2.19 and eglibc 2.19 = 1m24.703s (approx)
whereas,
with Linaro eglibc 2.19 & Linaro glibc 2.21 = 0m25.243s (approx)
Has Linaro optimized the libm functions for aarch64 in Linaro eglibc 2.19 ?
If yes, please point me to relevant reference from where I can find more
information on them.
Since the eglibc development from version 2.19 has stopped, will Linaro
maintain its own development version of glibc ?
I am using the code snippet below and the Linux 'time' command to measure the
time.
void sin_func(void)
{
  double incr = 0.732;
  double result, count = 0.0;
  printf("%s\n", __func__);
  while (count < 105414350.0) {
    result = sin(count);
    count += incr;
  }
}
Thanks.
--
with regards,
Virendra Kumar Pathak
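One caveat with the loop in the mail above is that result is never used, so an optimizing compiler may drop or simplify the calls. The sketch below is one way to guard against that: it accumulates the results through a volatile variable and measures CPU time inside the program with clock(). The names time_unary and square are hypothetical, and square is only a stand-in so the example needs no math library.

```c
#include <assert.h>
#include <time.h>

typedef double (*unary_fn)(double);

/* Calls fn(x) for x = 0, incr, 2*incr, ... while x < limit.
   The results are summed through a volatile so the calls cannot be
   optimized away.  Returns the CPU time used, in seconds; the sum is
   stored in *sum_out when sum_out is not NULL.  */
static double time_unary(unary_fn fn, double limit, double incr, double *sum_out)
{
    assert(fn != NULL && incr > 0.0);
    volatile double sink = 0.0;
    clock_t start = clock();
    for (double x = 0.0; x < limit; x += incr)
        sink += fn(x);
    clock_t end = clock();
    if (sum_out != NULL)
        *sum_out = sink;
    return (double) (end - start) / CLOCKS_PER_SEC;
}

/* Stand-in function for illustration only.  */
static double square(double x)
{
    return x * x;
}
```

To reproduce the measurement from the mail, one would pass sin from <math.h> with the same bounds, as in time_unary(sin, 105414350.0, 0.732, &sum), and link with -lm.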
== Progress ==
LLDB development [TCWG-855] [8/10]
-- Figure out build steps for building cross lldb-server with
arm-linux-gnueabihf-g++
-- Debugging of lldb-server communication packets for fixing lldb-server
armhf crash problem.
-- Comparison with androidabi version to figure out missing pieces
Miscellaneous [2/10]
-- Ubuntu reinstall on laptop
-- Follow up on Czech Republic visa
-- Meetings, emails, discussions etc.
== Plan ==
LLDB development
-- Work on lldb-server for armhf and try to figure out crash problems
[TCWG-855]
End of sick leave (will work 100% from home until my cast is removed).
== Progress ==
o Upstream GCC (2/6)
* Back to ongoing patches
o Linaro GCC release (7/6)
* Reviewed FSF branch merge into 4.8/4.9 branches
* Reviewed patches for tcwg-release script
* Sent a first batch of backports for 5.1
Still pending due to validation infra. issues
o Misc (1/6)
* Various meetings
== Plan ==
- Continue ongoing tasks
* One day off on Fri. [2/10]
# Progress #
* TCWG-805, aarch64 native debugging multi-arch support. [5/10]
Patches (part1) are posted upstream for review, need to rewrite some
of them. The rest of them are OK and can be pushed in after 7.10
branch is created.
Watchpoint support in multi-arch debugging. Both kernel and GDB need
some fixes. Ongoing.
* Complete the document of aarch64 tracepoint work. [1/10]
* FSF GDB. [2/10]
Review intel mpx patch again, and read something on intel
mpx stuff.
# Plan #
* TCWG-805, update some patches in part 1 patch series, and continue
the multi-arch watchpoint work.
--
Yao
== Progress ==
* Add REG_EQUAL note for arm_emit_movpair (1/10)
- Updated and reposted
- https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00295.html
- https://gcc.gnu.org/ml/gcc-patches/2015-06/msg02066.html
* Factor conversion out of COND_EXPR - TCWG-849 (3/10)
- https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00246.html
* TACT -TCWG-851 (2/10)
- Small examples now seem to work.
- Have to do cross testing
* Git work flow for upstream patches -TCWG-848 (1/10)
- Updated based on review
* Misc (3/10)
- Looked up LLVM documents
- Looked at the TODO list Renato provided
- gcc-patches, gcc-bugs list
- Meetings
== Plan ==
- GCC Bugs
- TACT driven optimization exploration for gcc
- Linaro bug 1318
== This Week ==
* TCWG-856 (2/10)
- submitted patch to flatten cfgloop.h:
https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00277.html
* TCWG-777 (4/10)
- Modified pass to not generate redundant stores
- Investigating ICE caused by the pass during gcc build
- Discussions for possible approaches with Christophe and Kugan
- Reading thru documentation on optabs and ccmp patches
* Misc (4/10)
- Patch sent upstream which fixes segfault in gcc for -dx option.
- Filed upstream binutils bug for "branch range out of error"
- Conference calls
- Travel to Mumbai for US Visa OFC appointment
== Next Week ==
- Work towards committing cfgloop.h flattening patch
- Continue working on TCWG-830, TCWG-777, TCWG-847
== Progress ==
* Maintenance (CARD-1833 4/10)
- ADD/SUB with negative immediates solved by a year old
patch from ARM, sigh. On to the next bug... :(
- Working on https://llvm.org/PR20700
* Buildbots (CARD-1823 2/10)
- Moving benchmark bot to CMake, fixing deepcopy bug in
environment that broke new builds
- Restarting a few bots that crashed
* Background (4/10)
- Code review, meetings, discussions, etc.
- A lot of code review this week...
- Blocking disrespectful web spiders in llvm.org
- Emacs now almost works as I expect
== Plan ==
* Continue PR20700
* Have a look at Polybench
* Look for some more bugs to fix
Benchmarking presentation [7/10]
* More reading
* Ran through a couple more drafts
Misc [3/10]
* Featuring a bug in my backup scripts that took ~1/10 to fix
=Plan=
Back to benchmark automation as main activity
Presentation in the background
== Progress ==
LLDB development
-- Support for running lldb on arm hard float abi targets [TCWG-855] [7/10]
-- Built lldb-server for armhf trusty chromebook
-- Figured out problem with lldb-server showing up i386-linux-gnu as
target triple.
-- Verified load of arm-elf executable and breakpoint setting.
-- LLDB GDBserver dies while trying to run the target.
Miscellaneous [3/10]
-- Playing with HiKey board and set up chromebook with armhf and armel
chroots on SSD.
-- Preparing document for Czech Republic visa
-- Meetings, emails, discussions etc.
== Plan ==
LLDB development
-- Further progress and try to fix run control on armhf targets [TCWG-855]
== This week ==
* TCWG-146 - Detect smin/umin idiom (1/10)
- Patch sent upstream for approval
* TCWG-140 - Transform end of loop conditions to min_expr (4/10)
- Patch and investigating validation regressions
* TCWG-833 - Exploit Wide Add operations when appropriate (4/10)
- Investigation into why vectorizer does not exploit wide adds
* Misc (1/10)
- Conference calls
- Conference call with Kugan and Prathamesh to discuss GCC Git workflow
- Conference call with Charles and Prathamesh to discuss
autovectorization
== Next week ==
- Vacation
== Progress ==
(TCWG-831) post-indexed addressing [3/10]
. vectorization project kick-off call
. code browsing/reading to understand mailing list feedback about previous patch
(TCWG-775) NEON error messages [6/10]
. completed conversion of some ARM intrinsics to give same error
messages as AArch64 work
. reworked tests so they can be shared between AArch64, ARM.
. re-submitted previous patch with updated tests
Misc [1/10]
email, irc, gerrit reviews, connect travel booking, AArch64 qemu
big-endian experiment
== Plans ==
submit patch for work done so far on ARM NEON error messages
cortex-a53 workarounds
Benchmark automation - TCWG-360 [3/10]
* Created a partial Jenkins prototype
* Considered some security issues
Benchmarking presentation [5/10]
* Drafted some slides, did some reading
Misc [2/10]
=Plan=
More of the above
== Progress ==
* TCWG-849 (1/10)
- Committed improvement for VRP
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=225108
* Add REG_EQUAL for arm_emit_movpair (4/10)
- Posted patches for review
* TACT -TCWG-851 (3/10)
- Started with the small examples.
- Ran into an error while tuning; looking into it
* Git work flow for upstream patches -TCWG-848 (1/10)
- Had a chat with Michael and Prathamesh
- Tried the work-flow and now started documenting them
* Misc (1/10)
- gcc-patches, gcc-bugs list
- Meetings
== Plan ==
- GCC Bugs
- TACT driven optimization exploration for gcc
* TCWG-830 (4/10)
- Observing tree dumps
- Peeling for alignment happens at -O3 but not at -O2 -ftree-vectorize
Reason: in vect_enhance_data_refs_alignment() for:
a) -O2 -ftree-vectorize: max_allowed_peel == 0
b) -O3: max_allowed_peel == (unsigned) -1;
which equals UINT_MAX and therefore peeling gets allowed.
- Workaround: Pass --param vect-max-peeling-for-alignment=0
- Peeling for alignment with O2 can be enabled by passing
-fvect-cost-model (we don't want this!)
Reason:
opts.c:
/* Tune vectorization related parametees according to cost model. */
if (opts->x_flag_vect_cost_model == VECT_COST_MODEL_CHEAP)
  {
    maybe_set_param_value (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS,
      6, opts->x_param_values, opts_set->x_param_values);
    maybe_set_param_value (PARAM_VECT_MAX_VERSION_FOR_ALIGNMENT_CHECKS,
      0, opts->x_param_values, opts_set->x_param_values);
    maybe_set_param_value (PARAM_VECT_MAX_PEELING_FOR_ALIGNMENT,
      0, opts->x_param_values, opts_set->x_param_values);
  }
The above if condition becomes false when -fvect-cost-model is passed.
- Proposed patch (untested): http://pastebin.com/ftp0mrwH
Patch follows the workaround and passes --param vect-max-peeling-for-alignment=0
if unaligned access is supported.
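The interaction described in this item can be modelled in a few lines of C. This is a simplified sketch under stated assumptions, not the GCC source: the enum, macro and function names are hypothetical, and only the two facts from the notes above are encoded (the cheap cost model forces the peeling limit to 0, and a limit of (unsigned) -1 means no limit).

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical names; a simplified model, not the GCC source.  */
enum cost_model { MODEL_UNLIMITED, MODEL_DYNAMIC, MODEL_CHEAP };

/* (unsigned) -1 wraps around to the largest unsigned value.  */
#define PEEL_NO_LIMIT ((unsigned) -1)
static_assert(PEEL_NO_LIMIT == UINT_MAX, "(unsigned) -1 equals UINT_MAX");

/* The cheap model overrides the parameter with 0; the other models
   keep whatever value the parameter already has.  */
static unsigned effective_max_peel(enum cost_model model, unsigned param_value)
{
    return model == MODEL_CHEAP ? 0u : param_value;
}

/* Peeling is allowed when no limit is set, or when the number of
   iterations to peel does not exceed the limit.  */
static int peeling_allowed(unsigned max_allowed_peel, unsigned iterations)
{
    if (max_allowed_peel == PEEL_NO_LIMIT)
        return 1;
    return iterations <= max_allowed_peel;
}
```

In this model, the cheap cost model rejects any peeling of one or more iterations, while leaving the parameter at (unsigned) -1 allows it, which matches the -O2 -ftree-vectorize versus -O3 behaviour noted above.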
* TCWG-777 (4/10)
- Observing tree and rtl dumps
- Workaround: for -O1 pass -fno-tree-fre -fno-tree-dominator-opts
Test-case: http://pastebin.com/cjBcSpiT
Generated assembly at -O1 without workaround: http://pastebin.com/jmQGZhN9
Generated assembly at -O1 with workaround: http://pastebin.com/JGj05z66
Is that the expected output for no unnecessary temps in assembly with
workaround ?
Is it profitable over the assembly generated without workaround ?
- Approach currently taken:
a) New pass "remove-temps" (for lack of better name), after nrv (added
as last gimple pass).
b) Transforms:
if (ssa_var != 0)
to
new_ssa_var = SSA_NAME_DEF_STMT (ssa_var)
if (new_ssa_var != 0)
This "unfolds" cse on expressions within if, which was done by fre
(and if fre was disabled then by dom pass).
c) However this approach results in dead stores.
eg:
_8 = flags_7(D) & 1;
if (_8 != 0)
...
is transformed to:
_8 = flags_7(D) & 1;
_32 = flags_7(D) & 1;
if (_32 != 0)
...
so store to _8 is dead store.
I tried to run dse after remove-temps but that didn't work.
RTL 194r.jump eliminates the above dead store as "trivially dead insn".
However I don't think it's a good idea to have dead stores like these
in gimple and rely on RTL to eliminate them. I could try to make the
pass a bit smarter to not generate redundant stores like the one in the
above case.
d) Patch (no intent to commit as-is): http://pastebin.com/AGXnSkrZ
Generated assembly at -O1 with the patch: http://pastebin.com/VmHCVpGC
Patch eliminates temporaries at -O1 but not at -O2.
I have not yet figured out the reason for that.
For if (flags & 1),
In dfinish pass for -O1, the generated RTL is from
zeroextractsi_compare0_scratch
while for -O2, the generated RTL is from andsi3_compare0
e) Is this a problem also on x86 ?
x86 generated assembly with -O1: http://pastebin.com/XMeTXXwK
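Steps (b) and (c) of the approach above can be pictured at the C source level as follows. This is a hand-written approximation of the GIMPLE shown in the notes, not output from the pass. The function names and the call counter are hypothetical and exist only so that the two shapes can be compared.

```c
#include <assert.h>

static int foo_calls;                 /* counts calls, for checking only */

static void foo(void)
{
    foo_calls++;
}

/* Shape before the pass: the condition tests a temporary.  */
static void f_before(int flags)
{
    int t8 = flags & 1;
    if (t8 != 0)
        foo();
}

/* Shape after the pass: the defining expression is recomputed for the
   condition, so the first temporary is left without a use (the dead
   store described in step (c)).  */
static void f_after(int flags)
{
    int t8 = flags & 1;               /* no longer used */
    int t32 = flags & 1;
    (void) t8;
    if (t32 != 0)
        foo();
}

/* Both shapes must call foo the same number of times.  */
static void check_same_behaviour(int flags)
{
    foo_calls = 0;
    f_before(flags);
    int before = foo_calls;
    foo_calls = 0;
    f_after(flags);
    assert(before == foo_calls);
}
```

The two functions behave identically; the only difference is the leftover assignment to t8, which a later dead-code or dead-store cleanup is expected to remove.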
* Misc (2/10)
- Getting familiar with vectorizer and NEON gcc intrinsics
- Reviewed git tutorials and starting preparation of git doc
- Conference calls
== Next Week ==
- Continue working on TCWG-830 and TCWG-777
- Header file flattening
- Travel to Mumbai on 2nd July (Thursday) for US Visa OFC appointment.
== Progress ==
* Maintenance (CARD-1833 4/10)
- Found the trail on the ADD/SUB with negative immediate
- Submitting RFC for discussion (http://llvm.org/PR20978)
- Bugzilla farming
- More LNT investigations (http://llvm.org/perf/ unstable)
* Releases (CARD-1431 1/10)
- Building, testing and uploading 3.6.2 RC1
* Background (5/10)
- Code review, meetings, discussions, etc.
- More stride vectorizer code review (ldN/stN implementation)
- More lab discussions (routers, lab split, new link)
- Changing my dev env to emacs (huge mind set flip)
== Plan ==
* Continue with ADD/SUB change
* Continue with Emacs setup
* Move benchmark bot to CMake
* Some other bugs
* One day off on Thu [2/10]
# Progress #
* Linaro GDB [4/10]
** TCWG-805, aarch64 native debugging multi-arch support.
Prepare for the patches submission.
It is a big patch series, and think about how to upstream them.
Write commit log including the rationale of the changes.
* FSF GDB [2/10]
** FSF GDB 7.10 release. Audit some GDB regressions caused by intel
mpx stuff.
** PR 18605. Write a patch and it is in testing.
** Other patches review.
* Misc [2/10]
** File expense report for Grenoble travel.
** Some discussions on aarch64 tracepoint.
# Plan #
* TCWG-805, upstream some patches on multi-arch debugging.
--
Yao
* One day off (Wed) (2/10)
== Progress ==
* linaro-5.1-2015.06 snapshot (1/10)
- dealt with tags, release notes
- shared it with B&B
* 4.8-2015.06 branch merge (1/10)
- investigated regression: incorrect automatic merge
- fixed, validation on-going
* 4.9 branch (2/10)
- updated our git linaro-4.9-branch to match the svn one
- ready for branch merge, will be done right after fsf release
* Misc (4/10)
- meetings, conf-calls, emails, reviews (GCC backports, ABE, backflip)
== Next ==
* more reviews for new backports
* backports, release, validation: update doc
* hopefully upstream work
Recently I came across two excellent posts about accelerating clang/llvm
builds with different compilers/optimizations [1] [2].
I tried some of the author's advice and got very good results. Basically I
moved to an optimized clang build, changed to the gold linker and used a
different memory allocator from the system glibc one. The build times for the
whole clang/llvm toolchain are summarized below (my machine is an i7-4510U,
2C/4T, 8GB, 256GB SSD):
GCC 4.8.4 + gold (Ubuntu 14.04)
real 85m17.640s
user 257m1.976s
sys 11m35.284s
LLVM 3.6 + gold (Ubuntu 14.04)
real 34m4.909s
user 128m43.382s
sys 3m51.643s
LLVM 3.7 + gold + tcmalloc
real 32m56.707s
user 121m40.562s
sys 3m52.358s
The gold linker also shows *much* lower RSS usage; I am able to fully use make -j4
while linking in 8GB without any swapping.
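The wall-clock ("real") figures above can be turned into ratios with a little arithmetic. The helpers below are a minimal sketch written for this note; the function names are hypothetical.

```c
#include <assert.h>

/* Convert a `time`-style reading of minutes and seconds to seconds.  */
static double to_seconds(int minutes, double seconds)
{
    assert(minutes >= 0 && seconds >= 0.0 && seconds < 60.0);
    return minutes * 60.0 + seconds;
}

/* How many times faster the `after` run is than the `before` run.  */
static double speedup(double before, double after)
{
    assert(after > 0.0);
    return before / after;
}

/* Percentage of the `before` time saved by the `after` run.  */
static double percent_saved(double before, double after)
{
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}
```

With the numbers above, speedup(to_seconds(85, 17.640), to_seconds(34, 4.909)) is about 2.5, and percent_saved(to_seconds(34, 4.909), to_seconds(32, 56.707)) is about 3.3. Note that the last pair differs in compiler version as well as in allocator.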
Two things I would add/check for the posts:
1. Changing from libc to tcmalloc showed me a 3-4% improvement. I tried jemalloc,
but tcmalloc is faster. I am currently using system version 2.2, but I have
pushed an aggressive decommit patch, to be enabled by default for 2.4, that might
show lower RSS and latency (I will check it later).
2. At first I tried to accelerate my build by offloading compilation using distcc.
Results were good, although the other machine (i7, 4C/8T, 8GB)
showed mixed CPU utilization. The problem was memory utilization when linking
with ld.bfd, which generates a lot of swapping at higher job counts. I
will try using distcc with clang.
[1] http://blogs.s-osg.org/an-introduction-to-accelerating-your-build-with-clan…
[2] http://blogs.s-osg.org/a-conclusion-to-accelerating-your-build-with-clang/
Benchmark automation - TCWG-360 [7/10]
* Arndales stopped booting
** Package servers for elderly filesystem had gone
** Investigated some approaches to creating more stable filesystems
** Realized I could just update the image to point at old-releases, so
did that for now
* _More_ time thinking about interactions with Jenkins & LAVA. Fathi
gave me some Jenkins jobs to prototype in.
* Brain-dumped some of the present state of things into Collaborate
Misc - [3/10]
=Plan=
Jenkins prototyping
>> Using Python to script GDB makes it much more efficient to do testing.
>> Having a Python-disabled build of GDB prevents this.
I use the example of the gdb-python scripts for the Linux kernel.
They are very useful, but they do not work when using GDB on Windows.
-Duane
It seems the prebuilt windows releases of GDB do not enable Python.
Are there plans to release a python-enabled-gdb in the windows builds?
If not, what are the roadblocks to this?
Thanks
Example:
$ ./aarch64-linux-gnu-gdb.exe
GNU gdb (Linaro GDB 2015.02-3) 7.8-2014.09-1-git
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show
copying"
and "show warranty" for details.
This GDB was configured as "--host=i686-w64-mingw32
--target=aarch64-linux-gnu"
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.launchpad.net/gcc-linaro>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb) python
>
>Scripting in the "Python" language is not supported in this copy of GDB.
(gdb) quit