Summary:
* Investigate Automotive benchmark performance at different branch costs.
Details:
1. Automotive benchmark performance analysis at different branch costs
on PandaBoard ES.
* Design small test cases that simulate bitmnp01 to compare the
performance of ITTT and conditional branches. Test results show:
- If branch prediction does not work (the code is placed in a
function), ITTT is always better than a conditional branch.
- If branch prediction works (the code is inlined in the loop
body), a conditional branch is better than ITTT in most cases.
* Code alignment has a big impact on tblook01. By default the IT block
performs better. When __attribute__((aligned (16))) is added to
function t_run_test, the conditional branch performs better than the
IT block.
2. Prepare Linaro toolchain binary release.
* Update Linaro crosstool-ng local patches due to the fix of
lp:1067766 in source package.
* Spawn all builds and smoke tests.
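
To illustrate the ITTT versus conditional branch observation in item 1, here is a toy per-iteration cycle model. It is only a sketch: the cycle counts, the misprediction penalty and the function names are assumptions chosen for illustration, not measured PandaBoard figures.

```python
# Toy per-iteration cycle model. Every number below is an illustrative
# assumption, not a measured Cortex-A9 value.

def branch_cycles(p_true, mispredict_rate, body=3, branch=1, penalty=8):
    """Expected cycles when the IF-THEN is compiled to a conditional branch.

    The body runs only when the condition is true, and each misprediction
    adds an assumed pipeline-refill penalty.
    """
    return branch + p_true * body + mispredict_rate * penalty


def ittt_cycles(body=3, it=1):
    """Cycles for an ITTT block: the IT instruction plus all three
    predicated slots are issued whether or not the condition holds."""
    return it + body


if __name__ == "__main__":
    for rate in (0.0, 0.05, 0.5):
        b = branch_cycles(p_true=0.5, mispredict_rate=rate)
        print(f"mispredict={rate:.2f}: branch={b:.2f} ITTT={ittt_cycles():.2f}")
```

Under these assumed numbers the branch wins when prediction is nearly perfect and ITTT wins when the branch is mispredicted often, which matches the direction of the test results above.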
Plan:
* Investigate SPEC2k performance for different branch costs.
* Work with Bero on the 2013.01 toolchain binary release.
Planned leave:
* Feb. 9 - 15: Chinese Spring Festival.
Best Regards!
-Zhenqiang
== Progress ==
* Buildbot
- Taking buildbot to Linaro
- Had wireless/GPU overheating, disabled kernel modules
- Running smooth again (most of the time)
- Debugging errors that only appear on ARM.
* Building and Testing LLVM
- Compiling on Intel with only the ARM backend helps a lot
- Sent a call for action asking people to clean up cross-compilation failures
* LAVA
- Progress on LAVA LLVM job
- Got it checking out, configuring and building
- Got PASS/FAIL/SKIP patterns working
- https://validation.linaro.org/lava-server/scheduler/job/46027
- Need to get a patch from a specific place to apply
* Cost Model
- Re-wrote table lookup patch a few times, finally in for good
- http://llvm.org/viewvc/llvm-project?rev=173382&view=rev
- Studying costs of instructions, all seem good enough
- Better approach now is to change the target description (less code, more
gain)
* EuroLLVM
- 136 people so far
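
The PASS/FAIL/SKIP matching mentioned under LAVA could look roughly like the sketch below. It assumes lit-style result lines of the form "STATUS: suite :: test (N of M)". The mapping of statuses to three buckets and the sample test names are hypothetical, and this is not the actual LAVA job definition.

```python
import re
from collections import Counter

# Assumed lit-style result line, e.g.
#   PASS: LLVM :: CodeGen/ARM/a.ll (1 of 3)
RESULT_RE = re.compile(r"^(?P<status>[A-Z]+): (?P<test>.+?) \(\d+ of \d+\)$")

# Hypothetical mapping from lit statuses to PASS/FAIL/SKIP buckets.
BUCKETS = {
    "PASS": "PASS",
    "XFAIL": "PASS",
    "FAIL": "FAIL",
    "XPASS": "FAIL",
    "UNRESOLVED": "FAIL",
    "UNSUPPORTED": "SKIP",
}


def classify(log_text):
    """Count result lines per bucket, ignoring lines that do not match."""
    counts = Counter()
    for line in log_text.splitlines():
        m = RESULT_RE.match(line.strip())
        if m and m.group("status") in BUCKETS:
            counts[BUCKETS[m.group("status")]] += 1
    return counts


# Made-up sample log; the test names are placeholders.
sample = """\
PASS: LLVM :: CodeGen/ARM/a.ll (1 of 3)
FAIL: LLVM :: CodeGen/ARM/b.ll (2 of 3)
UNSUPPORTED: LLVM :: CodeGen/X86/c.ll (3 of 3)
some unrelated build output
"""

if __name__ == "__main__":
    print(dict(classify(sample)))
```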
== Plan ==
* Test distcc (or similar) on Pandas
* Get a buildbot running with cross-compilation
* Internal git repository for LAVA LLVM job
* Confirm Linaro's sponsorship for EuroLLVM
* Continue cost model changes in between
== Background ==
* Monitor list for ARM changes
* Monitor buildbot for failures
Activity:
* calls and meetings (about 20% of my working week this week ;-))
* finished rebasing and testing the KVM QEMU patches (thanks
to Pawel for getting me an updated RTSM device tree), sent
out updated version to go with -v17 kernel
* minor qemu maintenance patches (including a minor cfi01
flash model bugfix)
* trying to track down issues running a 3.8-rc4 vexpress
kernel on QEMU. Among other things:
* looks like we need to emulate some more of the oscillator
and voltage config registers now (if only to make the
kernel a bit quieter)
* the kernel doesn't like the way qemu's boot loader puts
the DTB blob after the initrd but beginning in the same
page as the initrd ends [free_initrd_mem will trash memory
outside the initrd proper but inside that last page]
* a15 reports the wrong board model number
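
The initrd/DTB page-sharing problem above can be shown with a little address arithmetic. This is a sketch only: the 4 KiB page size, the example addresses and the helper names are assumptions, and rounding the DTB address up is just one possible way a loader could avoid the overlap, not necessarily what QEMU will do.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


def page_align_up(addr, page=PAGE_SIZE):
    """Round addr up to the next page boundary (page must be a power of two)."""
    return (addr + page - 1) & ~(page - 1)


def shares_page(initrd_start, initrd_size, dtb_addr, page=PAGE_SIZE):
    """True if the DTB begins in the same page in which the initrd ends."""
    initrd_last_byte = initrd_start + initrd_size - 1
    return dtb_addr // page == initrd_last_byte // page


# Hypothetical load address and size, for illustration only.
initrd_start = 0x00D00000
initrd_size = 0x123456

packed_dtb = initrd_start + initrd_size   # DTB placed right after the initrd
safe_dtb = page_align_up(packed_dtb)      # DTB pushed to the next page

if __name__ == "__main__":
    print(hex(packed_dtb), shares_page(initrd_start, initrd_size, packed_dtb))
    print(hex(safe_dtb), shares_page(initrd_start, initrd_size, safe_dtb))
```

With the packed layout, freeing the initrd's last page would also free the start of the DTB. With the aligned layout the two no longer share a page.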
-- PMM
Dear All,
Is it possible to compile for ARCH "AArch64" in 32-bit mode? For example,
if I have an x86 64-bit machine, I can install a 32-bit OS on it, and the
machine is compatible with 32-bit binaries.
So is it possible to use AArch64 (ARMv8) with a 32-bit kernel
installed and a 32-bit toolchain?
If the answer is yes, can I build the toolchain, or is there an option
in the Linaro cross-compiler available from
https://launchpad.net/linaro-toolchain-binaries/+milestone/2012.12
Thanks
Activity:
* usual set of calls and meetings (and there is another
KVM related weekly meeting in the pipeline...)
* reviewed virtio patches
* rebased qemu-linaro on upstream
* rebased KVM patches; couldn't get updated kernel to run
on RTSM (probably a device tree or config issue; need to
attack problem again this week)
NB: I'm currently working a reduced set of hours due to RSI,
though I am trying to remain responsive to email etc.
-- PMM
== Progress ==
* Prepared Venkat on-boarding.
* Aarch64 porting meeting:
- libunwind is in the pipe
* Boehm GC AArch64 support:
- basic port done, test on-going
== Next ==
* Boehm GC AArch64 support:
- test and ask for upstream review.
Summary:
* Investigate automotive benchmark.
* Linaro gcc 4.6 release
Details:
1. Automotive benchmark performance analysis at different branch costs
on Cortex-A9.
* Debugged function WriteOut, which is called 12 times on average. It
causes a common performance issue because the IF-THEN in the function is
converted to an IT block whose TRUE probability is less than 4%.
* Identified the root cause of the performance regression with IT blocks
for bitmnp01, rspeed01, pntrch01 and ttsprk01. Overall,
- A taken bpl performs better than an ITTT. If this holds in
general, we should set branch-cost to 1 for IF-THEN.
- For IF-THEN-ELSE, we should take branch probability into account when
converting it to an IT block.
- ifcvt might generate useless IT blocks.
2. Tried to do the Linaro GCC release, but met several issues:
* Cannot branch a clean lp:gcc-linaro/4.7. As a workaround, I
downloaded a clean bzr tree from another site. For the next release, I can
use the local tree to create the release tarball.
* All a9hf-builder ubutests fail due to a test environment issue.
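
The point in item 1 about taking branch probability into account can be sketched as a simple comparison. All cycle counts and names here are assumptions. In particular, branch_cost is a plain cycle number in this toy model, not GCC's internal branch-cost parameter.

```python
def it_block_cycles(n_then, n_else, it=1):
    """IT block covering both arms: every predicated slot is issued each
    time, whichever arm actually takes effect."""
    return it + n_then + n_else


def branchy_cycles(p_true, n_then, n_else, branch_cost):
    """Branch version: only one arm runs. branch_cost is the assumed
    average overhead of the branches around the arms."""
    return branch_cost + p_true * n_then + (1 - p_true) * n_else


def prefer_it_block(p_true, n_then, n_else, branch_cost):
    """True if the IT block is expected to be cheaper under this model."""
    return it_block_cycles(n_then, n_else) < branchy_cycles(
        p_true, n_then, n_else, branch_cost
    )


if __name__ == "__main__":
    # IF-THEN with a three-instruction body that is rarely executed
    # (TRUE probability of 4%, as in the WriteOut case above).
    for cost in (1, 4):
        print(cost, prefer_it_block(0.04, n_then=3, n_else=0, branch_cost=cost))
```

Under these assumptions a rarely-true IF-THEN is converted only when the assumed branch overhead is high, which is consistent with the suggestion to use a low branch cost for IF-THEN.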
Plan:
* Investigate more benchmarks for different branch costs.
Planned leave:
* Feb. 9 - 15: Chinese Spring Festival.
Best Regards!
-Zhenqiang
== Progress ==
* 64-bit ops in Neon: pinged patch proposal.
* vectorizer cost model: received results from spec2k. Prepared
initial tuning to submit to benchmarking again.
* smin-umin: tests OK, benchmarks ran, but did not generate the diff
over a valid ancestor. I didn't make the manual comparison yet.
* updated board for local benchmarking
* tcpanda heat problems: built a new kernel with the thermal driver;
need to reboot the board with it
== Next ==
* handle feedback from upstream, if any, on 64-bit bitops in Neon
* analyze results of benchmarking with vectorizer cost model
* analyze results of benchmarking with smin-umin idiom patch
* continue board setup/update; I will probably try to cross-build the
benchmarks to avoid having to build GCC itself on the board and save
time.
* follow up on tcpanda heat problems
== Progress ==
* Buildbots
- Added a Panda ES buildbot on clang-native-arm-cortex-a9 group
- Reporting and helping fix bot bugs on ARM
- ARM buildbots are green again!
- Each ARM buildbot takes 4h15min to complete, versus 15min on Intel
- We're still testing up to 12-15 patches on each build, at peak times
(PST)
* LAVA
- Created a test run for llvm check-all, infrastructure is there
- Need to make it actually do some work
* Vectorization
- Refactored cost model's temp tables
- http://llvm.org/viewvc/llvm-project?view=rev&revision=172658
- Studying NEON costs, changing ARM target lowering
* test-suite A15
- Building LLVM on Chromebook, check-all (1h, 181 failures)
- Self-hosting LLVM on Chromebook, check-all (50min, many more)
- Found some floating point type errors, only on Chromebook (libs?)
* AArch64 back-end
- Reviewed patches, look ok, some comments
- Should be all in by next week
* LLVM cross-compilation woes
- Had to define include path for c, c++ and arm locations
- It calls the wrong assembler, even defining the right gnu toolchain
- Someone needs to fix these cross-compilation bugs!! :)
== Plan ==
* Finish basic NEON costs for vectorization
* Finish LAVA bot compiling clang + check-all
* Install Panda buildbot on rack
* Continue investigating Chromebook failures
* Continue thinking about the long term plan for LLVM
The Linaro Toolchain Working Group is pleased to announce the 2013.01
release of both Linaro GCC 4.7 and Linaro GCC 4.6.
Linaro GCC 4.7 2013.01 is the tenth release in the 4.7 series. Based
off the latest GCC 4.7.2+svn194772 release, it includes ARM-focused
performance improvements and bug fixes.
Interesting changes include:
* Updates to GCC 4.7.2+svn194772
* Includes arm/aarch64-4.7-branch up to svn revision 194808
* Support for the rev16 and revsh instructions
* A15 Neon pipeline backported from trunk
* FMA intrinsic backported from trunk
* Better extending core to NEON transfers
* Fused multiply-add support
Fixes:
* LP #1088898 regression of x86 gcc bootstrap with Linaro sourcebase
* LP #1067766 Backport support for arm-linux-gnueabihf to GCC Linaro
* LP #1084010 __atomic_load doesn't match ACQUIRE memory model
Linaro GCC 4.6 2013.01 is the 23rd release in the 4.6 series. Based
off the latest GCC 4.6.3+svn194771 release, this is the tenth release
after entering maintenance.
Interesting changes include:
* Updates to 4.6.3+svn194771
The source tarballs are available from:
https://launchpad.net/gcc-linaro/+milestone/4.7-2013.01
https://launchpad.net/gcc-linaro/+milestone/4.6-2013.01
Downloads are available from the Linaro GCC page on Launchpad:
https://launchpad.net/gcc-linaro
More information on the features and issues is available from the
release pages:
https://launchpad.net/gcc-linaro/4.7/4.7-2013.01
https://launchpad.net/gcc-linaro/4.6/4.6-2013.01
Mailing list: http://lists.linaro.org/mailman/listinfo/linaro-toolchain
Bugs: https://bugs.launchpad.net/gcc-linaro/
Questions? https://ask.linaro.org/
Interested in commercial support? Inquire at support(a)linaro.org