Hi,
I have a use case where the Linaro toolchain is used to build my executables and
the sysroot is copied and used as the glibc for running my embedded system.
The reason for this is that I want to run against the same glibc that the
application is compiled against.
I found a bug fix from the glibc community which I want to cherry-pick, and I
want to rebuild the sysroot to include this fix. However, the README.txt
published with the Linaro toolchain binaries has no instructions for rebuilding
the sysroot.
Can anyone point me to information on rebuilding the sysroot? If formal steps
don't exist, could you point me to the process currently followed by Linaro
so that I can observe the build log and attempt to do the same?
Thanks
Bharath
All,
In the Toolchain Working Group Mans has been doing some examination of SPEC
2000 and SPEC 2006 to see what C Library (glibc) routines impact performance
the most, and are worth tuning.
This has come up with two areas we consider worthy of further investigation:
1) malloc performance
2) Floating-point rounding functions.
This email is concerned with the first of these.
Analysis of malloc shows that a large amount of time is spent executing
synchronization primitives, even when the program under test is single-threaded.
The obvious 'fix' is to remove the synchronization primitives which will
give a performance boost. This is, of course, not safe and will require
reworking malloc's algorithms to be (substantially) synchronization free.
A quick Google suggests that there are better performing algorithms
available (TCMalloc, Lockless, Hoard, &c), and so changing glibc's algorithm
is something well worth investigating.
Currently we see around 4.37% of time being spent in libc for the whole of
SPEC CPU 2006. Around 75% of that is in malloc related functions (so about
3.1% of the total). One benchmark however spends around 20% of its time in
malloc. So overall we are looking at maybe 1% improvement in the SPEC 2006
score, which is not large given the amount of effort I estimate this is
going to require (as we have to convince the community we have made
everyone's life better).
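The arithmetic behind this estimate can be sketched as below. This is only an
illustration: the function name percent_time_saved and the speedup factors are
assumptions, and only the 3.1% and 20% shares come from the figures above. It
assumes a fraction of the run time gets uniformly faster while the rest is
unchanged.

```c
#include <assert.h>

/* Percentage of total run time saved when a fraction `share` (0..1) of the
   run time becomes `speedup` times faster and everything else is unchanged.
   Hypothetical helper for illustration; not part of any benchmark harness. */
double percent_time_saved(double share, double speedup)
{
    assert(share >= 0.0 && share <= 1.0 && speedup > 0.0);
    return 100.0 * share * (1.0 - 1.0 / speedup);
}
```

Under this model, percent_time_saved(0.031, 1.5) is a little over 1, so a 1%
overall gain would need malloc to become roughly 1.5 times faster across the
suite, and no malloc speedup can save more than 3.1% of the total. For the one
benchmark that spends about 20% of its time in malloc, the same 1.5x factor
would save between 6% and 7% of its run time.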
So before we go any further I would like to see what the view of LEG is
about a better malloc. My questions boil down to:
* Is malloc important - or do server applications just implement their own?
* Do you have any benchmarks that stress malloc and would provide us with
some more data points?
But any and all comments on the subject are welcome.
Thanks,
Matt
--
Matthew Gretton-Dann
Toolchain Working Group, Linaro
Greetings,
I'm using the Linaro tool chain with Eclipse (Juno) (under Windows) and
openOCD to write firmware for an STM32F20x based design (using an ST-Link2
debugger).
In general, that all works fairly well.
The part I'm having problems with is debugging (step-in, etc) from Eclipse.
The execution flow seems chaotic when single stepping through C code: it
skips statements, it jumps into the middle of a function, then returns to
the start of a function, it loops over certain statements (while there's no
loop in the code), etc. (It's close to useless).
I have seen this behavior with other IDEs and tool chains when code was
built with optimization turned on.
However, I specify 'no optimization' (-O0) when I build my code.
My questions:
a) Is there some implicit optimization being done in the compiler, even
though I tell it not to do so, which may affect proper debugging?
b) Are other people using Eclipse (Juno) and are they seeing the same
issue? Are there any known ways to fix this chaotic debugger behavior?
Kind regards,
~ Paul Claessen
== Progress ==
* Investigated gdb7.6 test suite failures and untested test
cases.
* Updated JIRA entries with test suite failures on ARM to track progress.
* Wrote an automation script to select individual test cases from a
text file.
* Got the gdb.dwarf2 test suite patch reviewed by Matt and Will.
* Day off on Friday.
== Plan ==
* Finish up initial investigation of gdb7.6 test suite results.
* Complete updates of JIRA entries once investigation of the test suite
results is complete.
* Start work on integration of different testing scripts written in past
couple of months.
* Send gdb.dwarf2 test suite patch upstream.
Hi Richard,
After adding some new ops, I can keep the conditional compare until the
end of tree-level optimization. As a test, I expand the conditional compare
to BIT_AND_EXPR/BIT_IOR_EXPR, which still depend on the later "combine"
pass to combine them.
Is it possible to expand it to *cmp_and/*cmp_ior-like patterns?
What's the expected RTL representation for a conditional compare after
expand and before combine?
Thanks!
-Zhenqiang
== Progress ==
4-day week; was ill on Friday.
* AArch64 gprof -c option support.
Completed and submitted the patch to binutils and got it upstreamed.
http://sourceware.org/ml/binutils/2013-05/msg00265.html
http://sourceware.org/ml/binutils/2013-05/msg00264.html
The committer changed the logic, and it seems not to work for
backward addresses. Sent him an off-list mail asking him to correct it.
http://sourceware.org/ml/binutils/2013-05/msg00279.html
* AArch64 gprof - glibc support.
Submitted the patch and got it approved.
http://sourceware.org/ml/libc-ports/2013-05/msg00098.html
* AArch64 gprof - GCC support.
Tested with the test clause removed, as suggested by Marcus (AArch64 maintainer).
Marcus wants me to send two separate patches. Will be posting them.
* AArch64 testing
Got a bootstrap failure with GCC 4.9 trunk on the OpenEmbedded image with
glibc changes.
To confirm it is not a regression, I ran with the OpenEmbedded image
available on the Linaro download site.
The bootstrap failure can be reproduced. I am documenting the steps to
increase the size of the image and the steps to reproduce the bootstrap
failure in the model.
== Plan ==
* Continue bootstrap testing and push patches to GCC
* libssp support for AArch64
* Linaro Connect travel prep.
Misc
------
Planned leave 29-5-2013.
== Issues ==
* None.
== Progress ==
* LRA on ARM and AArch64:
- Reduced the ARM test case.
- The issue occurs during the constraints alternatives choice.
- Debug still ongoing.
== Plan ==
* LRA, LRA and LRA
== Progress ==
VRP based zero/sign extension
- Got some review comments on the patch and started addressing them
- Split the patch into two: (1) propagating the value range and (2) doing RTL
expansion
- Testing in progress
specfp regression
- Benchmarked spec2k for A15 with trunk and couldn't reproduce it.
- Benchmarked spec2k for A9 with trunk and couldn't reproduce it between
the 24th and 28th
== Leave ==
- Monday off Sick
== Plan ==
VRP based zero/sign extension
- Send patch for review
specfp regression
- benchmark for trunk version on 23rd
- Resolve regression
Sorry, this covers the last two weeks, not just one.
== Progress ==
* Created database schema for DejaGnu test results
* Created data schema for benchmarks
* Wrote scripts to convert benchmark and test data into a form that
can be imported into a database, added them to DejaGnu branch
* Imported all historical benchmark data
* Imported most historical test results (GCC still importing)
* Did some experimental graphs of test results
* Read lots of web pages to come up to speed on Linaro, registered
for lots of web pages and accounts
* Learned about Cbuild and LAVA
* Started on Cbuild v2
* Installed Ubuntu on Chromebook
== Plan ==
* Write Cbuild 2 design doc
* Continue work on Cbuild v2 to be able to use it for the June release
* Get remote testing working with Chromebook & foundation model
* More support tasks resulting from move off launchpad
- rob -
== Progress ==
* binutils on ARM testsuite finally green in cbuild!
* Tested and pushed to gerrit bionic memcpy patches.
* Investigated binutils native AArch64 testsuite failures (not IFUNC related).
* Made a start on the DeveloperTools/LibraryPerformance wiki.
* Started looking at the Android memcpy problem on Galaxy Nexus.
== Issues ==
* binutils make ; make check takes over 24 hours on foundation model!
== Plan ==
* Respin AArch64 IFUNC binutils patch once relocation number allocated.
* Set up git mirrors for binutils, glibc and newlib.
* Android memcpy issue.
--
Will Newton
Toolchain Working Group, Linaro
== Progress ==
* Disable-peeling: looking at how to have less aggressive vectorization
* Libsanitizer/aarch64: initiated upstream discussion
* PGO/LTO bug reported by Doko: SD card too small to reproduce the problem
* Merges for linaro-gcc-2013.06: started looking at what to backport,
started merges
* Jira/wiki: started cleanup/collecting new cards
* Internal support
== Next ==
* Jira: update status on cards/blueprints backported from launchpad
* Merges for linaro-gcc-2013.06: continue collecting relevant merges
* Disable-peeling: continue investigating vectorizer behaviour
* Libsanitizer/aarch64: look at frame implementation
* PGO/LTO: complete build of python
* Neon intrinsics: continue improving crc with vuzp/veor
Progress:
* misc
** got raring/aarch64 cross build set up
** reducing number of places that need changing for a new qemu
target: sent some simple configure patches
** some 32 bit cleanup work that will help with getting John's
AArch64 patches to work
** tested Huawei's aarch64 patches and confirmed they work
** rebased qemu-linaro (and passed the results to Serge H for Ubuntu)
** sent patches which make QEMU builds for arm/ppc/microblaze guests
require libfdt, since a non-FDT-aware ARM QEMU is becoming
rapidly less and less useful
Plans:
* handover from John Rigby
* VIRT-55: talk to Andre about testing; investigate testing migration
using LAVA
* set up a new qemu-linaro tree/branch as our CI/LAVA input [to keep it
separate from our "we release this" tree]
* restart work on upstreaming omap3 patches as part of my generic qemu
maintenance work (will reduce our maintenance burden in the long term)
-- PMM
== Progress ==
* Buildbots
- Self-hosting bot online
- Fiddling with MCJIT tests to get bots green
* Benchmarks
- Running Phoronix benchmarks: GCC vs. LLVM, good results
- Got a sample of the PerfDB SQLite database, writing some queries
* Jira/Wiki farming
- Creating loads of new cards, blueprints, sub-tasks
- Adding content to the wiki pages about processes, cards, etc
* Release 3.3
- RC2 is out, no regressions, already on official repository
* EuroLLVM 2013
- Monthly call, wrap-up, preview of next year's
== Plan ==
* Try running a CBuild benchmark with LLVM 3.3 (Rob?)
* Automate release process, maybe we can do that every month
* Automate Phoronix test (GCC+LLVMrel+LLVMsvn)
* Follow up on Panda/Arndale ordering, needed for buildbots
* Try to extract useful information from perf database
Hi all,
I've spent a little while porting an optimization from Python 3 to
Python 2.7 (http://bugs.python.org/issue4753). The idea of the patch is
to improve performance by dispatching opcodes on computed labels rather
than a big switch -- and so confusing the branch predictor less.
The problem with this is that the last bit of code for each opcode ends
up being the same, so common subexpression elimination wants to coalesce
all these bits, which neatly and completely nullifies the point of the
optimization. Playing around just building from source directly, it
seems that -fno-gcse prevents gcc from doing this, and the resulting
interpreter shows a small performance improvement over a build that does
not include the patch.
However, when I build a Debian package containing the patch, I see no
improvement at all. My theory, and I'd like you guys to tell me if this
makes sense, is that this is because the Debian package uses link time
optimization, and so even though I carefully compile ceval.c with
-fno-gcse, the common subexpression elimination happens anyway at link
time. I've tried staring at disassembly to confirm or deny this but I
don't know ARM assembly very well and the compiled function is roughly
10k instructions long so I didn't get very far with this (I can supply
the disassembly if someone wants to see it!).
Is there some way I can tell GCC not to perform CSE on a section
of code? I guess I can make sure that the whole program, linker step
and all, is compiled with -fno-gcse but that seems a bit of a blunt
hammer.
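For readers unfamiliar with the technique, here is a minimal sketch of
computed-label dispatch, with one possible way of limiting -fno-gcse to a
single function. The toy opcodes, the run function and the NO_GCSE macro are
hypothetical and are not taken from the CPython patch. The sketch relies on
GCC's "labels as values" extension and on GCC's optimize function attribute;
whether that attribute still takes effect in an LTO build is an assumption
that this sketch does not verify.

```c
#include <assert.h>
#include <stddef.h>

/* Ask GCC to compile one function as if -fno-gcse were given.
   Other compilers get an empty macro. Hypothetical helper name. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_GCSE __attribute__((optimize("no-gcse")))
#else
#define NO_GCSE
#endif

/* Toy stack-machine opcodes, for illustration only (not CPython's). */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Run `code` and return the value on top of the stack at OP_HALT. */
NO_GCSE long run(const int *code)
{
    /* Table of label addresses, indexed by opcode (GCC extension). */
    static void *const dispatch[] = {
        [OP_PUSH] = &&do_push,
        [OP_ADD]  = &&do_add,
        [OP_MUL]  = &&do_mul,
        [OP_HALT] = &&do_halt,
    };
    long stack[16];
    size_t sp = 0;
    const int *pc = code;

/* Every handler ends with its own indirect jump. These identical tails
   are what the optimizer may merge back into one shared jump. */
#define NEXT() goto *dispatch[*pc++]

    NEXT();

do_push:
    assert(sp < 16);
    stack[sp++] = *pc++;   /* operand follows the opcode */
    NEXT();
do_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    NEXT();
do_mul:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] *= stack[sp];
    NEXT();
do_halt:
    assert(sp >= 1);
    return stack[sp - 1];
#undef NEXT
}
```

With the program { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL,
OP_HALT }, run computes (2 + 3) * 4. Checking the disassembly for one indirect
branch per handler, rather than a single shared one, is one way to see whether
the tails were merged.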
I'd also be interested if you think this class of optimization makes
little sense on ARM and then I'll stop and find something else to do :-)
Cheers,
mwh
The v8 Foundation Model User Guide has a bare metal hello world example that uses semi-hosting. The Makefile uses ARM tools, however. Is there equivalent support for this example using a bare metal version of the GNU tools, such as gcc-linaro-aarch64-none-elf-4.8-2013.04-20130422_linux.tar.xz? I took a look, but didn't see a way to do this.
Of course, running the Linaro Linux port on the v8 Foundation Model allows one to run hello world and much more, but I'm currently only interested in a bare metal target using GNU tools.
Thanks, Don
== Progress ==
* Backed up laptop data and did a new Ubuntu installation, which crashed for
some reason.
* Wrote a Python script using the Google Docs API to automate filling in a
Google spreadsheet.
* Created and tested a patch with ARM assembler compatibility fixes for the
gdb.dwarf test suite assembly files.
== Plan ==
* Identify ARM bugs from the gdb7.6 test results and work towards fixing them.
* Update JIRA entries with test suite failures on ARM to track progress.
* More work on automating Google spreadsheet writing using Python.
* 2 days off on the coming Friday and Monday.
== Progress ==
* AArch64 - gprof support.
Completed gprof -c support for AArch64.
Got it reviewed internally by Matt and Will.
Patch yet to be posted. Waiting for some feedback on the copyright message.
* Testing GCC bootstrap and regression suite.
Created a large image with help from Bero.
Bootstrap fails with GCC trunk: libgcc_eh.a (unwind-dw2-fde-dip.o)
hidden symbol __register_frame_info is referenced by DSO
ld final link Bad value.
Drilling down.
== Plan ==
* Post patches in gcc and binutils for gprof work
* Continue handling builtin_return_address when -fomit-frame-pointer
is enabled.
* Continue gcc bootstrap and regression test.
== Issues ==
* Cbuild down most of the week.
== Progress ==
* 4.6 and 4.7 releases
- Released after a painful week!
* LRA on ARM and AArch64:
- Enabled on AArch64, but it leads to an ICE too.
- Applying Brice's ARM patches didn't solve the issue.
- Looked at the documentation/comments to understand the process.
- Debug ongoing.
== Plan ==
* Continue on LRA
== Issues ==
* None
== Progress ==
* Continue on conditional compare.
- Misc fixes for bootstrap.
* Update shrink-wrap patches according to comments and retest them on
Pandaboard and Chromebook.
* Prebuild 2013.05 Linaro toolchain locally.
- gdb related local patches need rework.
== Plan ==
* Continue on conditional compare to bootstrap.
* Linaro toolchain 2013.05 binary release.
Best Regards!
-Zhenqiang
The Linaro Toolchain Working Group is pleased to announce the release
of Linaro GDB 7.6.
Linaro GDB 7.6 2013.05 is the first release in the 7.6 series.
***NOTE*** Linaro GDB 7.6 2013.05 is identical to the FSF GDB 7.6 release,
except for the change in version number and Linaro branding, since all
Linaro GDB features were already accepted upstream and are included in
the FSF release as-is. Future releases in the Linaro GDB 7.6 series may
include additional ARM-focused bug fixes and enhancements.
The source tarball is available at:
https://launchpad.net/gdb-linaro/+milestone/7.6-2013.05
More information on Linaro GDB is available at:
https://launchpad.net/gdb-linaro
--
Matthew Gretton-Dann
Linaro Toolchain Working Group
matthew.gretton-dann(a)linaro.org
== Progress ==
* 3.3 Release
- Bootstrapping, testing, fixing bugs, etc.
- RC1 released on Tuesday
http://people.linaro.org/~rengolin/llvm/
- Fix for C11 atomics on Linux
http://llvm.org/bugs/show_bug.cgi?id=15429
- Fix for zero extend vector bug in test-suite
http://llvm.org/bugs/show_bug.cgi?id=15970
* Test-Suite
- ClamAV fails on Calxeda, same as PPC64 (bad test)
* Self-Host Buildbot
- Re-purposing the Panda back to buildbot service
- Buildbot passing green, setting it up on Build Master
== Issues ==
Calxeda gets turned off quite regularly, which messes up any long term
commitment you might have for them.
== Plan ==
* Install Ubuntu on Chromebook, run benchmarks with 3.3 RC1
* Try to revive LLVM CBuild job if it's any different than our current
buildbot
* Try to set up benchmark jobs for LLVM, either buildbots, CBuild, or
whatever
* Stay alert for RC2 and re-run the release process on it