RAG:
Red:
Amber:
Green:
Current Milestones:
|| || Planned || Estimate || Actual ||
||upstream-omap3-cleanup || 2011-11-10 || 2011-11-10 || ||
Historical Milestones:
||add-omap3-networking || 2011-10-13 || 2011-10-13 || 2011-10-13 ||
||a15-systemmode-planning || 2011-10-13 || 2011-10-13 || 2011-09-22 ||
||a15-usermode-support || 2011-11-10 || 2011-11-10 || 2011-10-27 ||
== other ==
* Linaro Connect week. Included an extremely useful double-length
session about KVM on A15, which should turn into blueprints/plans
in due course
* Found out a bit more about UEFI -- I'm leaning towards having QEMU
for vexpress run UEFI by default as a way of letting you just pass
it a disk image rather than having to feed it a separate kernel/initrd.
(Will look into this more when the ARM landing team have it all
building and working on hardware.)
* I have a working prototype of the QEMU virtio-mmio transport (written
to Pawel's spec). However to get this upstream we will first need to
properly refactor the qemu virtio code so the link between the
transport and the blk/net/etc backends is a qdev bus.
-- PMM
Continued working on the register pressure estimation implementation -
testing the implementation on libav micro benchmarks.
With the patch, some SMSed kernels in the put-h264-qpel8-hv-lowpass-8,
swscale-rgb24ToY_c and mjpegenc benchmarks are identified as having
register pressure.
I'm looking at the kernels which still have regressions with SMS and
it seems the reason is not related to register pressure.
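For readers unfamiliar with the idea: register pressure in a modulo-scheduled loop is commonly estimated as the largest number of values that are live in any one slot of the steady-state kernel (often called MaxLive). The sketch below is a generic illustration of that calculation, not the patch being tested here. The function name max_live, the half-open lifetime convention and the sample lifetimes are all assumptions made for the example.

```python
def max_live(lifetimes, ii):
    """Estimate register pressure of a modulo-scheduled loop.

    lifetimes: list of (def_cycle, end_cycle) pairs, half-open, in the
               flat (unrolled) schedule of a single iteration.
    ii:        initiation interval of the kernel.

    Returns (max_pressure, pressure_per_slot).
    """
    if ii < 1:
        raise ValueError("initiation interval must be at least 1")
    pressure = [0] * ii
    for start, end in lifetimes:
        # A value live at cycle t occupies a register in kernel slot
        # t % ii; lifetimes longer than ii overlap with themselves
        # because several iterations are in flight at once.
        for t in range(start, end):
            pressure[t % ii] += 1
    return max(pressure), pressure


if __name__ == "__main__":
    # Three hypothetical values in a kernel with II = 2.
    print(max_live([(0, 3), (1, 2), (2, 5)], ii=2))
```

If the maximum exceeds the number of registers available in the relevant class, the kernel can be expected to spill. When it does not, as in the cases described above, the cause of a regression has to be looked for elsewhere.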
Hi All,
This is a brain dump of what I learned about running LAVA today.
Dave will probably find a place for this in the Validation wiki, but
I'll pass it round in the meantime.
Hope it helps
Andrew
Hi,
* libunwind
* posted small bug fixes
* noticed the unwinding on Android is broken somehow
(need to track down the commit that broke it)
* linaro android
* repo sync fails due to invalid bionic commit id (#885792)
* tried to remotely attend the Connect
* +1 for having live streams of the plenaries
(http://video.ubuntu.com/live/)
* -1 for pointing us to the wrong grand sierra irc channels
(http://uds.ubuntu.com/participate/remote)
* icecast streams worked most of the time
(* public holiday on tuesday)
Regards
Ken
=== 64 bit atomics ===
* I got the race in membase down to a futex issue, and asking dmart
pointed me at a kernel bug that affects recent kernels, where a fix
had gone in about a month ago. That was a nasty one!
* I've still got a few bugs left; most are turning out to be timing
races in the test code (e.g. one that times out after 2 seconds while
the code takes around 1.7 seconds, so if anything else gets in the
way it trips over the line, and another one where it did a recv_from
on a socket but only got the start of a message, presumably because
the sender had used multiple sends). It's tricky going because the
tests are a combination of most scripting languages (perl, python,
ruby with a splash of Erlang). I've so far found no bugs in the
atomic code.
* I looked at apr and SDL-1.3, both of which use atomics but end up
not using 64bit atomics; the tendency is for them to ensure they can
do atomics on long and on a void*, both of which for us are 32bit.
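The partial-read race mentioned above (a single receive returning only the start of a message) is a general property of stream sockets: one receive call is not guaranteed to return everything the sender wrote. A minimal sketch of the usual remedy, reading in a loop until the whole message has arrived, is below. The length-prefixed framing and the helper names recv_exact, recv_message and demo are assumptions for illustration; the membase tests use their own protocol and languages.

```python
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes, looping over short reads."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before full message arrived")
        buf.extend(chunk)
    return bytes(buf)


def recv_message(sock):
    """Read one message framed as a 4-byte big-endian length plus payload."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def demo():
    a, b = socket.socketpair()
    try:
        payload = b"hello world"
        # The sender deliberately splits the message over several sends,
        # which is when a single recv() may see only part of it.
        a.sendall(struct.pack("!I", len(payload)))
        a.sendall(payload[:5])
        a.sendall(payload[5:])
        return recv_message(b)
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo())
```

A test that does one receive and then compares against the full expected reply will pass on a fast machine most of the time and fail intermittently on a slower one, which matches the symptoms described.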
=== String routines ===
* I've got the Newlib A15 optimised memcpy running in a test harness
at the moment for comparison.
=== Listening to connect ===
* I listened in to a few connect sessions each day; the 1st day or
so was 3/4 lost on audio systems that didn't work (I'm especially
annoyed at not being able to hear the QEMU for A15/KVM session and
toolchain support for kernel). The Rypple session was rather lost
through the lack of any screen share or slides.
Hello all,
I've been playing around with linaro and have it working on my
Pandaboard locally. I have a couple of questions about the linaro
environment; if this is the wrong forum, I'm happy to take it elsewhere.
I see that Linaro makes monthly releases of the hwpacks and images.
How are the packages/binaries in those images created? Are they
cross-compiled, or compiled natively on the target platform? If they
are cross-compiled, how is the environment created?
The reason I ask is that we've been looking at cross-compiling some
packages ourselves, and have been running into issues. So we were
wondering what toolchain the linaro community uses.
Thanks in advance,
--
Chris Lalancette
Hi,
- Finished rewriting SLP analysis to support not only unary and binary
operations. Committed upstream.
- Implemented cond_expr support in SLP (for libav weight_h264_pixels).
Testing it now.
- Vectorizer maintenance (test/bug fixes, patch reviews).
Ira
Testing an initial version of the implementation which estimates
register pressure in SMS on libav micro benchmarks.
I see 20% improvements in the mjpegenc microbenchmark and 11% on aacsbr-2
with SMS. However, swscale-rgb24ToY_c still has spills in the final code
although it requires a maximum of 64 VFP_REGS registers out of the
available 64 registers, so I'm trying to understand the reason for the
spill.
=== Progress ===
* Off for one day during the week for Diwali.
* Connect preparation - wrote down areas to look at during Connect and
tried to plan what we want to look at there.
* Looked at some of the cases with vcond<float> with Ira and helped
frame the blueprint.
* Investigated one of the big performance regressions in the popular
embedded benchmark and looked at why it wasn't being vectorized, only
to realize that it couldn't be. Thanks Ira. Still don't know why ARM
state is 22% faster than Thumb2 state.
* Spent some time looking at the issue with fPIC where GCSE appears to
remove a label, but made little progress.
=== Plans ===
* Connect ! next week and then vacation.
Absences.
* 26 Oct - Diwali
* 31st Oct - 4th Nov - Linaro Summit Orlando - Travel booked -
* 08 Nov - 11 Nov - Vacation booked
* Dec 19 - 31st Dec - Vacation booked
== 64 bit atomics ==
* I've been building and testing membase
* Version 1.7.1.1 source builds OK (after turning off -Werror due to
some of their curious type naming)
* The git version fails to build - it doesn't seem consistent
* 1.7.1.1 passes simple tests, but there are 3 tests in its test
suite that intermittently fail on ARM and seem to be solid on x86.
(There are also some that just require timeouts increased due to the
relatively slow machine.)
* t/issue_163.t turned out to be a timing race in the test itself,
made worse by being on a relatively slow machine and probably made
worse by the Panda's odd idea of timing. That was reported to them
with a breakdown of it, and upstream has fixed their test.
( http://code.google.com/p/memcached/issues/detail?id=230 )
* t/issue_67.t is proving tougher; once in a while memcached will
lock up during init in thread_init; there is one particular point
where adding a printf will make it work, apparently reliably. I've
got one or two ideas but I need to check my understanding of
pthread_cond_wait first.
* There is an assert I've seen triggered once - not looked at that yet.
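As a refresher on the pthread_cond_wait contract mentioned above: the waiter has to hold the mutex and re-check its predicate in a loop, because wakeups can be spurious and a signal sent before the wait begins is not remembered. The sketch below shows the pattern for a main thread waiting until all workers have finished initialising. It uses Python's threading.Condition as a stand-in for the pthread calls; start_workers and init_count are illustrative names and this is not memcached's code.

```python
import threading


def start_workers(nthreads):
    """Start nthreads workers and block until every one has initialised."""
    cond = threading.Condition(threading.Lock())
    state = {"init_count": 0}

    def worker():
        # Per-thread setup would happen here, before registering.
        with cond:
            # Update the shared predicate and notify under the same lock,
            # so the waiter cannot miss the change.
            state["init_count"] += 1
            cond.notify()

    threads = [threading.Thread(target=worker) for _ in range(nthreads)]
    for t in threads:
        t.start()

    with cond:
        # Always wait in a loop on the predicate, never on the wakeup
        # alone: this handles spurious wakeups and workers that finished
        # before the wait started.
        while state["init_count"] < nthreads:
            cond.wait()

    for t in threads:
        t.join()
    return state["init_count"]


if __name__ == "__main__":
    print(start_workers(4))
```

If the counter is updated or tested outside the lock, or the wait is not in a loop, the result is an occasional hang that small timing changes (such as an added printf) can hide.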
== String routines ==
* While I was off last week, my memchr and strlen were accepted into newlib
* Joseph has responded to my eglibc mail, with a couple of small queries.
== Other ==
* Wrote a more detailed test case for bug 873453 (odd timing
behaviour on panda); it's quite odd - I can get a timing discrepancy
of more than ~80ms, so it's not a clock granularity issue.
* Replicated a QEMU crash for Peter.
Dave
Hi,
* finished changing libunwind to be more portable
* tested patchset on ARM and X86_64
* now builds on Android without modifications
(Android.mk, config.h and libunwind-common.h are still required)
* verified that the modified debuggerd still works
* discussed backtracing using libunwind on ARM with Harald from the BSC
* they use libunwind in a sampling tool that generates Paraver
tracefiles
* started to upgrade my Linaro Android environment and ran into issues
* need to check:
* why building the toolchain using linaro-build.sh fails
* why repo sync fails due to invalid platform/bionic SHA1
* what happened to LEB-panda.xml
Regards
Ken
RAG:
Red:
Amber:
Green:
Current Milestones:
|| || Planned || Estimate || Actual ||
||a15-usermode-support || 2011-11-10 || 2011-11-10 || 2011-10-27 ||
||upstream-omap3-cleanup || 2011-11-10 || 2011-11-10 || ||
Historical Milestones:
||qemu-linaro-2011-07 || 2011-07-21 || 2011-07-21 || 2011-07-21 ||
||qemu-linaro 2011-08 || 2011-08-18 || 2011-08-18 || 2011-08-18 ||
||qemu-linaro 2011-09 || 2011-09-15 || 2011-09-15 || 2011-09-15 ||
||add-omap3-networking || 2011-10-13 || 2011-10-13 || 2011-10-13 ||
||a15-systemmode-planning || 2011-10-13 || 2011-10-13 || 2011-09-22 ||
== a15-usermode-support ==
* A15 instruction support patches committed upstream in time for
upstream's 1.0 release
== upstream-omap3-cleanup ==
* some work on restructuring the omap3 patchset -- it's now basically
in the right order and the last 'touches several different
bits of code' jumbo patch has been split
== other ==
* sent some patches upstream which address the main things I
want to get into qemu 1.0 (PL041 audio support and fixing a
regression in handling multithreaded programs in linux-user mode)
* A15 KVM planning work and other preparation for Linaro Connect
* finally tracked down the qemu-on-ARM memory corruption: we
mmap the code generation buffer at 0x1000000 with MAP_FIXED;
unfortunately this is now in the middle of glibc's heap...
(filed as LP:883133)
* qemu now has a coroutine implementation which defaults to using
makecontext() if it is present. Unfortunately ARM eglibc provides
an implementation which always returns ENOSYS, which is a bit
tricky to detect with a compile time configure check (without
breaking cross-compilation support).
* these two things (and some other known bugs) mean that QEMU on
ARM hosts is basically broken, and will probably continue to be
since we don't have the spare resource to test and fix bugs
(beyond those which we need to fix for KVM-on-ARM)
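To illustrate the MAP_FIXED problem described above: with MAP_FIXED the kernel places the mapping at the requested address and silently replaces whatever was already mapped there, which is how a fixed buffer can end up on top of the heap. Without MAP_FIXED the address is only a hint and the kernel picks another location if the range is in use. The sketch below shows the hint form via ctypes on a Unix host. It is one generic way to avoid this class of clobbering, not necessarily the fix adopted in QEMU, and map_with_hint and unmap are hypothetical helper names.

```python
import ctypes
import mmap

# Bind the C library's mmap/munmap (Unix only).
libc = ctypes.CDLL(None, use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]
libc.munmap.restype = ctypes.c_int
libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

MAP_FAILED = ctypes.c_void_p(-1).value


def map_with_hint(hint, size):
    """Map anonymous memory, passing 'hint' as a preference only.

    MAP_FIXED is deliberately not set, so an existing mapping at 'hint'
    (for example the heap) is never replaced.
    """
    prot = mmap.PROT_READ | mmap.PROT_WRITE
    flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS
    addr = libc.mmap(hint, size, prot, flags, -1, 0)
    if addr is None or addr == MAP_FAILED:
        raise OSError(ctypes.get_errno(), "mmap failed")
    return addr


def unmap(addr, size):
    """Release a mapping obtained from map_with_hint."""
    return libc.munmap(addr, size)


if __name__ == "__main__":
    size = mmap.PAGESIZE
    addr = map_with_hint(0x1000000, size)
    # The kernel may or may not have honoured the hint.
    print(hex(addr), "hint honoured" if addr == 0x1000000 else "relocated")
    unmap(addr, size)
```

The trade-off is that the caller can no longer assume the buffer sits at a known low address, so any code relying on that has to check the returned address.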
* Looked at how to configure Firefox and how to build different parts of the
program. Usage of .mozconfig, myrules.mk and myconfig.mk.
* Tested the Talos framework. https://wiki.mozilla.org/Buildbot/Talos. I
think it would be good to use Talos for the browsing benchmarks. We can
discuss it further at connect.
* Preparing for connect.
Best Regards
Åsa
Summary:
* Exercise crosstool-ng and summarize the gaps.
Details:
* Exercise crosstool-ng
(1) Sync with lp:~linaro-toolchain-dev/crosstool-ng/linaro.
(2) Try to config linux-host-baremetal-target and
mingw32-host-baremetal-target.
(3) Try to build the toolchain for both embedded toolchain and
linaro-gcc-4.6-2011.10 with the config.
. C compiler for linux and mingw32 hosts and c++ compiler for
linux host can be built without any change.
. C++ compiler for mingw32 host can be built after PCH is disabled.
. GDB-cross build fails due to missing dependency packages.
* Gaps in crosstool-ng
(1) Improve GDB-cross scripts to download and build the dependency
packages: expat and ncurses. Or add expat and ncurses as
companion_libraries.
(2) To remove dependencies, the embedded toolchain requires more
prerequisites like zlib.
New config and scripts are required to support the packages.
(3) Currently, the embedded toolchain source packages are released
as a tarball, which includes gcc, gmp, etc. New scripts are required
to support it.
(4) To make sure the toolchain can run with older glibc versions such as
those in redhat4/5, the embedded toolchain requires an older native
gcc (4.3.6) to build it.
To support it,
. Users can build the native gcc manually, or
. Enhance the scripts to add one step to build native gcc.
(5) All the default package configurations are different from
embedded toolchain internal build scripts.
Since the configurations in the embedded toolchain have been tuned
and tested, we will change the configurations in crosstool-ng if they
do not match and are not configurable.
The same rule will apply for the linaro toolchain.
Plans:
* Write scripts to re-pack the embedded toolchain source packages.
* Add support for all prerequisites in crosstool-ng menuconfig.
Thanks!
-Zhenqiang
Posted a patch upstream to fix big-endian for generic tuning. This was a
simple omission from my previous patches.
Merged GCC 4.6.2 to Linaro GCC. It's still in testing now, so I'll have
to commit it sometime over the weekend or next week.
Looked at the benchmark results from Spec2000 running on both A8 and A9
systems, with and without NEON, and with various compiler options. Posted
the results in a spreadsheet (visible within Linaro only).
Began making adjustments to generic tuning and started new spec2k runs
to see if they are beneficial. First, I'm trying A9 prefetch settings on
A8 to see how much damage it does. Next I'll try enabling the A8 NEON
tuning settings on A9 to see what happens there.
Prepared for travel next week.
Vacation Friday
Hi,
- Merged to gcc-linaro:
- widening shifts
- SLP features: support loads with different offsets and swap
operands if necessary
- Started rewriting SLP analysis to support operations with more than
two operands (towards SLP of conditions)
- Updated NEON presentation following Ramana's suggestions (thanks!)
- Suggested to Ramana to implement vcond with mixed types, created a
blueprint: https://blueprints.launchpad.net/gcc-linaro/+spec/vcond-with-mixed-types
- Vectorizer:
- updated vectorizer's webpage
- updated vectorizer's wiki page
- the usual maintenance
- Committed upstream two SLP data-ref analysis improvements: PR 50730
and PR 50819
Ira
Hi there. Connect is just around the corner. Have a look at:
https://wiki.linaro.org/MichaelHope/Sandbox/Q4.11Plans
for a summary of the toolchain sessions and hacking topics.
It would be great to have kernel and OCTO input in the ARM STM driver,
Kernel debugging, and KVM sessions.
-- Michael
Hi Folks,
Draft agenda for the performance meeting next week at Connect -
https://blueprints.launchpad.net/gcc-linaro/+spec/linaro-toolchain-performa…
Are there any topics that people would like to bring up during this
meeting other than the ones listed here ? I suspect that we'll
probably just have about 10-15 minutes for a topic in this case. I am
not considering discussing PGO related stuff in this session given
that we've got another session in which we can discuss this.
Thoughts ?
cheers
Ramana
Hi Folks,
I've been trying to capture what we want to do in terms of hacking
time and some of the performance related backlog that we have in the
system. I have done so here.
https://wiki.linaro.org/RamanaRadhakrishnan/Sandbox/Q411ConnectGCCPerfPlan
I'm on vacation tomorrow but should be picking up email for some time
during the day.
Any thoughts about what else we could be doing in this area, or whether
there's a better way we could use our hacking time?
cheers
Ramana
---
At the moment ARM eglibc doesn't support the functions declared
in ucontext.h: getcontext(), setcontext(), swapcontext() and
makecontext(). Instead you get implementations which always
fail and set errno to ENOSYS.
QEMU uses these functions to implement coroutines. Although there
is a fallback implementation in terms of threads, there are reasons
why using the fallback is suboptimal:
* its performance is worse
* it will be less tested, because x86_64 and i386 both implement
the ucontext functions and so QEMU on those hosts will be using
different code paths
* I'm not aware of a good way at configure time to detect whether
getcontext() et al will always fail without actually running a
test binary, which won't work in a cross-compile setup. (If eglibc
just didn't provide the functions at all this would be much
simpler...)
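To make the thread-based fallback mentioned above more concrete, here is a minimal sketch of the general technique: each coroutine runs in its own thread, and a pair of semaphores hands control back and forth so that only one side runs at a time. This is an illustration in Python, not QEMU's implementation (which is in C and may differ in detail); the class ThreadCoroutine and its methods enter and yield_ are hypothetical names.

```python
import threading


class ThreadCoroutine:
    """Coroutine emulated with a thread; only one side runs at a time."""

    def __init__(self, func):
        self._func = func
        self._resume = threading.Semaphore(0)    # caller -> coroutine
        self._yielded = threading.Semaphore(0)   # coroutine -> caller
        self._value = None
        self._started = False
        self.done = False
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        self._resume.acquire()       # wait for the first enter()
        self._func(self)
        self.done = True
        self._yielded.release()      # final hand-back to the caller

    def enter(self):
        """Run the coroutine until it yields or finishes."""
        if self.done:
            raise RuntimeError("coroutine already finished")
        if not self._started:
            self._started = True
            self._thread.start()
        self._value = None
        self._resume.release()       # let the coroutine thread run
        self._yielded.acquire()      # block until it hands control back
        return self._value

    def yield_(self, value=None):
        """Called inside the coroutine to return control to the caller."""
        self._value = value
        self._yielded.release()
        self._resume.acquire()


def count_to_three(co):
    for i in range(3):
        co.yield_(i)


if __name__ == "__main__":
    co = ThreadCoroutine(count_to_three)
    while True:
        value = co.enter()
        if co.done:
            break
        print(value)
```

Every switch in this scheme involves two semaphore operations and a scheduler round trip between threads, whereas a swapcontext()-based switch stays in user space, which is one reason to expect the fallback to be slower.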
We're going to care about performance and reliability of QEMU on
ARM hosts as we start to support KVM on Cortex-A15, so it would
be good if we could add ucontext function support to eglibc as
part of that effort.
Opinions? Have I missed some good reason why there isn't an
ARM implementation of these functions?
(I'm aware that the ucontext functions have been removed from
the latest version of the POSIX spec; however AFAIK there's no
equivalent functionality that replaces them so I think they're
still worth having implementations of for parity with other
architectures.)
-- PMM
=== Progress ===
* Some upstream patch review.
* Spent time looking at LP 836588 which is a case where CSE removes a
particular label access in one case but doesn't remove it from the
list of things in the constant pool which is quite bizarre. Will
probably need some help with looking into this one.
* Sent out vcvt.f32 and vcvt.f64 patches.
* Connect preparation - laptop cleanup and getting it finally onto an
x86_64 distribution.
* Looked at some of the vec_perm / vec_rev cases in Neon with Ira.
* Spent some time looking at some of Andrew's issues with generic-v7a
tuning especially the cases where it was doing better and gave some
suggestions.
=== Plans ===
* Prepare for Connect.
* Prepare by looking at some of the large differences between
various comparative benchmarks.
* Some research into PGO related stuff.
* Try to upstream some more of my patches in the backlog before the
end of the week.
* Finish off some internal paperwork.
* I'm off on 26th - Wednesday.
Absences.
* 26th Oct - Day off.
* 31st Oct - 4th Nov - Linaro Connect Q4.11
* 08 Nov - 11 Nov - Tentatively booked
* Dec 19 - 31st Dec - Tentatively booked