== This week ==
* Submitted the fix for the Qt miscompilation upstream. Applied after
approval.
* Submitted a patch for the Thumb LDR problem that Dave Martin hit.
This was rejected.
* Ended up spending a few days on the "unreasonable amount of memory
while compiling qemu" bug due to unfamiliarity with the DWARF 2 code.
I realise the original idea was that I'd just file this upstream,
but it was one of those cases where I kept finding out more info
for the bug report until the problem became obvious.
I've now submitted two patches for this upstream. The first was trivial
and is now in. I was asked to add a bit of extra code to the second,
which I hope to do next week.
* Looked at the MIPS bug that was reported against the Linaro toolchain.
This turned out to be a problem in our extension elimination pass.
Submitted a merge request for that.
* Got confirmation from ARM that we should use relocation number 160
for R_ARM_IRELATIVE, and that it was OK to make the changes public
(thanks!). I've now submitted the binutils patches upstream.
I'll do the eglibc ones when I get back.
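For anyone wiring the new number into their own tools, here is a minimal sketch in C of how an ELF32 relocation type is packed into and read back from r_info. The EXAMPLE_ and example_ names are made up for illustration (real code would use the definitions in elf.h); only the value 160 comes from the report above.

```c
#include <stdint.h>

/* Relocation number for R_ARM_IRELATIVE as reported above.
   The EXAMPLE_ prefix marks this as an illustrative name. */
#define EXAMPLE_R_ARM_IRELATIVE 160u

/* ELF32 stores the symbol index in the upper 24 bits of r_info
   and the relocation type in the low 8 bits. */
static uint32_t example_elf32_r_info(uint32_t sym, uint32_t type)
{
    return (sym << 8) | (type & 0xffu);
}

static uint32_t example_elf32_r_type(uint32_t info)
{
    return info & 0xffu;
}

static uint32_t example_elf32_r_sym(uint32_t info)
{
    return info >> 8;
}

/* True if the relocation entry is an IRELATIVE one. */
static int example_is_irelative(uint32_t info)
{
    return example_elf32_r_type(info) == EXAMPLE_R_ARM_IRELATIVE;
}
```

Since 160 fits in the 8-bit type field, no change to the r_info layout is needed.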
== Next week ==
Holiday!
Richard
Hello,
Testing the patch for SMS to support targets
whose doloop part is not decoupled from the rest of the loop's
instructions, as SMS currently requires.
The testing includes bootstrapping on ARM machines for the C language,
configured with and without the --with-arch=armv7-a option, and using the
"-O2 -fmodulo-sched -fmodulo-sched-allow-regmoves -fno-auto-inc-dec
-funsafe-math-optimizations -mthumb" flags.
Thanks,
Revital
On Monday, I was asked to find out whether the fix for GCC Bugzilla
PR43137 was present in our source base.
I can confirm that it is *not* present.
Apologies for the delay.
Andrew
Hi All,
Up until now, I have had no choice but to test toolchain correctness on
A8 hardware. It made sense to use the same -mfpu settings as the
Linaro/Ubuntu package builds use. This did not match the policy that the
interesting platform was A9-NEON, but I didn't have that option.
That's changed now - our Panda boards have arrived! Yay! :)
But, it seems to me that if I change to using the Pandas for correctness
testing (not performance testing) then I won't be testing what Ubuntu
will actually use.
So what should I test on?
I'd rather not double my test load by testing on both, but that is an
option ....
Any suggestions?
Andrew
Hi,
On Mon, Feb 28, 2011 at 9:19 PM, Nicolas Pitre <nicolas.pitre(a)linaro.org> wrote:
> On Thu, 24 Feb 2011, John Rigby wrote:
>
>> The resulting kernel builds and boots but some modules have problems:
>>
>> $ modprobe fat
>> fat: unknown relocation: 102
>> FATAL: Error inserting vfat
>
> A workaround for what appears to be a binutils bug has been merged in
> linaro-2.6.38. So the Thumb2 kernel testing may resume on trusted
> targets.
Thanks for merging it.
It's a bit ugly to turn off compiler optimisations to work
around this though, so we might encounter upstream resistance to that patch.
In any case, we still need someone to take a look at the possible
tools issue -- CC'ing linaro-toolchain in case people aren't aware:
https://bugs.launchpad.net/binutils-linaro/+bug/725126
Cheers
---Dave
On 25 February 2011 22:28, Alexander Sack <asac(a)linaro.org> wrote:
> On Wed, Feb 23, 2011 at 8:28 AM, Jim Huang <jserv(a)0xlab.org> wrote:
>> I would like to make a proposal about utilizing Linaro toolchain for
>> Android and NDK (Native Development Kit)[1].
Added linaro-toolchain list in Cc.
>> ** Motivation
>>
>> There are some different perspectives between Linaro toolchain and
>> Google Android toolchain including technical and
>> non-technical considerations. It doesn't really work if we only
>> replace prebuilt toolchain with Linaro toolchain because
>> of the compatibility of Android system utilities such as ELF
>> prelinker. Also, since Android is developed in relatively closed
>
> I don't have enough background to understand this "ELF prelinker"
> stuff. Are you saying that because of the way how android links stuff
> we cannot have one code base for gcc that works for both, android and
> "normal libc linux"?
>
Take Bug #707487 for example:
https://bugs.launchpad.net/binutils-linaro/+bug/707487
It is evident that Android's system utilities like soslim (a "strip"
implementation) and apriori (a "prelink" implementation) expect the
specific output of the GNU Toolchain, but that output sometimes differs
when we use Linaro's toolchain.
>> environment (Google style open source model), a great amount of
>> software components are not always verified by different
>> toolchain or build configurations. This proposal attempts to
>
> ack. thats what we want to do. Of course, we cannot really verify what
> is going on behind closed walls, but we can continuously build android
> with our toolchain and fix issues due that in android public master
> and if even that doesn't work we can ensure that our android trees
> always work nicely with both, our gcc and android gcc.
>
The Android team is known to be working in this area already.
> Another thing is to make our toolchain easily consumable (like the NDK
> you mention at the bottom); this will increase chances that someone
> from google can eventually take a look at what we are doing etc. and
> also helps the community to use linaro toolchain to built their
> android distributions.
Agree.
>> establish the compact development flow to enable Linaro
>> optimized ARM toolchain to build Android from scratch and verify it
>> transparently. Eventually, Android can be the reference
>> indicator as Linaro toolchain performance and reliability.
>>
>>
>> ** Brief introduction to Google Android toolchain
>>
>> Inside Google, there is a dedicated compiler team working on GNU
>> Toolchain for various purposes including server-side
>> computing, Android, Chrome OS, etc. Google engineers submit patches to
>> upstream for public review and maintain the
>> toolchain for Android. Along with each Android Open Source Project
>> (AOSP) release, there is a special branch in korg
>> GIT [2] for hosting the GPL'd toolchain source code modified by
>> Google. Usually, file "README.google" mentioned the
>> summary, but it is not developer friendly because several changes were
>> done within one GIT commit.
>>
>> Please refer to wiki for details:
>> https://wiki.linaro.org/Platform/Android/UpstreamToolchain
>>
>
> thats a good wiki page. thanks for the content. If I read the skia
> example correctly, we could add a test to our "normal" abrek testsuite
> that uses our daily android toolchain and run the skia benchmark? e.g.
> we could start doing this benchmarking even without having a
> validation solution ready for android targets?
>
If adb works well on the target, then you can easily use the "bench.py"
script mentioned in the above wiki to run several benchmarks.
> Please let's talk to Paul how we can get the android toolchain to
> /opt/android as part of abrek and lets try to add this to our abrek
> testsuites. Until we have daily toolchain builds it would be OK to
> download the android toolchain tarball from a fixed place from
> people.linaro.org I guess.
>
Ok!
>> ** What's wrong with Android upstream Toolchain?
>>
>> In my opinion, the problems are as follows:
>>
>> (1) Little information about Google's improvements: sometimes, we have to
>> guess from terse GIT commit logs
>> such as "commit gcc-4.4.3 which is used to build gcc-4.4.3 Android
>> toolchain in master"[3]. Such changes are hard to track and
>> verify carefully.
>
> yes, that feels like a messy situation. Do we know why they don't
> commit the changes as individual commits but then in next step
> document what they changed?
I don't know exactly, since I am just an observer of Android's GIT tree.
Google engineers do send patches to FSF/GNU, but they are not always related to
the GIT activities we have seen in korg.
You can search for the keyword "submit" in the file gcc/gcc-4.4.3/README.google, and
you will see descriptions such as the following:
gcc/cp/cp-lang.c
gcc/gimple.c
gcc/langhooks-def.h
gcc/langhooks.h
gcc/langhooks.c
gcc/tree-flow.h
gcc/tree-ssa-dce.c
gcc/testsuite/g++.dg/tree-ssa/vptr_init_dse.C
gcc/testsuite/g++.dg/tree-ssa/vptr_init_dse2.C
Enhancing dead code elimination to eliminate
useless vptr field initialization.
Owner: davidxl
Status: not submitted
gcc/fold-const.c
gcc/Makefile.in
Fix 2045297
Owner: davidxl
Status: Not submitted
There is too little information for us to track, since the above
"Fix 2045297" appears to refer to a Google bug database number rather
than an FSF one.
>> (2) Google specific improvements are absent in recent release, only
>> enabled months later. For example, Google Compiler
>> Team Lead, Dr. Shih-wei Liao, presented the improvements against GNU
>> Toolchain in the middle of 2009.[4]. The report
>> came with several impressive improvements like FDO (Feedback Directed
>> Optimizations) and IPO (Inter-Procedure
>> Optimizations). However, only some of them were made public in AOSP, and
>> they were integrated as late as the middle of 2010 (Android
>> Froyo; 2.2). Even though FDO was already merged in Android Froyo, there
>> is little documentation and no robust method for community members
>> such as Linaro engineers to verify it.
>
> you say that they don't publish the code for lets say the
> "gingerbread" toolchain in a timely fashion when they release
> gingerbread? Or do they ship a separate "fast" NDK/prebuilt for
> partners through secret channels?
I have no idea.
>> (3) For some reasons, Google tends to deliver stable (old) toolchain
>> plus mainline backport. It is a safe and workable approach,
>> but sometimes developers would expect to use the latest technologies
>> as Linaro aims to bring to the world.
>>
>> (4) Little readable documentation. For example, Google already opened its
>> toolchain benchmark suite in early 2010, but there was
>> no documentation specific to such important components. Furthermore, one
>> file required by the automated benchmark process was missing from the
>> public korg GIT. One year later, a Google engineer finally
>> put it back in public. This implies the unusual way
>> Google developed and delivered software.
>>
>
> Assuming good faith I would think this might just have been an oversight.
>
> Do you know if anyone from community pointed this out to google using
> official android mailing lists/groups or a bug?
>
Google engineers sometimes pick up the issues from Google Code:
http://code.google.com/p/android/issues/list
And, they do discuss on mailing-list:
http://developer.android.com/community/
>> ** Linaro's Approach to enable latest technologies
>>
>> Linaro android team tries to do:
>> (1) Document Android toolchain and related utilities in korg GIT as
>> much as we can.
>
> That's good stuff and I think your wiki page is already a great
> contribution in that direction. What we should do though is run this
> through google eyes early by using official android mailing lists.
>
Got it.
>> (2) Early adaptation of Linaro toolchain to Android build system and
>> verify these output systematically.
>
> ack. Do you know if those changes would be conflicting with what we do
> on "normal" linux side? e.g. do we need to maintain special android
> patches or can we merge those into our main trees?
>
In fact, GCC 4.6 already merges Android specific patches with the help of
CodeSourcery. We would initially backport these patches to linaro-gcc-4.5
branch for review. Luse Cheng already did it.
However, other parts are not related to Android directly, and they might be
too aggressive for generic GCC optimization, which could be the reason why
Google didn't submit them first.
>> (3) Backport Google changes to Linaro GCC and review in public.
>
> This is really tricky as you said. Here again, we should propose this
> on android mailing lists to maybe get feedback from google team and
> maybe improve the way we work on that. Untangling a big patch based
> just on changelog feels really unefficient.
Ok, I got your point. However, what we need is to create workable combination
of Linaro kernel + Linaro toolchain for Android integration engineers.
Alexander, I need your help to catch the attention of someone at Google.
> Also, we have to remember that if we pick changes out of _their_ tree,
> we cannot upstream those to fsf because we don't own copyright to
> those. Of course, for stuff they already pushed to 4.6 its not a
> problem to backport them from fsf trunk.
Thanks for pointing that out.
>> (4) Improve the deployment and validation flow by means of Linaro
>> infrastructure.
>
> my understanding is this:
>
> 1. we add support to build android toolchain from linaro branches to
> our cloud build service
> 2. we do this so that we either produce a full toolchain tarball that
> can be installed under /opt/android or a NDK tarball (or both)
NDK doesn't need admin permission to install.
> 3. we improve our android platform build infrastructure to allow
> using latest daily toolchain tarball and then we build android with a)
> google toolchain and b) linaro toolchain; in this way we get daily
> android builds for both toolchains that can go into the linaro
> validation farm and get the typical validation/testing and
> benchmarking done.
Yes, it would be great.
>> (5) Build and test Android system with Linaro tools. Then, figure out
>> the regressions caused by Linaro Toolchain and/or
>> aggressive optimizations
>
> right. I think that's covered with the point above, no? The android
> builds done with our toolchain would also be available in public, so
> you can do whatever you want on top of what we already
> test/validate/measure automatically in the validation farm with them.
Agree.
>> (6) measure performance gain by Linaro tools
>
> right. for this we need to define a set of open-source benchmarks to
> run and ensure that those are supported in our validation framework.
>
>> The detailed specification in wiki:
>> https://wiki.linaro.org/Platform/Android/Specs/LinaroAndroidToolchain
>>
>> ** Implementation of Linaro toolchain for Android
>>
>> We started from Android style toolchain build and move to Linaro GCC +
>> ARM specific optimizations in mind. The initial work
>> can be obtained by wiki:
>> https://wiki.linaro.org/Platform/Android/Toolchain
>>
>> We plan to maintain the following GIT repositories at least:
>> * android/toolchain/build.git : Linaro-aware build system. Derived
>> from Android toolchain build system, it can handle Linaro-GCC
>> and Linaro snapshot/bzr.
>> * android/toolchain/gcc-patches.git : Patchset to be applied on top
>> of Linaro-GCC release/snapshots
>
> I think thats fine. however, how do we ensure that we have patches
> that always apply to both release/snapshots? do we maintain branches
> for gcc-patches.git in case you need two versions of patch X if the
> linaro gcc codebase diverged?
I might need help from toolchain WG.
>> The reference builder script output:
>> $ ./linaro-build.sh --help
>> --prefix-dir= Specify where to install (default:
>> /tmp/android-toolchain-eabi)
>> --gcc-src-dir= Specify where linaro gcc source is (in <toolchain>/gcc)
>> --apply-gcc-patch=(yes|no) Apply-patch which in
>> <toolchain>/gcc-patches directory (default: no)
>>
>> Current verified combinations:
>> * gcc-linaro: 4.5-2011.02-0
>> * binutils: 2.20.1
>> * gmp: 4.2.4
>> * mpfr: 2.4.1
>>
>> Only gcc is replaced by gcc-linaro: 4.5-2011.02-0 and others are
>> checked out from korg GIT.
>
> do we need to do something like --gcc-src-dir and -patches for
> binutils, gmp and mpfr as well? or would we be only interested in
> improving/fixing gcc for now?
>
I think focusing on linaro-gcc is pretty good. We can follow Google's
original combination.
> Maybe we also want to support protocol schemes like git: http: and
> bzr+ssh:/lp: for the --gcc-src= argument. this would then
> automatically download/branch the source tree from the given location.
> What do you think?
Agree.
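To make the idea concrete, here is a rough sketch of the scheme dispatch. It is written in C only to keep it self-contained (the real builder is a shell script), and the enum and function names are hypothetical; the prefixes are the ones listed in the quoted suggestion.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of the value passed as the gcc source. */
enum src_kind { SRC_LOCAL_DIR, SRC_GIT, SRC_HTTP, SRC_BZR };

/* Match the argument against the known scheme prefixes; anything
   without a recognised prefix is treated as a local directory. */
static enum src_kind classify_gcc_src(const char *arg)
{
    static const struct {
        const char *prefix;
        enum src_kind kind;
    } table[] = {
        { "git:",     SRC_GIT  },
        { "http:",    SRC_HTTP },
        { "bzr+ssh:", SRC_BZR  },
        { "lp:",      SRC_BZR  },
    };

    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strncmp(arg, table[i].prefix, strlen(table[i].prefix)) == 0)
            return table[i].kind;
    }
    return SRC_LOCAL_DIR;
}
```

The builder would then clone, download, or branch according to the returned kind before continuing with the existing local-directory path.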
>> ** Summary of gcc-patches
>>
>> "gcc-patches" are used as "backport" from Google changes into Linaro
>> gcc base. Here is the summary at present:
>>
>> 0001-Add-linux-android.patch
>> Add linux-android
>>
>> 0002-Add-support-for-Bionic-C-library.patch
>> Add support for Bionic C library
>>
>> 0003-Support-compilation-for-Android-platform.patch
>> Support compilation for Android platform
>>
>> 0004-Add-multilib-configuration-for-arm-linux-androideabi.patch
>> Add multilib configuration for arm-linux-androideabi
>>
>> 0005-Fix-gthr-posix.h-to-support-Bionic.patch
>> Fix gthr-posix.h to support Bionic
>>
>> 0006-Add-untested-support-for-Bionic-to-libstdc.patch
>> Add [untested] support for Bionic to libstdc++
>>
>> These patches are taken from Maxim Kuvyrkov of CodeSourcery in gcc-4.6
>> branch. Of course, we can always add changes by
>> Google or other Android specific adaptation by this model.
>
> Can we get a toolchain example tarball done and uploaded to
> people.linaro.org? I would like to verify that those work out of the
> box with gingerbread and if so, i would like to see those land in the
> main toolchain WG branch rather than adding them to our gcc-patches
> tree.
Yes, I would like to do that later.
>> ** Planned improvements over Linaro toolchain for Android
>>
>> (1) GCC multilib setting
>> Default: arm, fpu and thumb. The prebuilt google toolchain use:
>> armv5te and mandroid. We should focus on ARMv7.
>> (2) HardFP-ABI Support for Android.
>> (3) Patch management: Better to get the Android patches into
>> Linaro-GCC tree eventually.
>> (4) Build system improvement. Don't have to build gmp, mpfr everytime,
>> and provide option to build without gdb.
>> (5) Enable LTO (Link Time Optimization, introduced since gcc-4.5) in
>> Android TARGET_GLOBAL_CFLAGS
>> (6) Verify the functionality of FDO (Feedback Directed Optimization)
>> and introduce the approaches to integrate.
>
> I really think those topics should be executed by the toolchain WG
> rather than in platform. I am happy that we give them guidance and
> support them by providing them with easy to use tools to get their job
> done. Also feeding them with topics is great. Please talk to Michael
> Hope and ask him how he wants to collect those android toolchain
> optimization topic ideas. Could be good input for our 11.11
> requirements gathering process.
Agree.
>> ** Toward Android NDK
>>
>> Once Linaro toolchain for Android is ready to use, it is time to
>> re-package Android NDK by Linaro toolchain. To do that, extra
>> build configuration, sysroot, is required. According to Android
>> Release Cycle & Phases[5], the repacked NDK should be verified
>> one month after Android public release.
>
> That sounds like a great idea. What's the benefit/difference of
> shipping an NDK compared to just shipping a "normal" toolchain binary
> tarball for this purpose?
NDK consists of some architecture specific helper scripts/headers to indicate
the optimization flags and some combinations such as ARMv7 with/without
NEON, etc.
If we provide NDK directly, users don't have to consider the above integration
issues as far as I know.
Sincerely,
Jim Huang (jserv)
Android Team
Temporarily took over Tech Lead of the Toolchain Working Group while
Michael Hope recovers from the Christchurch earthquake. (He's fine, but
unable to work.) This didn't actually require any action, in the end.
Michael returned to work towards the end of the week.
Forward ported, benchmarked, and posted one of Mark Shinwell's NEON
patches upstream.
Further benchmarking was not possible as the Panda board I was using is
located in Christchurch, NZ.
Merged and tested the FSF GCC 4.5 branch into Linaro GCC. There were a
couple of test regressions in the fortran testsuite, so I've filed bug
lp:723086. The other test results were either the same or better.
Benchmarked the ARM A8 function/jump alignment patch to see what effect
it has in GCC 4.6. Found no measurable improvement in EEMBC. I suggest
dropping this patch.
Brought the patch tracker up-to-date, and entered tracking tickets for
all outstanding patches.
Merged FSF trunk to Linaro GCC 4.6.
Committed Jie's Thumb2 testcase fix to FSF GCC trunk. Thanks to Ramana
for using his new found authority to approve it.
Investigated the suitability of several of the patches for
forward-porting. Corresponded with Bernd and Julian.
----
Upstream patches requiring review:
* Thumb2 constants:
http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html
* Kazu's VFP testcases:
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00128.html
* ARM EABI half-precision functions
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00874.html
* ARM Thumb2 Spill Likely tweak
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00880.html
* NEON scheduling patch
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html
== Last week ==
* Launchpad #721021 GCC ICE on ARM/XScale: identified as case of
upstream PR45177; backported and pushed to Linaro.
* Launchpad #709453/CS Issue #7122: Neon vmov 0.0 issues; some progress
on my current WIP patch, but tests showed another 3 regressions, still
on-going.
* Launchpad #711819/GCC PR47719: ICE in push_minipool_fix. Ramana
reminded me that the pool range attributes added by my patch were
actually removed earlier by Bernd in the fix for PR43137. Discussed and
mostly concluded that we should add them back for now. Will re-submit
patch with testcase to gcc-patches this week.
* Coremark ARMv7-A regressions: still work in progress.
== This week ==
* TW Public Holiday Feb.28 (Mon).
* Ping some of my upstream patch submissions.
* Get incomplete issues done.
* Coremark regression investigation.
Hello Linaro toolchain guys,
I have a few questions regarding full GCC support for the ARM Cortex-M4.
I'm especially thinking of the additional DSP instructions: are these
supported, and how optimal is the code being produced?
Thanks for your support,
Best Regards
Christian (ST-Ericsson)
Hi,
== Investigate developer tools ==
* Finished latrace investigation.
== PandaBoard ==
* The defective PandaBoard that was sent back in December is now repaired and
on my desk again. It doesn't show the behaviour of #708883 and works
flawlessly so far. :)
== libunwind ==
* Did some debugging of the test-async-sig testcase to get started with
libunwind. It will dead-lock if you add "--enable-debug" since libunwind does
printfs in this case which are not signal safe.
* Sorted out which of Zach's patches are upstream and which are not.
* Started to learn about the different unwind methods that libunwind provides
on ARM.
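As a side note on the dead-lock: printf is not async-signal-safe, so if a signal arrives while stdio's internal lock is held and the handler prints again, it can wait on that lock forever. A minimal sketch of a signal-safe alternative is below. The helper name is hypothetical and this is not libunwind's actual debug code; it only relies on write(2), which POSIX lists as async-signal-safe.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical debug helper: emits a fixed message using only write(2).
   No stdio, no locks, no allocation, so it is usable from a handler. */
static ssize_t debug_log_signal_safe(int fd, const char *msg)
{
    size_t len = 0;

    while (msg[len] != '\0')   /* manual length, avoids library calls */
        len++;
    return write(fd, msg, len);
}
```

The cost is that formatted output (numbers, addresses) has to be converted by hand, which is presumably why debug builds reach for printf in the first place.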
Regards
Ken
== ffi ==
* Sent variadic patch for libffi to libffi-discuss
* Worked through some suggestions from Chung-Lin, need to do some rework
== string routines ==
* memchr & strchr patch sent for inclusion in ubuntu packages
* tried sqlite's benchmarks - they don't spend too much time in the
C library; although
a few % in memcpy, and ~1% in memset (also seem to have found an
sqlite test case failure on
ARM and filed as bug 725052)
== porting jam ==
* There wasn't much traffic on #linaro this week related to the jam
* I closed bug 635850 (fastdep FTBFS) which was already fixed with
an explicit fix for ARM in the changelog, and bug 492336 (eglibc's
tst-eintr1 failing) which seems to work now but it's not clear when
it was fixed.
* Looking at eglibc's test log there seem to be a bunch of others
that are failing and may well be worth investigating.
* bug 372121 (qemu/xargs stack/number of arguments limit) seems to
work ok, however the reporter did say it was quite a fragile test;
that needs more investigation to see whether the original reason
has actually been fixed.
== misc ==
* swapping notes with Peter on the PBX SD card investigation
Dave
RAG:
Red:
Amber:
Green:
Current Milestones:
| Planned | Estimate | Actual |
qemu-linaro 2011-03 | 2011-03-08 | 2011-03-08 | |
Historical Milestones:
finish virtio-system | 2010-08-27 | postponed | |
finish testing PCI patches | 2010-10-01 | 2010-10-22 | 2010-10-18 |
successful ARM qemu pull req | 2010-12-16 | 2010-12-16 | 2010-12-16 |
finish qemu-cont-integration | 2011-01-25 | 2011-01-25 | handed off |
first qemu-linaro release | 2011-02-08 | 2011-02-08 | 2011-02-08 |
== maintain-beagle-models ==
* rebased qemu-linaro on upstream
* checked omap_uart model for any issues with enabling the extended
(non-16550A) features which the new Linux drivers need. Sent meego
merge request for patchset which turns on the features, and does
a little cleanup. Now in meego, qemu-linaro.
== merge-correctness-fixes ==
* reviewed versions 5 and 6 of Christophe's vrecpe/vsqrte patchset;
v6 was good and has now been committed
* sent a version of "dummy cp14 debug registers" patch upstream;
however I've realised it triggers a false positive in the
temp-leak debugging code in target-arm/translate.c
* wrote/sent a patch which moves this temp-leak debugging code
into TCG proper (which I think makes it much simpler and cleaner
and avoids the false positives mentioned above)
* some work on the cp15 performance counter registers. I now
have some code which I think is a fully architecturally valid
implementation of an "implements no events" core, except that
we don't implement the cycle count register.
* started testing/review of Adam's VA-to-PA translation regs patch.
In the course of this discovered that qemu unconditionally
implements an ARM940 cp15 WFI register which clashes with these;
submitted patch to add correct not-for-v6/v7 feature gating.
* sent out patch fixing usermode seeks by 32 bit guest on 64 bit
host (based on a diagnosis and suggested fix by Eoghan Sherry)
* sent patch fixing compile error in vnc code
== vexpress model ==
* sent a patchset for fixing the MMC card detect wiring on
PBX upstream; this is needed for vexpress too
* finished vexpress cleanup and cross-checking against the docs; I
now have a patchset I'm happy to upstream and will post next week
== other ==
* took part in pgp keysigning event with emdebian folks
* meetings: toolchain, PDSW-tools
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
17/18 March: QEMU Users Forum, Grenoble
Holiday: 22 Apr - 2 May
9-13 May: UDS, Budapest
(maybe) ~17-19 August: QEMU/KVM strand at LinuxCon NA, Vancouver
== GDB ==
* Worked with Will Deacon and the Linaro kernel team to
make sure HW watchpoint and Versatile Express errata
fixes are included in the upcoming Linaro kernel release.
* Committed GDB HW watchpoint patches to mainline, and
backported them to Linaro GDB. This completes work on the
HW watchpoint blueprint.
* Worked on fixing the GDB part of #620611 (Unable to
backtrace out of vector page 0xffff0000). Posted
(two versions of) mainline patch for discussion.
* Worked on kernel patch for #615974 (Interrupted system
call handling).
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
== This week ==
* Looked at the poor code generated for Neon load/store intrinsics.
Looked into the history behind the treatment of VFP registers by
CANNOT_CHANGE_MODE_CLASS. Peter confirmed that the restrictions
apply only to VFPv1. Wrote a patch to improve the code, which
partly overlapped with Julian's.
* Looked at how the operations should be represented at the tree level.
Experimented with various combinations of tree codes and types
to see which felt right. Wrote this up in the message I sent today.
== Next week ==
* More vectorisation.
* Submit some queued patches.
* Maybe some bug fixing. (I see there's a reload bug just waiting
to be claimed by a lucky developer.)
On holiday the following week.
Richard
Services at ex.seabright.co.nz are back up.
On Tue, Feb 22, 2011 at 10:06 PM, Michael Hope <michael.hope(a)linaro.org> wrote:
> Hi there. We've had an earthquake. Family and friends are fine but i'll be
> unavailable for a few days. Services on ex.seabright.co.nz are down. I'll
> cancel Wednesdays standup call.
>
> See you soon,
>
> -- Michael
Hello,
Implemented a patch for SMS to support targets whose doloop part is
not decoupled from the rest of the loop's instructions (which is the
current assumption of SMS). ARM is an example of such a target, where the
loop's instructions might use the CC reg, which is also used in the doloop part.
Now testing the patch on ARM and other targets that have do-loop.
Thanks,
Revital
Hi,
* vectorizer cost model
- implemented builtin_vectorization_cost for NEON
- added register spilling considerations to the cost model
- started testing/tuning on EEMBC Telecom and DenBench (for now I
have only two examples for spilling: fdct_int32 mp4encode that
shouldn't get vectorized and viterbi that should)
* measured vectorization impact on Telecom autcor - it's about 5x
(initially I got run time segfault, but the bug is already fixed on
GCC trunk, I'll have to check gcc-linaro-4.5 as well)
* NEON-vs.non-NEON degradation
- started to look at aes. There are 6 loops that get vectorized with
4.6 (due to this patch
http://gcc.gnu.org/ml/gcc-patches/2010-05/msg01927.html that allows
cond_expr in number of loop iterations expressions) and vzip/vuzp
patch, but not with gcc-linaro-4.5. But it doesn't explain the
degradation of course.
- I don't understand mp4decodepsnr improvement, since I don't see
any loops or basic blocks vectorized.
Ira
One of the vectorisation discussions from last year was about the poor
code GCC generates for vld{2,3,4}_*() and vst{2,3,4}_*(). It forces the
result of the loads onto the stack, then loads the individual pieces from
there. It does the same thing in reverse for stores.
I think there are two major problems here:
1. The result of the vld*() is a record type such as:
typedef struct int16x4x3_t
{
int16x4_t val[3];
} int16x4x3_t;
Ideally, we'd like one of these structures to be stored in a pseudo
register. However, the ARM port currently limits in-register
record types to 64 bits, so something this big is always given
BLKmode and stored on the stack.
A simple "fix" for this is to increase MAX_FIXED_MODE_SIZE.
That would do the right thing for the structures in arm_neon.h,
but wouldn't be safe in general.
2. The vld*() returns values as a single integer (such as EI mode),
while uses of the value will typically be in a vector mode such
as V4SI. CANNOT_CHANGE_MODE_CLASS doesn't allow direct
"mode-punning" between the two in VFP_REGS, so this again
forces the punning to be done on the stack.
The code in question is:
/* FPA registers can't do subreg as all values are reformatted to internal
precision. VFP registers may only be accessed in the mode they
were set. */
#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS) \
(GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO) \
? reg_classes_intersect_p (FPA_REGS, (CLASS)) \
|| reg_classes_intersect_p (VFP_REGS, (CLASS)) \
: 0)
However, the VFP restriction appears to be specific to VFPv1 --
thanks to Peter for the archaeology -- and isn't a problem for v6+.
In that case, removing this restriction is an important optimisation.
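As a rough standalone model (plain C, not GCC source; the struct and function names are made up for illustration), the current rule and a VFPv1-only variant can be compared like this. The fpu_rev parameter mirrors the arm_fpu_desc->rev test used in the patch below.

```c
#include <stdbool.h>

/* Hypothetical stand-in for "does CLASS intersect FPA_REGS / VFP_REGS". */
struct model_class {
    bool intersects_fpa;
    bool intersects_vfp;
};

/* Current rule: any size-changing mode change is refused for
   classes that intersect FPA_REGS or VFP_REGS. */
static bool model_cannot_change_mode_old(unsigned from_bytes,
                                         unsigned to_bytes,
                                         struct model_class cls)
{
    return from_bytes != to_bytes
           ? (cls.intersects_fpa || cls.intersects_vfp)
           : false;
}

/* Relaxed rule: the VFP part only applies when the FPU revision is 1. */
static bool model_cannot_change_mode_new(unsigned from_bytes,
                                         unsigned to_bytes,
                                         struct model_class cls,
                                         bool target_vfp, int fpu_rev)
{
    return from_bytes != to_bytes
           ? (cls.intersects_fpa
              || (target_vfp && cls.intersects_vfp && fpu_rev == 1))
           : false;
}
```

For a wide integer value viewed as an 8-byte vector in a VFP class (say 24 bytes, assuming that is the size of EImode), the old predicate refuses the mode change, while the relaxed one allows it unless the FPU revision is 1.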
I tried the patch below on the following simple testcase:
#include "arm_neon.h"
void
foo (uint16_t *a)
{
uint16x4x3_t x, y;
x = vld3_u16 (a);
y = vld3_u16 (a + 12);
x.val[0] = vadd_u16 (x.val[0], y.val[0]);
x.val[1] = vadd_u16 (x.val[1], y.val[1]);
x.val[2] = vadd_u16 (x.val[2], y.val[2]);
vst3_u16 (a, x);
}
(not necessarily sensible!). Before the patch, -O2 produced:
sub sp, sp, #48
add r3, r0, #24
vld3.16 {d16-d18}, [r3]
vld3.16 {d20-d22}, [r0]
add r3, sp, #24
vstmia sp, {d20-d22}
vstmia r3, {d16-d18}
fldd d19, [sp, #8]
fldd d16, [sp, #0]
fldd d17, [sp, #24]
fldd d20, [sp, #32]
vadd.i16 d18, d16, d17
vadd.i16 d17, d19, d20
fldd d19, [sp, #16]
fldd d20, [sp, #40]
vadd.i16 d16, d19, d20
fstd d18, [sp, #0]
fstd d17, [sp, #8]
fstd d16, [sp, #16]
vldmia sp, {d16-d18}
vst3.16 {d16-d18}, [r0]
add sp, sp, #48
bx lr
After the patch we get:
vld3.16 {d24-d26}, [r0]
add r3, r0, #24
vld3.16 {d20-d22}, [r3]
vmov q8, q12 @ ti
vadd.i16 d17, d17, d21
vadd.i16 d16, d24, d20
vadd.i16 d18, d26, d22
vst3.16 {d16-d18}, [r0]
bx lr
The VMOV is a bit disappointing, and needs further investigation.
The first hunk fixes (2), and I think is correct. The second hunk
hacks around (1), and isn't suitable in itself. I'll next try to make
arm_neon.h use built-in record types that are explicitly EImode,
which should remove the need to change MAX_FIXED_MODE_SIZE.
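The size arithmetic behind (1) can be checked on any host with a rough sketch like the one below. The model_* structs are hypothetical stand-ins that only mimic the layout of the arm_neon.h aggregates (arrays of 16-bit lanes with no padding), and the constants assume a 64-bit default limit, a 24-byte EImode and a 64-byte XImode. It models the size check only, not how GCC actually assigns modes.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side stand-ins for arm_neon.h aggregates (assumed layout only). */
typedef struct { uint16_t val[3][4]; } model_uint16x4x3_t; /* 3 x 64-bit vectors */
typedef struct { uint16_t val[4][8]; } model_uint16x8x4_t; /* 4 x 128-bit vectors */

/* Sizes in bits of the limits discussed above (assumed values). */
enum {
    DEFAULT_LIMIT_BITS = 64,  /* default limit for record types */
    EIMODE_BITS = 192,        /* 24-byte integer mode */
    XIMODE_BITS = 512         /* 64-byte integer mode */
};

/* Convert a size in bytes to a size in bits. */
static size_t bits_of(size_t bytes)
{
    return bytes * CHAR_BIT;
}

/* Nonzero if an aggregate of BYTES bytes is no larger than LIMIT_BITS. */
static int fits_fixed_mode(size_t bytes, size_t limit_bits)
{
    return bits_of(bytes) <= limit_bits;
}
```

Under these assumptions the three-vector struct is 192 bits, which exceeds the 64-bit limit but matches EImode, while the largest four-vector struct needs the full 512 bits of XImode.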
Richard
Index: gcc/gcc/config/arm/arm.h
===================================================================
--- gcc.orig/gcc/config/arm/arm.h
+++ gcc/gcc/config/arm/arm.h
@@ -1171,10 +1171,12 @@ enum reg_class
/* FPA registers can't do subreg as all values are reformatted to internal
precision. VFP registers may only be accessed in the mode they
were set. */
-#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS) \
- (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO) \
- ? reg_classes_intersect_p (FPA_REGS, (CLASS)) \
- || reg_classes_intersect_p (VFP_REGS, (CLASS)) \
+#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS) \
+ (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO) \
+ ? (reg_classes_intersect_p (FPA_REGS, (CLASS)) \
+ || (TARGET_VFP \
+ && reg_classes_intersect_p (VFP_REGS, (CLASS)) \
+ && arm_fpu_desc->rev == 1)) \
: 0)
/* The class value for index registers, and the one for base regs. */
@@ -2458,4 +2460,6 @@ enum arm_builtins
instruction. */
#define MAX_LDM_STM_OPS 4
+#define MAX_FIXED_MODE_SIZE GET_MODE_BITSIZE (XImode)
+
#endif /* ! GCC_ARM_H */
Hi there. We've had an earthquake. Family and friends are fine, but I'll be
unavailable for a few days. Services on ex.seabright.co.nz are down. I'll
cancel Wednesday's standup call.
See you soon,
-- Michael
== GDB ==
* Working with Will Deacon, identified root cause of GDB
problems running on Versatile Express in SMP mode, and
verified that Errata workaround fixes the problem
* Finished testing GDB HW watchpoints patch on vexpress,
submitted complete patch set for mainline inclusion
* Reviewed Yao's mainline patch to enable displaced
stepping in Thumb mode
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
== Last week ==
* PR46178, PR46002: both upstream issues related to the priority
coloring mode of IRA. Both patches submitted, the first already approved
and committed. Vladimir M. did mention that the priority algorithm
would be removed once his newer "cover class-less" patches go in
during stage1. Anyways, I got more familiar with IRA during the process,
and the patches will still be applicable to 4.5/4.6.
* PR43872: incorrectly aligned VLAs under ARM. This turned out to be a
one-liner fix. Submitted upstream, awaiting approval.
* Discussed on email/IRC with Revital Eres on SMS and ARM doloop pattern
issues.
* Launchpad #721021: Linaro GCC ICE under -mtune=xscale. Investigated a
bit; did not see ICE immediately, but GCC went into infinite loop (Khem
Raj, the reporter, says it runs for a while then ICEs).
* Coremark ARMv5TE vs ARMv7-A performance regression: reproduced
consistently using our own Tegra boards. Investigated and seem to have
found something, will post more detailed findings later.
== This week ==
* Coremark investigation.
* More GCC issues.
== GCC ==
Posted 2 of our 4.5 patches upstream.
My latest 4.6 build and test completed, so I've pushed an update to the
bzr branch. The branch is now up to mainline state as of the 12th.
Merged three 4.5 patches into Linaro GCC 4.6. Upstream review isn't
happening, so I've decided to commit them anyway. The last upload (FSF
mainline as of 12th Feb) will therefore become the baseline I'm going to
use for Linaro GCC 4.6.
Began benchmarking the questionable patches before forward porting them,
using EEMBC. Michael Hope has given me access to one of his A9 Panda
boards in New Zealand. This ought to have been straight-forward, but of
course it wasn't. It took me a while to convince myself I was getting
meaningful results and testing the right thing. Also the A9 seemed to be
able to complete the configured iterations in 'zero' time, which fooled
me for a while. I think I now have a setup that works. It seems to run
very slowly sometimes though - something to do with SSH?
----
Upstream patches requiring review:
* Thumb2 constants:
http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html
* Kazu's VFP testcases:
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00128.html
* Jie's thumb2 testcase fix:
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00670.html
* ARM EABI half-precision functions
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00874.html
* ARM Thumb2 Spill Likely tweak
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00880.html
RAG:
Red:
Amber:
Green: DATE/QEMU conference place confirmed, travel booked
Current Milestones:
                             | Planned    | Estimate   | Actual     |
qemu-linaro 2011-03          | 2011-03-08 | 2011-03-08 |            |
Historical Milestones:
finish virtio-system         | 2010-08-27 | postponed  |            |
finish testing PCI patches   | 2010-10-01 | 2010-10-22 | 2010-10-18 |
successful ARM qemu pull req | 2010-12-16 | 2010-12-16 | 2010-12-16 |
finish qemu-cont-integration | 2011-01-25 | 2011-01-25 | handed off |
first qemu-linaro release    | 2011-02-08 | 2011-02-08 | 2011-02-08 |
* maintain-beagle-models:
+ implemented missing epoll syscalls for qemu usermode,
submitted upstream
https://bugs.launchpad.net/qemu-linaro/+bug/644961
+ tracked down the problem causing serial console to break:
the new Linux driver uses some extra features of the UART
which we weren't modelling
https://bugs.launchpad.net/qemu-linaro/+bug/714600
* merge-correctness-fixes:
+ reworked VZIP/VUZP patch as per review comments, resubmitted
+ reviewed CL's latest shift patches, added fixes of my own for
large shift counts and overlapping src/dest regs, submitted
a 10 patch rolled up series
+ reviewed a patch for adding cp15 VA-PA translation ops
+ reviewed various versions of vrecpe/vsqrte patches from CL
* versatile-express model:
B Labs kindly made available their Versatile Express board model:
https://github.com/bbalban/qemu/commits/universal-branch
and I've spent a few days getting it to boot a Linaro kernel,
fixing a few bugs and cleaning up the patchset in preparation
for upstreaming it.
This included discovering a bug in qemu's SD card model which
was causing Linux not to be able to detect cards on PL181,
and resulting in spurious qemu warnings on omap3:
https://bugs.launchpad.net/qemu-linaro/+bug/714606
* other:
+ ARM architecture Q&A for modelling engineers
+ booked travel/hotel for QEMU conference
* meetings: toolchain, PDSW-tools, PD comms, Linaro-in-ARM network
infrastructure, pdsw-doughnuts and 1st birthday celebration
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
17/18 March: QEMU Users Forum, Grenoble
Holiday: 22 Apr - 2 May
9-13 May: UDS, Budapest
(maybe) ~17-19 August: QEMU/KVM strand at LinuxCon NA, Vancouver