RAG:
Red:
Amber:
Green:
Current Milestones:
| Planned | Estimate | Actual |
finish qemu-cont-integration | 2010-01-25 | 2010-01-25 | handed off |
first qemu-linaro release | 2011-01-11 | 2011-01-11 | |
Historical Milestones:
finish virtio-system | 2010-08-27 | postponed | |
finish testing PCI patches | 2010-10-01 | 2010-10-22 | 2010-10-18 |
successful ARM qemu pull req | 2010-12-16 | 2010-12-16 | 2010-12-16 |
* submitted meego patch upstream to print instruction
boundaries in TCG op-level debug dumps:
http://patchwork.ozlabs.org/patch/79298/
* reviewed a couple of patches by Christophe Lyon from ST
which fix some problems with VMULL and friends
(together with a qemu-meego patch in the same area)
* lots of rebasing work to get a qemu-linaro tree into
shape. Mostly putting together my exploded set of
ARM patches from the meego tree with a rebased set
of omap patches. I have something which I think is about
right but I need to test it.
* patches submitted to qemu upstream:
+ add -version option to qemu-user:
http://patchwork.ozlabs.org/patch/79635/
+ fix writing default vector address in PL190:
http://patchwork.ozlabs.org/patch/79712/
+ print instruction boundaries in TCG op-level debug dumps:
http://patchwork.ozlabs.org/patch/79298/
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
17/18 March: QEMU Users Forum, Grenoble
Holiday: 22 Apr - 2 May
I had a few spare cycles this afternoon on my A9 board so I got it to
build and run 72 variants of CoreMark with the 2011.01 compiler. The
results are here:
https://wiki.linaro.org/MichaelHope/Sandbox/CoreMark1
I haven't looked deeply into it and the usual disclaimer about
benchmarks applies, but there's some interesting stuff there. The ARM
to Thumb-2 drop is more than expected. Tuning for Cortex-A9 generally
makes things worse.
-- Michael
Hi,
One of the things that came up in yesterday's chat was about ARM vs
Thumb2. If some folks are interested in a high level overview that
doesn't go into too many details with respect to the ISA that I know a
compiler writer will be interested in you could read this presentation
unless you've seen it. At the minute the canonical way of figuring out
more about Thumb2 is really to read the ARM-ARM.
https://wiki.ubuntu.com/Specs/M/ARMGeneralArchitectureOverview?action=Attac…
This was a presentation that Dave Rusling gave at the UDS in Belgium
last year, it roughly covers the architecture at a high level and
talks about Thumb2 vs ARM , why Thumb2 , the evolution of the ARM
architecture etc. It's not a lot of detail but quite a high level overview.
HTH
cheers
Ramana
Hi,
* finished SLP for reduction patch. The loop in DenBench that needs
this feature also requires support of load permutation. I am
considering to implement that too. I looked for other occasions that
need this feature, but only found loops that are not vectorizable. So,
I am not sure I'll proceed in this direction.
* looked into extract_even/odd and interleave_high/low implementation
on ARM as a backup plan for the case we don't have special load/store
support on time for the next release. Even though NEON VZIP and VUZP
instructions can perform both even and odd (and high and low)
computations simultaneously, I don't see how we can express that at
the tree level.
* looking into ffmpeg
* non-Linaro issues
Ira
Hi ,
So rather than manually backporting a patch from upstream "trunk" I
decided to try using bzr for this yesterday. Even though this was a
slow process and thanks to some help from Michael on IRC last night I
think I got it right finally. However I can't get bzr visualize to
show me the revision history correctly with the merge from the
upstream repository.
So what I did was to do the following
bzr branch 4.5 new-branch
cd new-branch
bzr merge -c XXX lp:gcc where XXX is the bzr revision number of the
svn commit that I'm tracking upstream.
gcc/Changelog merge fails - there is a conflict. Expected because you
are merging trunk's gcc/Changelog into a version of the Changelog file
in our tree
Thus I reverted changes that were brought in by copying in a backup of
gcc/Changelog that I had made before the merge
bzr resolved gcc/Changelog
edit Changelog.linaro
Add changelog entry
bzr commit
bzr push lp:gcc private repository
Create merge request upstream.
Does this look sensible to people or do folks follow other recipes?
cheers
Ramana
Hi there. I've cancelled Monday's meeting as the CodeSourcery people
are at their annual meeting and others are travelling back from the
sprint. Please come to the Wednesday stand up call if you are
available.
-- Michael
Got a complete run of SPEC Train on Orion board - all working.
Kicked off a SPEC ref run - canis1 died.
Gathered a full set of 'perf record's for all of SPEC on silverbell;
and had a quick look through them;
there aren't too many surprises; a few things that might be worth a
look at though.
(Not as much using libc functions as I hoped).
There are some odd bits - chunks of samples landing apparently outside libraries
that aren't obvious what's going on.
Sent tentative patch for Thumb perf annotate issue (bug 677547) to
lkml for comments.
Started on libffi variadic fixing.
Caught the qemu pbx-a9 testing from PM; got qemu built and getting a
handful of lines of output
both from a kernel from arm.com's site and a linaro-2.6.37 that I built for it.
Dave
RAG:
Red:
Amber:
Green: issues I wanted to nail at this sprint all handled;
a couple of blueprints handed off to other people
Milestones:
| Planned | Estimate | Actual |
finish virtio-system | 2010-08-27 | postponed | |
finish testing PCI patches | 2010-10-01 | 2010-10-22 | 2010-10-18 |
successful ARM qemu pull req | 2010-12-16 | 2010-12-16 | 2010-12-16 |
finish qemu-cont-integration | 2010-01-25 | 2010-01-25 | |
At the Linaro/Ubuntu sprint/rally in Dallas this week
* qemu-continuous-integration
** spoke to Paul Larson (in Linaro Validation team) and
handed this blueprint off to him (with a clarification
of the requirements from qemu's point of view)
* maintain-beagle-models
** we've agreed to start doing official "Linaro Qemu"
releases which are essentially going to be the meego
tree plus point fixes for things. These will be every
month with the rest of the toolchain group releases.
Ubuntu will also take these releases to replace the
current use of the qemu-kvm tree to provide ARM models
* verify-a9-pbx-support
** this blueprint has been handed off to David Gilbert
* merge-correctness-fixes
** wrote and posted patches:
*** to restore IT bits after unexpected exceptions
(including fix of Linux usermode bug where it wasn't
clearing the IT bits when entering a signal handler)
*** to include opcode hex in disassembly of ARM insns
*** fixing a compile failure for a previous change when
the host linux system didn't have linux/fiemap.h
** I have managed to split the huge "lots of ARM TCG fixes"
commit in the meego tree up into 65 more self-contained
commits. These now need reordering and possibly some
may be recombined.
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
2011: Holiday 21 Jan, 22 Apr - 2 May.
Hi there. Unfortunately the Linaro GCC 4.5 2011.01-0 release has a
failure in the x86_64 compiler causing it to fail during the initial
build. We're working on triaging the problem at the moment.
ARM and i386 targets are not affected. I'll send a new announcement
when the problem has been fixed and the replacement 2011.01-1 release
is available.
-- Michael
The Linaro Toolchain Working Group is pleased to announce the release
of both Linaro GCC 4.4 and Linaro GCC 4.5.
The Linaro Toolchain Working Group is pleased to announce the release of
both Linaro GCC 4.4 and Linaro GCC 4.5.
Linaro GCC 4.4 is the sixth release in the 4.4 series. Based off the
latest GCC 4.4.5, it is a maintenance release that fixes one problem
found through use.
Linaro GCC 4.5 is the sixth release in the 4.5 series. Based off the
official GCC 4.5.2 release, it includes many ARM-focused performance
improvements and bug fixes.
Interesting changes include:
* Improved optimization of multiple load instructions, and
multiply-and-accumulate.
* -fshrink-wrap optimization for better use of function prologues and
epilogues.
* plus, various other bug fixes, and minor improvements.
The source tarballs are available from:
https://launchpad.net/gcc-linaro/+milestone/4.5-2011.01-0
and
https://launchpad.net/gcc-linaro/+milestone/4.4-2011.01-0
Downloads are available from the Linaro GCC page on Launchpad:
https://launchpad.net/gcc-linaro
Hi,
* Continued with testing and implementation of reduction support in SLP
* Found a major problem in vectorization of if-converted data
accesses. Looked into other ways to solve the problem.
* Spent some time on non-Linaro vectorization plans
* Unsuccessfully tried to make the board work
Ira
== Last Week ==
* Displaced stepping support for 32-bit Thumb insns are done, unless
new bugs are found.
Decode writeback for str/ldr in Thumb 32-bit. Fix bugs in supporting
LDR/STR/LDRH/STRH in both ARM and Thumb.
Test GDB to handle various 32-bit Thumb insns in displaced stepping.
Code cleanup and write some new test cases.
* One day New Year holiday on Jan. 3rd.
== This Week ==
* Linaro Sprint.
* Get patches for Linaro GDB bugs approved on upstreams as many as
possible.
--
Yao Qi
Hello,
While testing SMS on Crotex-A9 I see that the latency of load instruction
is 1
cycle when compiling with -mcpu=cortex-a9 -mthumb -mtune=cortex-a9 -O3.
Below is a snippet from the SMS dump file showing the DDG, created for the
loop in foo function, which depicts the edge between the load of input[i]
(insn 181) and the mult instruction (insn 184).
[181 -(T,1,0)-> 184] is the true dependence edge created between the
two insns; with latency of 1.
On Crotex-A8 the latency of the load is 3 as expected.
I've read in crotex-a9.md file that loads should have a latency of 4 cycles
so I just wanted to check if I should have used other combination of flags
for Crotex-A9 or the load latency should indeed be of 1 cycle here.
Thanks,
Revital
int foo (int max, signed short *input, int y)
{
int i, accum;
for (i = 0; i < max; i++) {
accum += (signed int) input[i] * (signed int) input[i+y];
}
return accum;
}
The snippet from the DDG:
Node num: 2
(insn 181 178 184 13 (set (reg:SI 216 [ D.2019 ])
(zero_extend:SI (mem:HI (plus:SI (reg:SI 319 [ ivtmp.34 ])
(reg:SI 345)) [2 MEM[base: D.2076_257, index:
D.2079_226, offset: 0B]+0 S2 A16]))) tmp.c:7 714
{*thumb2_zero_extendhisi2_v6}
(nil))
OUT ARCS: [181 -(A,0,1)-> 176] [181 -(T,1,0)-> 184]
IN ARCS: [184 -(A,0,1)-> 181] [176 -(T,1,0)-> 181]
Node num: 3
(insn 184 181 234 13 (set (reg/v:SI 209 [ accum ])
(plus:SI (mult:SI (sign_extend:SI (subreg/s/u:HI (reg:SI 212
[ D.2013 ]) 0))
(sign_extend:SI (subreg/s/u:HI (reg:SI 216 [ D.2019 ]) 0)))
(reg/v:SI 209 [ accum ]))) tmp.c:7 64 {maddhisi4}
(expr_list:REG_DEAD (reg:SI 216 [ D.2019 ])
(expr_list:REG_DEAD (reg:SI 212 [ D.2013 ])
(nil))))
== Last Week ==
* Worked to improve libunwind test suite results using new ARM-specific
unwinding. Managed to reduce the number of failures, but some tests
still fail.
* Tested latest ltrace upstream tree and determined that It Is Good.
Hopefully, this marks the cusp of a new release that includes my work.
== This Week ==
* Finish up work with libunwind. Update wiki brain dump.
--
Zach Welch
CodeSourcery
zwelch(a)codesourcery.com
(650) 331-3385 x743
== Last week ==
* Some discussion about libffi variadic function support. The current
proposed design is an additional API function to prepare a new call
interface (CIF) structure with the argument number settings for each
variadic call site. This IMHO, is slightly less flexible that doing a
call-time placement style interface, but should be faster and easier to
implement.
* Collecting a few cases of ARM/Thumb-2 integration in the GCC ARM backend.
* Prepared and traveled to Dallas for the Linaro sprint.
== This week ==
* The Linaro sprint.
== Linaro GCC ==
* Continued hacking on (cleaning up, testing) NEON
element/structure load/store (etc.) intrinsics-improvement patch. This
now passes all the auto-generated neon.exp tests (admittedly only a
very weak test for correctness), and the small number of hand-written
tests I threw at it, so hopefully we can declare victory soon. (This
patch provides huge improvements to generated code for the "fancy" NEON
load/store intrinsics and vtbl/vtbx lookup instructions, and hopefully
also provides part of the solution for allowing the vectorizer to use
the fancy loads/stores also.)
== Misc ==
* Public holiday Monday.
== GCC ==
Pulled down new commits from upstream GCC. My test build failed due to a
new cross-build configure problem. Found the problem in GCC Bugzilla,
and rolled back to the revision before the problem one. Those sources
built ok, so I've pushed the changes to Linaro GCC 4.6 branch. It it now
updated to 31st December.
Discussed cross-build problems with Alexander Sack on IRC.
Merged, tested and pushed all the outstanding Launchpad merge requests
into GCC 4.4 and 4.5.
Merged, tested and pushed all new CS SG++ patches into Linaro GCC 4.5.
Merged, tested and pushed GCC 4.5.2 from upstream into Linaro 4.5.
Spun releases for both Linaro GCC 4.4 and 4.5.
Brought the Linaro patch tracker up to date.
== Other ==
Catch up with lots of email following my two week holiday.
Updated some of the CS patch tracker.
Travel to Dallas for the Linaro sprint.
-------
Next week (10th-14th Jan):
Linaro sprint in Dallas
Following week (17th-21st Jan):
CS annual meeting in Phoenix
Got h264ref working in SPEC; this was another signed-char issue (very
difficult one to find - it didn't crash, it just came out with subtly
the wrong result);
compiling that and sphinx3 with -fsigned-char and they seem happy.
With Richard's fix for gromacs that leaves just the zeusmp binary
that's too large to run on silverberry; it seems to startup on canis1
(that has more RAM).
So that should be a full set; I just need to get the fortran stuff
going on canis1.
Kicked off discussion on libffi-discuss about variadic calls; people
seem OK with the idea of adding it (although unsure exactly how many
things really use it
- even though there are examples in Python documentation). Also
kicked off some of the internal paperwork to contribute code for it.
Dave
== This week ==
* Away Monday, and a fair bit of time on non-Linaro duties.
* Looked at Dave's gromacs bug (693502). Turned out to be a reload
inheritance problem. Tested a patch. Spent some time coming up with
a brute-force testcase that I can submit with the patch.
* Found a bug in the x86 and x86_64 ifunc support that would affect
ARM too if we weren't careful. Came up with an example testcase
and filed the bug upstream:
http://sourceware.org/bugzilla/show_bug.cgi?id=12366
It has been fixed by H.J. Lu. I'm wondering about taking a
slightly different approach for ARM.
* More ifunc work.
* Tried again to reproduce the chrome failure, making sure to use
DEB_BUILD_HARDENING=1. (I hadn't realised first time round that,
in reaction to this bug, debian/rules specifically excluded armel
from the automatic DEB_BUILD_HARDENING=1 setting.) The build
takes a couple of days on my BeagleBoard.
Heh, and just as I wrote that, the build failed with the reported
link error. Neat.
== Next week ==
* Finish testing the patch for 693502 and submit it upstream.
Backport the patch to our tree once accepted.
* Look at the chromium problem.
* More ifunc.
Richard
RAG:
Red:
Amber:
Green: qemu git pull request accepted, patches seem to be flowing
into qemu upstream more freely now
Milestones:
| Planned | Estimate | Actual |
finish virtio-system | 2010-08-27 | postponed | |
finish testing PCI patches | 2010-10-01 | 2010-10-22 | 2010-10-18 |
successful ARM qemu pull req | 2010-12-16 | 2010-12-16 | 2010-12-16 |
finish qemu-cont-integration | 2010-01-25 | 2010-01-25 | |
Bonus extended holiday edition:
This report includes a number of things that happened over
the Christmas holidays as well as this week (which is
a short one, only 3 days).
Progress:
* merge-correctness-fixes:
** my git pull request for various ARM qemu patches was merged!
** a number of other patches were merged to qemu master:
+ implement correct NaN propagation rules
+ rename softfloat float*_is_nan() functions
+ fix UMAAL (Aurelien's patch, reviewed by me)
+ VQSHL (reg) patchset
** diagnosed the segfault in
https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/604872
and wrote a patchset which fixes it (and related problems):
http://patchwork.ozlabs.org/patch/77887/ (n/7)
** wrote and posted a patchset which implements save/restore
for the versatile platform (so I didn't have to wait 10
minutes for the test case to reach the segfault)
http://patchwork.ozlabs.org/patch/76529/
** finished and posted a patchset implementing flushing of
denormals to zero on input:
http://patchwork.ozlabs.org/patch/77798/ (n/3), now committed
** reviewed/tested Aurelien's SMMLA/SMMLS patch (now committed)
** wrote and posted patches which implement the FS_IOC_FIEMAP
ioctl (http://patchwork.ozlabs.org/patch/77725/) and
the file_sync_range{,2} syscalls
(http://patchwork.ozlabs.org/patch/77723/) -- these are used
by apt, and so linaro-media-create was generating a lot of
warnings from qemu about their lack of implementation
** posted patch to clean up NaN handling in linux-user NWFPE
emulation (follow-on from earlier NaN cleanups):
http://patchwork.ozlabs.org/patch/77795/
* maintain-beagle-models:
** implemented minimal ARM cp14 debug registers
and a random register in the TWL4030 emulation (both needed
to get recent Linaro kernels to boot on the beagle model)
Submitted merge request upstream:
http://meego.gitorious.org/qemu-maemo/qemu/merge_requests/3
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Meetings: toolchain standup, pdsw doughnuts
Absences:
2011: Dallas Linaro sprint 9-15 Jan. Holiday 21 Jan, 22 Apr - 2 May.
Hi,
* implemented reduction support in SLP, I'll check if it helps
DenBench next week
* helping Sebastian Pop with if-conversion for vectorization
improvements (BTW, Sebastian's goal is to vectorize kernels from
ffmpeg)
* fixed GCC PR47139
Ira
Hi All,
Thanks for attending the call. I think we had some interesting discussions.
I've posted the minutes from the call on the same page as before:
https://wiki.linaro.org/AndrewStubbs/Sandbox/GCCoptimizations
I'll try to get the audio posted somewhere for anybody that's interested.
Andrew
You may have noticed that I have created a new BZR/Launchpad branch for
Linaro GCC 4.6:
lp:gcc-linaro/4.6
https://code.launchpad.net/~linaro-toolchain-dev/gcc-linaro/4.6
Up until now, this has not been buildable due to unfixed bugs. However,
upstream GCC have now straightened out the problems, so I have pushed a
buildable version into the branch.
I shall attempt to keep this branch as up-to-date as I can (at least, I
will once the holiday season and January travel are over), but I'll only
push updates if they build for me, so hopefully the branch should remain
fairly stable, at least for our purposes.
Note that so far I've only tested build-ability. Right now I'm not
making any promises about the quality of the compiler.
At some point, we'll want to use this branch to hold our own patches
(both those that will never go upstream, and those that are queued for
GCC 4.7), so it will diverge from upstream 4.6 a bit. For the moment,
it's merely a mirror.
Andrew
== Last Week ==
* Got a new ARM-specific unwind test case working, so its integration
with libunwind stands a marginally better chance of working (once it's
finally finished).
* Sent a ping to binutils mailing list about a patch to improve readelf
that I had sent in at the beginning of December. Still no response.
* Holidays and Vacation
== This Week ==
* Try to finish libunwind integration, as my time with Linaro is nearly
over. Update documentation on wiki to reflect current status.
* Ping ltrace list about a new release. It seemed so close, then the
list went abruptly silent.
--
Zach Welch
CodeSourcery
zwelch(a)codesourcery.com
(650) 331-3385 x743
== Linaro GCC ==
* Continued looking at element/structure load/store intrinsics
improvements. Some good initial results: it looks like the plan of
using "extra-wide" vectors for returning struct results works fine (at
least to a first approximation). Sent off WIP patch (internally to CS
only, so far).
Incidentally this looks like it'll be a good stepping-stone for the
"RTL half" (vs. the "tree half") of the representation of
element/structure loads/stores (e.g. vld2/vst2) also. The return type
(for loads) and argument type (for stores) of the RTL patterns for such
instructions is changed from the current wide-integer representation
(OImode, etc.) to a suitable wide vector instead (e.g. V16QImode). This
change might help lead to a more meaningful mapping from an equivalent
tree form -- though we haven't quite got the whole picture yet, as the
middle-end won't want to know about the ARM-specific and
non-standard-named patterns for the element/structure loads/stores.
(One might imagine a new standard-named RTL expander taking care of that
though.)
== Vacation ==
* Vacation Dec 20th-Jan 4th.
== GCC issues ==
* PR44557, Thumb-1 ICE, looked at the ARM specific secondary reload
parts, as well as some general reload internals context. Concluded that
concerns on Thumb-2 about my submitted patch should be unneeded, as the
reload_in/out patterns should never be used for Thumb-2. Also looked a
bit on how we should upgrade ARM to use TARGET_SECONDARY_RELOAD.
* PR45416, ARM code regression. Started working on this again, cleaning
up patch to submit.
* Had some email discussion with Revital Eres on Swing Modulo Scheduling
(SMS) for ARM issues, mainly on how the doloop_end pattern should be
done on ARM.
* Submitted and committed an obvious small patch for a VFP testsuite case.
== This week ==
* Continue on GCC issues.
* Flying to Dallas on Sunday, prepare for trip.
Hi,
* continued with my attempts to vectorize Viterbi:
- finished implementation of conditional store sinking in cselim
pass (I did only limited testing).
- reconsidered the idea of safe load if-conversion if an adjacent
field of the same structure is accessed unconditionally - this may be
incorrect. Instead I tried the last, not yet committed, patch by
Sebastian Pop that implements if-conversion for such cases of not-safe
data accesses. His patch if-converts the loop in Viterbi, however, it
also makes the loop not vectorizable - additional work should be done
in the data-refs analysis and the vectorizer to make it work.
Sebastian is working on the first part, and I'll help him with the
vectorizer part if necessary.
* analyzed EEMBC DenBench, couldn't find any action items for now. But
vld/vst support of strided data accesses should be very useful for
these benchmarks.
* fixed GCC PR testsuite/47057
* looking into SLP of reduction as in PR 41881. I saw similar patterns
several times in DenBench, but I'm not sure that SLP of reduction is
enough to vectorize all of these cases.
Happy New Year,
Ira
== Last Week ==
* Continue with libunwind. Wrote a new unit test for ARM-specific
unwinding code to help debug that new code's problems. Almost got it
working, which I hope means its integration with libunwind may be
nearing completion.
== This Week ==
* Try to finish ARM-specific improvements to libunwind. Famous Last Words.
--
Zach Welch
CodeSourcery
zwelch(a)codesourcery.com
(650) 331-3385 x743
== GCC related ==
* Launchpad #693686, GCC ARM segfault ICE when building Chromium in V8.
Spent some time reproducing; this ICE seems to be in the maverick
gcc-4.5, at the vectorizer phase. As the ICE happens in
tree-vect-stmts.c:supportable_widening_operation(), I'm suspecting
(without further verification yet) this might be due to vmovn not
backported? (Linaro 4.5 does has this ported I think)
* PR44557, Thumb-1 ICE. Looking further after seeing Richard Earnshaw's
comment on my patch. It would be nice if we could upgrade the entire
secondary reload bits, looking into this.
== This week ==
* Look into more GCC issues.
* Get some backports done.
== Linaro GDB ==
* LP:615972
Get patch approved upstreams. Committed to FSF tree. Propose merge
request to Linaro GDB tree.
* LP:616003 gdb.mi/mi-var-display.exp failure
Discussed in upstreams on how to handle fp in ARM/Thumb mode. Finally
work out a one-line patch. Approved and committed to FSF tree. Propose
merge request to Linaro GDB tree.
Draft another patch to clean up ARM register alias. Pending on upstreams.
* LP:616000 Handle -fstack-protector prologue code
Revise patch per Joel's comments. Approved, and committed to FSF tree.
Draft two patches to handle -fstack-protector prologue code on i386.
Sent them out for review. Due to lack of knowledge on i386 prologue
generate, not very confident on one of these patches.
* LP:615980 Support displaced stepping on Thumb
Get my test case to arm displaced stepping approved, and committed to
FSF tree.
A patch about supporting displace ARM insn in Thumb area is pending
upstreams. Tried the 2nd approach since the 1st approach is not
acceptable to upstreams reviewers. Without this patch, ARM displaced
stepping doesn't work on Linaro.
Support another three PC-related 16-bit Thumb insns (adr, ldr, and
cbz), and add test cases for them accordingly.
Spend some time splitting my big patch into three relatively small
patches in order to make them easier to be reviewed. Patches on
supporting Thumb 16-bit displaced stepping are sent out upstreams for
review.
== This Week ==
* Work from Mon. to Wed.
** Backport some approved upstreams patches to Linaro GDB
** Anything I should do for my pending patches.
* Vacation on Thu. and Fri. 3rd Jan. is China public holiday. Back to
work on 4th. Jan.
--
Yao (齐尧)
Khem Raj <raj.khem(a)gmail.com> wrote:
> The bug http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46883 files
> against GCC trunk also happens with linaro gcc 4.5
> My guess is that there is a backported patch from trunk into linaro
> 4.5 tree thats causing this ICE
>
> This ICE does not happen on upstream gcc-4.5 branch
Thanks for the bug report!
> I havent figured out the commit yet.
It looks like the regression was introduced by Bernd Schmidt's
patch to improve zero-/sign-extensions (PR 42172), which we
did indeed backport to Linaro GCC 4.5. (I've updated the
PR 46883 bugzilla with more details.)
> Should you need a bug in linaro
> bug tracker I will be happy to file one
Yes, please do so; this makes it easier to track the problem
on the Linaro side. Thanks!
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
== GCC ==
* Checked in mainline fix for #617384 and submitted backport merge
requests (.debug_line is wrong with -fpic)
* Submitted backport merge requests for the fix for #662324
(Pointer type information lost in 4.5 debuginfo)
* Checked in mainline fix for #693425 and submitted backport merge
request (SPU back-end incompatible with extension elimination pass)
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
Hi,
I was on vacation on Sunday and starting from Tuesday stayed home with
a sick child, so I only had a couple of days to work.
* vectorization of Viterbi:
- continued implementing conditional store sinking in cselim pass
- made if-conversion to work on loads of structure fields if other
field from the same structure is accessed unconditionally
* fixed GCC PR 47001.
Ira
Continued looking at SPEC 2006.
The two ICEs I mentioned last week are gone on the Natty version of the
compiler, however the 4 programs that run and give the wrong
results still happen with the Natty version and the latest version from bzr.
The 4 failures are:
h264ref - still fails on bzr 99447 with -O2 or -O0
sphinx3 - still fails on bzr 99447 with -O2 or -O0
gromacs - still fails on bzr 99447 with -O2 but works with -O1; I've
followed this through and detailed it in bug 693502; it looks to me like
a post-increment gone wrong (it's split so it's not
actually a post increment and the original rather than post inc'd value gets
used)
zeusmp - this fails to load the binary; it's got a >1GB bss section.
Interestingly it gets further on my beagle with less memory but a bit of
swap,
even though I think it's not really using all of the BSS
in the config I'm using.
I'm hoping to leave a 'ref' run going over the new year.
The canis1 Orion board I was also running Spec on last weekend died during
the run and hasn't come back.
perf
We now have silverberry using the -proposed kernel which has the fixed
PERF_EVENT config, and perf seems to work fine.
libffi
I've started building the page
https://wiki.linaro.org/WorkingGroups/ToolChain/FFIusers listing things
that use FFI; (generated by a bit of apt wrangling).
There are basically 3 sets:
a) Apps that just use ffi for something specific
b) Languages that then let the users of those languages have varying
degrees of freedom in themselves
c) Haskell - While some of the packages are actually probably ffi
users, I think a lot of these are false dependencies; almost every haskell
user seems
to gain a dependency on libffi directly.
I'm back on the 4th January.
Dave
Hi,
I continued looking into EEMBC benchmarks:
- telecom fft is not vectorized because unknown number of iterations.
It has both non-constant step and its loop bound may overflow. I
think, the solution here could be loop versioning, but since
versioning increases code size, this kind of optimization can be less
beneficial.
- telecom viterbi (vectorization potential gain is 4x) requires
conditional store sinking and load hoisting to enable if-conversion. I
worked on implementation of store sinking this week.
Ira
Ulrich Weigand/Germany/IBM wrote on 12/20/2010 06:01:21 PM:
> Mark Mitchell <mark(a)codesourcery.com> wrote:
> > On 12/20/2010 8:35 AM, Ulrich Weigand wrote:
> > > Now, I guess there's two ways forward: either the outcome of the
ongoing
> > > discussions on gcc-patches is that it is in fact not a good idea to
> > > generate such sets, and the EE pass is subsequently rewritten to
avoid
> > > them; or else, if those instructions are considered valid, I'll have
to
> > > extend the SPU move expander to handle them. Thoughts?
> >
> > I haven't participated in the upstream discussion -- I'm way behind on
> > that list :-( :-( -- but I think such sets should be considered valid.
>
> OK, I'll have a look at fixing the SPU back-end then.
I've now fixed this problem in the back-end upstream:
http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01694.html
I've also created a back-port to Linaro GCC 4.5 and proposed the
branch for merge; you can find the details at:
https://bugs.launchpad.net/gcc-linaro/4.5/+bug/693425
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
== Last Week ==
* Continued working on ARM unwinding in libunwind. Produced a draft
write-up of my progress in the event that I don't finish this work
before being swapped out of Linaro.
* (Re-)submitted patches to fix ltrace test suite. Hopefully, these will
be the last changes before the new release.
== This Week ==
* Continue working on libunwind.
--
Zach Welch
CodeSourcery
zwelch(a)codesourcery.com
(650) 331-3385 x743
Hi,
We would like to build Android with the Linaro tool chain. Do any of you
know what kind of work will be needed to adapt Linaro gcc to Android?
Regards,
Patrik
== GCC related ==
* CS Issue #10201 / PR46883, unrecognizable insn ICE when compiling
Samba. Fixed this by changing the predicates of two split patterns.
Patch reviewed in CS internally and upstream, committed upstream, will
backport to SG++ and Linaro soon.
* LP:641397/PR46888: bitfield insert optimization. Andrew Pinski found a
testcase that escapes the CSE patch gets handled by combine, and also
found another bug with REG_EQUIV notes. Only looked at this minimally
last week, will really work on this later.
* LP:687406/PR46865, -save-temps creating different code. Backported and
bzr-pushed the upstream fix by Jakub Jelinek.
* PR45416, ARM code regression. Mostly can generate what I wanted by
now, under ARM and x86, although patch is still not in a submittable state.
* VFP index patch. Uncommitted GCC patch of mine from last year; added
Thumb-2 bits and corrected some things in the testcase. Committed upstream.
== This week ==
* Really get January travel stuff nailed.
* Upstream patch review is probably going to start getting
slow/suspended this week. Will probably do some study stuff on larger
projects.
* Continue to look at GCC issues.
== Linaro GDB ==
* LP:615972 Different output of 'info register' w/ and wo/ corefile.
Understands gcore impl in gdb. Two patches are reviewed in upstreams.
One is approved by Dan, and the other is still be reviewed.
Evaluate the two approaches for NEON registers in corefile. Resume the
discussion on kernel support for dumping NEON registers. Need a
decision with kernel side, but no progress on it.
* LP:685494 Revise patch per Pedro's suggestion. Waiting for someone
to approve it.
* LP:685702 Get it approved for FSF GDB 7.2 branch. Committed to both
FSF 7.2 branch and Linaro tree.
* LP:616003 gdb.mi/mi-var-display.exp failure
GDB always assumes $fp is r11, even code is in thumb mode. Current GDB
infrastructure can't handle mapping the same alias to two different
registers. Proposed a new gdbarch took for this in upstreams, in order
to increase the flexibility of GDB. No reply yet.
* LP:616001 gdb.mi/mi-var-cmd.exp failure
Ulrich pointed out it is caused by stack randomization. Confirmed this
by setting "kernel.randomize_va_space" to zero. Figure out why this
case passes on x86, because it is more restricted to turn on stack
randomization on x86.
* LP:615980 Support displaced stepping on Thumb
Understands displaced stepping in GDB/ARM. Find a bug when GDB tries to
execute ARM instruction in copy area, which is in Thumb mode (copy area
starts from "_start + 4", and it is compiled in Thumb mode in Ubuntu).
The fix is an one-line patch, which doesn't update status register when
writing PC in displaced stepping.
Write a test case for arm displaced stepping. Write code in ARM asm
directly for the first time, which is very helpful to remember ARM asm
instructions.
Read ARM ARM and decode 16-bit thumb instructions in GDB for displaced
stepping. It doesn't work so far because breakpoint instruction after
instructions in copy area is still hard-coded to ARM breakpoint insn.
== Misc ==
* Linaro GCC optimization meeting.
== This Week ==
* LP:615980 Support displaced stepping on Thumb
Send my fix and test case to upstreams for review.
Make displaced stepping work on 16-bit instruction.
* Ping other GDB patches.
--
Yao (齐尧)
Mark Mitchell wrote:
> > If Profile Guiding could spot that a particular callsite to say strlen()
> > was often associated with strings
> > of at least 'n' characters we could call a different implementation.
>
> I don't believe this is possible current profile-guided optimization,
> but certainly it could be done.
It looks to me like a case of value profiling, see tree-profile.c, for
the various "stringops" optimizations. Unless I misunderstand David's
idea here or missing something else, it seems that this kind of
optimization should fit in the existing infrastructure without too much
effort.
Ciao!
Steven
Does anyone have any experience of what can be profiled in the profiled
guided optimisations?
One of the problems with some of the string routines is that you can write
pretty neat fast routines that
work well for long strings - but most of the calls actually pass short
strings and the overhead of the
fast routine means that for most cases you are slower than you would have
been with a simple routine.
If Profile Guiding could spot that a particular callsite to say strlen() was
often associated with strings
of at least 'n' characters we could call a different implementation.
Dave
* Linaro GCC
lp:686381: C++ link failure on ARM
Reproduced the bug and posted my findings to the bug report - user error.
Changed the way the Linaro GCC version numbers are handled. Hopefully
the new system should be less distasteful to Matthias. Updated the GCC
release procedure document to match.
Organised and chaired a meeting to discuss GCC optimization
opportunities for ARM. It was well attended, and I think we had some
useful discussion. Spend quite some time preparing beforehand, and
writing it up afterwards. Next step is to come up with some actual plans
to implement something. I imagine we can discuss this at the sprint in
Dallas next month. See
https://wiki.linaro.org/AndrewStubbs/Sandbox/GCCoptimizations
* Upstream GCC
My upstream patch to fix ARM smlabb has been approved and committed to
GCC 4.6 (mainline). Only another three patches need approval now!
Continued testing upstream GCC 4.6 with both cross and native builds. It
appears to be in a buildable state now, with no extra patches required.
I've updated the Linaro GCC 4.6 branch with the buildable state.
* Other
Updated my ESTA, and added my security details to the airline bookings.
------
Future availability
20th Dec .. 3rd Jan - Vacation/Holiday
4th Jan .. 8th Jan - Business as usual
9th Jan .. 14th Jan - Linaro Sprint, Dallas
15th Jan .. 21st Jan - CodeSourcery/Mentor Annual Meeting, Scottsdale
24th Jan onwards - normal service restored!
Got SPEC2006 building on Silverbell (VExpress) and Canis1 (Orion). There
are still some issues;
The builds are still going (6 hours so far on a 1GHz A9 for a build and
'test' case), and the Silverbell one has hit an ICE on one of the tests that
looks like 635409,
and also looks like it needs some help getting Perl to work. The build on
Canis has only just started,
but hasn't got Fortran installed.
(The SPEC2006 tools build also failed in the Perl testsuite on sprintf.t and
sprintf2.t which seem to test integer
overflow cases in sprintf % fields)
Added a few of the kernel string/memory routines and bionic routines into my
string/memory graphs and
also ran the tests on the Orion board (similar to other A9 performance - no
surprise).
Wrote up a draft of an email to libffi-dev describing the varargs state; and
as I was doing it realised that
one of the ways didn't quite work and was more messy.
Using rdepends to find all packages using ffi, need to figure out if any
actually care about varargs.
Dave
== GCC ==
* Completed first successful bootstrap and regression test run
of GCC mainline on my IGEPv2 board.
* Worked on implementing fix for #617384
(.debug_line is wrong with -fpic)
* Worked on backporting fix for #662324
(Pointer type information lost in 4.5 debuginfo)
* Analyzed root cause of PR target/46883
(GCC ICE with error: unrecognizable insn)
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
Hi there. I've cancelled the weekly and standup calls for the next
two weeks. The next scheduled call is the standup call on Wednesday
the 5th of January. Please attend if you can as it's our last one
before the sprint.
See you then!
-- Michael
Hi there. The sprint is just around the corner and it's a good time
to think about how we can make best use of the week. I've put some
topics up at:
https://wiki.linaro.org/Events/2011-01-LinaroSprint/ToolChainWG
Please feel free to add to it. Have a think about anything that's
easier to do while everyone is in the same room - things like
discussions, kicking off some work, a bit of pair programming on a
problem, or anything that overlaps with another group or Ubuntu.
-- Michael
RAG:
Red:
Amber:
Green:
Milestones:
| Planned | Estimate | Actual |
finish virtio-system | 2010-08-27 | postponed | |
get valgrind into linaro PPA | 2010-09-15 | 2010-09-28 | 2010-09-28 |
complete a qemu-maemo update | 2010-09-24 | 2010-09-22 | 2010-09-22 |
finish testing PCI patches | 2010-10-01 | 2010-10-22 | 2010-10-18 |
Progress:
* merge-correctness-fixes:
** Submitted patchset upstream to fix NaN propagation to
follow ARM ARM rules rather than x87 semantics:
http://patchwork.ozlabs.org/patch/75742/http://patchwork.ozlabs.org/patch/75743/
* maintain-beagle-models:
** Finished implementation of the OMAP NAND prefetch/postwrite
engine including its DMA support. Patches submitted to the
qemu-maemo upstream tree and merged by Juha:
http://meego.gitorious.org/qemu-maemo/qemu/merge_requests/1
** Fixed the (cosmetic) bug
https://bugs.launchpad.net/qemu-maemo/+bug/622408
where we were complaining about "Unknown CMD52" when Linux probed
for the presence of SDIO cards. Fix merged into qemu-maemo:
http://meego.gitorious.org/qemu-maemo/qemu/merge_requests/2
* qemu-continuous-integration:
** Discussion with Loic about setting up jobs on his Hudson
instance for testing qemu against snapshots/hwpacks.
* packageselection-arm-n-more-stable-vm-solution-for-arm
** Discussion about Ubuntu moving to using a qemu-maemo based
qemu for ARM purposes. The Ubuntu blueprint is
https://blueprints.launchpad.net/ubuntu/+spec/packageselection-arm-n-more-s…
We need to come to agreement about what parts of this are
going to be done by variously Linaro toolchain, Linaro
foundations and Ubuntu.
** I'm going through doing another rebase-and-package of
the Linaro qemu, and finishing off writing up the notes
on the process:
https://wiki.linaro.org/WorkingGroups/ToolChain/QemuReleaseProcess
Meetings: toolchain, pdsw-tools, pdsw-tools xmas lunch :-)
Issues:
* a number of qemu patches in progress are logjammed behind
the outstanding git pull request
* the dbgsym debug packages for linaro kernels seem to have
vanished:
https://bugs.launchpad.net/linaro-images/+bug/691192
Absences: (complete to end of 2010)
Fri 17 Dec - Tue 4 Jan inclusive.
2011: Dallas Linaro sprint 9-15 Jan. Holiday 22 Apr - 2 May.
Hi,
* I've spent some time for testing the patches that allow the GCC trunk
to bootstrap again on ARM and posted the results to gcc-testresults
* finally tested and posted the patch that optimizes the __sync_*
builtins (#681138) on gcc-patches
* investigated on the state of the crash utility on ARM (or rather its
prerequisites like kexec)
https://wiki.linaro.org/KenWerner/Sandbox/crash-utility
* I'm on holiday now :)
Regards
Ken