== GCC ==
* Determined root cause of #Bug 598462 (corrupted profile info
with -O[23] -fprofile-use), identified work-around (using
-fprofile-correction), and verified on GCC 4.5 and 4.6
== GDB ==
* Worked on re-implementation of fix for #615978 (Failure to
software single-step into signal handler)
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
== GCC ==
* Internal ARM tasks that kept me busy for some time this week.
* Reworked the divmodsi4 patch based on comments. Still some test
failures that need to be investigated.
* Started working on the performance improvement plan for GCC 4.5.
* Number of patch reviews upstream.
* Looking at A9 memset issue. .
* Got an internal bug report about the 4.4 tools from someone using it
which had to do with a missing backport. This is now lp:709329.
Backported and testing.
* Backported a missing upstream patch into GCC 4.5 for bswap patterns.
* Wrote up on how to cross-test with qemu and updated that on the
Linaro wiki. Comments are welcome.
Plans:
* Find out the default number of iterations needed for steady state
with EEMBC.
* Finish testing fix for lp:709329 and get that committed.
* Finish off divmodsi4 patch and get it's testing completed and
finished.
* Submit backport for bswapsi at Os fix (PR44392) for gcc-linaro
* Finish the plan for GCC performance improvements based on what we
discussed at the
sprint.
* Away 2-1/2 days next week for an internal engineering conference.
This week
=========
- On other IBM duties until today. Now finished!
- Looked at the neon failures that Peter reported. I'm testing patches
for the ICE and the "must be a constant" error now.
- In the background, I've been trying to pin down the chromium build
failure. I can only reproduce it when running under dpkg-buildpackage:
if I run the link line manually, it works. This is reminiscent of a
problem that Dave saw elsewhere: the linker segfaulted only when run
through the normal build system.
Next week
=========
- Bernd Schmidt has committed our combined patch for #695302.
Will backport to our sources.
- Submit the neon fixes above.
- More on STT_GNU_IFUNC. There are some more tests I want to write.
Richard
Some news from the qemu mailing list that I think might be
of interest to gcc folks here:
Christophe Lyon from ST has kindly released a large
set of test cases of Neon intrinsics:
http://gitorious.org/arm-neon-tests/arm-neon-tests
(the tests themselves are more aimed at testing qemu,
so they just produce output to be compared against a
reference generated from running on hardware).
However they don't currently compile with gcc (but
are ok with armcc). From the README:
# The tests currently fail to build with GCC/ARM:
# - no support for Neon_Overflow/fpsrc register
# - ICE when compiling ref_vldX.c, ref_vldX_lane.c, ref_vstX_lane.c
# - fails to compile vst1_lane.c
# - missing include files: dspfns.h, armdsp.h
Maybe it's worth somebody having a look at this,
at least enough to find out whether the ICEs are
things we already know about or have perhaps
already fixed in linaro gcc?
thanks
-- PMM
Hi,
I am working on implementation of interleave_high/low and
extract_even/odd for NEON. The pairs of high/low (even/odd) are
"magically" united into single vzip (vuzp) instruction in the back
end, so there is no need in special support from the tree level. There
are still some test failures that I need to solve.
Ira
Hi,
I've written up a wiki page finally on the cross-testing with qemu
here. It's taken me slightly longer than expected as I was playing
around with moin-moin syntax. These are based on qemu-0.13.0 which is
what I tried when I wrote this up for some testing of a patch that I
was playing with.
https://wiki.linaro.org/WorkingGroups/ToolChain/CrossTestingQemu
Comments are welcome .
Ramana
Hi,
* I looked into the perf utility with regard to ARMv7 and raw event support
* https://wiki.linaro.org/KenWerner/Sandbox/perf
* testsuite fixes for the OpenCL GDB
* started to setup the pandaboard (currently the headless snapshot hangs
shortly after I got the bash prompt - I'm not sure what's going on here)
* On Friday I'll attend a class.
Regards
Ken
Hello,
* Submitted to mainline the patch to model Doloop for ARM
(http://gcc.gnu.org/ml/gcc-patches/2011-01/msg01718.html) .
The ARM back-end part was reviews by Richard Earnshaw.
* Looking into EEMBC/DENbench for opportunities applying Modulo-Scheduling.
Based on partial profiling information there are SMSed hot loops. My next
step is to generate complete profiling information (including gcov info)
and execute the benchmarks on ARM machine.
Thanks,
Revital
Hi Vijay,
On Sat, Jan 22, 2011 at 9:59 AM, Vijay Kilari <vijay.kilari(a)gmail.com> wrote:
> Hello Dave,
>
> Thanks for this info.
>
> I have few more queries after looking at the results of memset on A9 & A8.
> I agree that externel bus speed matters in comparision across platforms.
>
> 1) Why memset is performance is good on A8 than A9?. any justification?
I've CC'd the linaro-toolchain list who have been working on this
topic and may be able to provide you with more information.
Cheers
---Dave
Hi there. I've had a think and done a write-up on the causes behind
the 2011.01 release failure and what should be changed. See:
https://wiki.linaro.org/WorkingGroups/ToolChain/Incidents/2011.01-X86_64
The changes involve being more explicit in the release process
document and changing the continuous build to give earlier warning.
Comments are appreciated.
-- Michael
Hello,
I have a patch for ARM that I want to test and I'm not sure what's the
procedure of testing in Linaro nor to where
it should be committed. (GCC trunk? it's currently under bug fixes mode
only).
I appreciate help with that.
Thanks,
Revital
Hello,
could you please provide some comments about the state of "-Os"
(optimising for size) in the gcc 4.5.x versions of Linaro's tool
chain?
It appears there are a number of issues with recent versions of GCC
that get triggered when optimising for size, for example
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45052http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44392
Some other projects like the Linux Foundation driven Poky (resp.
Yocto project) capitulated and stopped using -Os, see for example
here:
http://thread.gmane.org/gmane.linux.embedded.poky/2311/focus=2565
On the other hand, I can see that Linaro even adds improvements for
"-Os", see for example here:
http://thread.gmane.org/gmane.linux.linaro.toolchain/367
So I wonder what the state of these problems with "-Os" is in the
Linaro tool chain? Have these issues been solved, and is "-Os"
reliably working with the Linaro tool chain?
Thanks in advance.
Best regards,
Wolfgang Denk
--
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd(a)denx.de
The optimum committee has no members.
- Norman Augustine
I assume we're moving the toolchain calls over to the
new confcall numbers -- but just to check, are we doing
so for this Wednesday's status call? The calendar
entry still has the old numbers...
-- PMM
==GCC==
Progress:
* Got invited to the Toolchain WG and was at the sprint in Dallas.
* Worked on fixing inconsistencies in the A9 scheduler. Now fixed up
upstream and submitted merge request for that.
* Introduced to other members of the Toolchain WG and had
conversations with everyone about what was going on within the
Toolchain WG.
* Spent some time trying to set up a Pandaboard that I borrowed
(Thanks Wookey) but the Panda didn't work. Turns out both the SD cards
were broken and or the versions of the boot loader / firmware were
broken.
* Worked through the patch list with Andrew trying to understand the
patches that exist in the backend for the delta between gcc-linaro and
upstream.
* Did some patch review upstream to help ease the backlog.
* Flushed some patches out from my patch queue upstream.
* Fixed the scheduler issue upstream and submitted a merge request for
it in linaro gcc-4.5 since it really fixes the A9 scheduler.
* Some ARM internal tasks with respect to transition into Linaro.
* Worked on the divmodsi4 issue and testing my patch with trunk.
Pushed my branch into launchpad if someone wants to see the code. Does
the right thing for the case we want but won't help with the SPEC2k
case really because that's a case with Os. But a good size improvement
anyhow and something that will help with divmod cases with 2 variables
rather than where a constant is used.
* Set up a natty chroot on one of our ve2 boxes which is now running a
bootstrap for all languages we are interested in and collecting test
results from trunk upstream.
* Worked through the new starter guide and still have an issue with
accessing wiki.linaro.org
Absences:
February 1-3:ARM Internal engineering conference.
== GCC ==
* Analyzed root cause of #685352 (libplymouth2_0.8.2-2ubuntu6 and
later give ragged splash and text rendering), opened mainline
bugzilla PR rtl-optimization/47299
* Submitted backport merge request for the #685352 fix
* Successfully completed profiled-bootstrap run on ARM
* Investigated profiled pyhton package build problems
== GDB ==
* Forward-ported GDB patch to support ARM hardware watchpoints
* Determined root cause of #615978 (Failure to software single-step
into signal handler), started discussion of proposed fix
* Miscellaneous Launchpad bug maintenance
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
=A9 Qemu=
I've spent most of the week looking at QEmu emulation of SMP A9. The
model (of a realview-pbx-a9) doesn't have any working block IO;
I spent some time looking at trying to get SD working and got part
way, but I fell back to using NFS root.
It seems to work OK for basic CPU emulation, and SMP 'works' in the
sense that the guest sees multiple CPUs;
however QEmu is restricted to only using one host CPU core for
multiple guest CPUs, so it's of limited
help in debugging SMP code.
Video doesn't seem to work either.
To get to that point it does need a bunch of patches to QEmu, most of
which Peter Maydell already knows of;
I've put some notes here :
https://wiki.linaro.org/Internal/People/DaveGilbert/QEMUA9SMP
Note that the realview-pbx doesn't currently have a Linaro hardware
pack; I used a kernel from ARMs website
and a 2.6.37 I built myself.
=SPEC=
SPEC ref got quite a far way through on Canis (with half-duplex
ether), however 'lbm' failed when running
in ref mode (while having worked in test and train) giving a different
output; it takes quite a long time to fail.
= Perf =
I sent an updated version of my patch for perf's thumb annotation upstream.
and apparently we have a Panda on the way.
Dave
RAG:
Red:
Amber:
Green:
Current Milestones:
| Planned | Estimate | Actual |
finish qemu-cont-integration | 2010-01-25 | 2010-01-25 | handed off |
first qemu-linaro release | 2011-01-11 | 2011-01-11 | |
Historical Milestones:
finish virtio-system | 2010-08-27 | postponed | |
finish testing PCI patches | 2010-10-01 | 2010-10-22 | 2010-10-18 |
successful ARM qemu pull req | 2010-12-16 | 2010-12-16 | 2010-12-16 |
* submitted meego patch upstream to print instruction
boundaries in TCG op-level debug dumps:
http://patchwork.ozlabs.org/patch/79298/
* reviewed a couple of patches by Christophe Lyon from ST
which fix some problems with VMULL and friends
(together with a qemu-meego patch in the same area)
* lots of rebasing work to get a qemu-linaro tree into
shape. Mostly putting together my exploded set of
ARM patches from the meego tree with a rebased set
of omap patches. I have something which I think is about
right but I need to test it.
* patches submitted to qemu upstream:
+ add -version option to qemu-user:
http://patchwork.ozlabs.org/patch/79635/
+ fix writing default vector address in PL190:
http://patchwork.ozlabs.org/patch/79712/
+ print instruction boundaries in TCG op-level debug dumps:
http://patchwork.ozlabs.org/patch/79298/
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
17/18 March: QEMU Users Forum, Grenoble
Holiday: 22 Apr - 2 May
I had a few spare cycles this afternoon on my A9 board so I got it to
build and run 72 variants of CoreMark with the 2011.01 compiler. The
results are here:
https://wiki.linaro.org/MichaelHope/Sandbox/CoreMark1
I haven't looked deeply into it and the usual disclaimer about
benchmarks applies, but there's some interesting stuff there. The ARM
to Thumb-2 drop is more than expected. Tuning for Cortex-A9 generally
makes things worse.
-- Michael
Hi,
One of the things that came up in yesterday's chat was about ARM vs
Thumb2. If some folks are interested in a high level overview that
doesn't go into too many details with respect to the ISA that I know a
compiler writer will be interested in you could read this presentation
unless you've seen it. At the minute the canonical way of figuring out
more about Thumb2 is really to read the ARM-ARM.
https://wiki.ubuntu.com/Specs/M/ARMGeneralArchitectureOverview?action=Attac…
This was a presentation that Dave Rusling gave at the UDS in Belgium
last year, it roughly covers the architecture at a high level and
talks about Thumb2 vs ARM , why Thumb2 , the evolution of the ARM
architecture etc. It's not a lot of detail but quite a high level overview.
HTH
cheers
Ramana
Hi,
* finished SLP for reduction patch. The loop in DenBench that needs
this feature also requires support of load permutation. I am
considering to implement that too. I looked for other occasions that
need this feature, but only found loops that are not vectorizable. So,
I am not sure I'll proceed in this direction.
* looked into extract_even/odd and interleave_high/low implementation
on ARM as a backup plan for the case we don't have special load/store
support on time for the next release. Even though NEON VZIP and VUZP
instructions can perform both even and odd (and high and low)
computations simultaneously, I don't see how we can express that at
the tree level.
* looking into ffmpeg
* non-Linaro issues
Ira
Hi ,
So rather than manually backporting a patch from upstream "trunk" I
decided to try using bzr for this yesterday. Even though this was a
slow process and thanks to some help from Michael on IRC last night I
think I got it right finally. However I can't get bzr visualize to
show me the revision history correctly with the merge from the
upstream repository.
So what I did was to do the following
bzr branch 4.5 new-branch
cd new-branch
bzr merge -c XXX lp:gcc where XXX is the bzr revision number of the
svn commit that I'm tracking upstream.
gcc/Changelog merge fails - there is a conflict. Expected because you
are merging trunk's gcc/Changelog into a version of the Changelog file
in our tree
Thus I reverted changes that were brought in by copying in a backup of
gcc/Changelog that I had made before the merge
bzr resolved gcc/Changelog
edit Changelog.linaro
Add changelog entry
bzr commit
bzr push lp:gcc private repository
Create merge request upstream.
Does this look sensible to people or do folks follow other recipes?
cheers
Ramana
Hi there. I've cancelled Monday's meeting as the CodeSourcery people
are at their annual meeting and others are travelling back from the
sprint. Please come to the Wednesday stand up call if you are
available.
-- Michael
Got a complete run of SPEC Train on Orion board - all working.
Kicked off a SPEC ref run - canis1 died.
Gathered a full set of 'perf record's for all of SPEC on silverbell;
and had a quick look through them;
there aren't too many surprises; a few things that might be worth a
look at though.
(Not as much using libc functions as I hoped).
There are some odd bits - chunks of samples landing apparently outside libraries
that aren't obvious what's going on.
Sent tentative patch for Thumb perf annotate issue (bug 677547) to
lkml for comments.
Started on libffi variadic fixing.
Caught the qemu pbx-a9 testing from PM; got qemu built and getting a
handful of lines of output
both from a kernel from arm.com's site and a linaro-2.6.37 that I built for it.
Dave
RAG:
Red:
Amber:
Green: issues I wanted to nail at this sprint all handled;
a couple of blueprints handed off to other people
Milestones:
| Planned | Estimate | Actual |
finish virtio-system | 2010-08-27 | postponed | |
finish testing PCI patches | 2010-10-01 | 2010-10-22 | 2010-10-18 |
successful ARM qemu pull req | 2010-12-16 | 2010-12-16 | 2010-12-16 |
finish qemu-cont-integration | 2010-01-25 | 2010-01-25 | |
At the Linaro/Ubuntu sprint/rally in Dallas this week
* qemu-continuous-integration
** spoke to Paul Larson (in Linaro Validation team) and
handed this blueprint off to him (with a clarification
of the requirements from qemu's point of view)
* maintain-beagle-models
** we've agreed to start doing official "Linaro Qemu"
releases which are essentially going to be the meego
tree plus point fixes for things. These will be every
month with the rest of the toolchain group releases.
Ubuntu will also take these releases to replace the
current use of the qemu-kvm tree to provide ARM models
* verify-a9-pbx-support
** this blueprint has been handed off to David Gilbert
* merge-correctness-fixes
** wrote and posted patches:
*** to restore IT bits after unexpected exceptions
(including fix of Linux usermode bug where it wasn't
clearing the IT bits when entering a signal handler)
*** to include opcode hex in disassembly of ARM insns
*** fixing a compile failure for a previous change when
the host linux system didn't have linux/fiemap.h
** I have managed to split the huge "lots of ARM TCG fixes"
commit in the meego tree up into 65 more self-contained
commits. These now need reordering and possibly some
may be recombined.
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
2011: Holiday 21 Jan, 22 Apr - 2 May.
Hi there. Unfortunately the Linaro GCC 4.5 2011.01-0 release has a
failure in the x86_64 compiler causing it to fail during the initial
build. We're working on triaging the problem at the moment.
ARM and i386 targets are not affected. I'll send a new announcement
when the problem has been fixed and the replacement 2011.01-1 release
is available.
-- Michael
The Linaro Toolchain Working Group is pleased to announce the release
of both Linaro GCC 4.4 and Linaro GCC 4.5.
The Linaro Toolchain Working Group is pleased to announce the release of
both Linaro GCC 4.4 and Linaro GCC 4.5.
Linaro GCC 4.4 is the sixth release in the 4.4 series. Based off the
latest GCC 4.4.5, it is a maintenance release that fixes one problem
found through use.
Linaro GCC 4.5 is the sixth release in the 4.5 series. Based off the
official GCC 4.5.2 release, it includes many ARM-focused performance
improvements and bug fixes.
Interesting changes include:
* Improved optimization of multiple load instructions, and
multiply-and-accumulate.
* -fshrink-wrap optimization for better use of function prologues and
epilogues.
* plus, various other bug fixes, and minor improvements.
The source tarballs are available from:
https://launchpad.net/gcc-linaro/+milestone/4.5-2011.01-0
and
https://launchpad.net/gcc-linaro/+milestone/4.4-2011.01-0
Downloads are available from the Linaro GCC page on Launchpad:
https://launchpad.net/gcc-linaro
Hi,
* Continued with testing and implementation of reduction support in SLP
* Found a major problem in vectorization of if-converted data
accesses. Looked into other ways to solve the problem.
* Spent some time on non-Linaro vectorization plans
* Unsuccessfully tried to make the board work
Ira
== Last Week ==
* Displaced stepping support for 32-bit Thumb insns are done, unless
new bugs are found.
Decode writeback for str/ldr in Thumb 32-bit. Fix bugs in supporting
LDR/STR/LDRH/STRH in both ARM and Thumb.
Test GDB to handle various 32-bit Thumb insns in displaced stepping.
Code cleanup and write some new test cases.
* One day New Year holiday on Jan. 3rd.
== This Week ==
* Linaro Sprint.
* Get patches for Linaro GDB bugs approved on upstreams as many as
possible.
--
Yao Qi
Hello,
While testing SMS on Crotex-A9 I see that the latency of load instruction
is 1
cycle when compiling with -mcpu=cortex-a9 -mthumb -mtune=cortex-a9 -O3.
Below is a snippet from the SMS dump file showing the DDG, created for the
loop in foo function, which depicts the edge between the load of input[i]
(insn 181) and the mult instruction (insn 184).
[181 -(T,1,0)-> 184] is the true dependence edge created between the
two insns; with latency of 1.
On Crotex-A8 the latency of the load is 3 as expected.
I've read in crotex-a9.md file that loads should have a latency of 4 cycles
so I just wanted to check if I should have used other combination of flags
for Crotex-A9 or the load latency should indeed be of 1 cycle here.
Thanks,
Revital
int foo (int max, signed short *input, int y)
{
int i, accum;
for (i = 0; i < max; i++) {
accum += (signed int) input[i] * (signed int) input[i+y];
}
return accum;
}
The snippet from the DDG:
Node num: 2
(insn 181 178 184 13 (set (reg:SI 216 [ D.2019 ])
(zero_extend:SI (mem:HI (plus:SI (reg:SI 319 [ ivtmp.34 ])
(reg:SI 345)) [2 MEM[base: D.2076_257, index:
D.2079_226, offset: 0B]+0 S2 A16]))) tmp.c:7 714
{*thumb2_zero_extendhisi2_v6}
(nil))
OUT ARCS: [181 -(A,0,1)-> 176] [181 -(T,1,0)-> 184]
IN ARCS: [184 -(A,0,1)-> 181] [176 -(T,1,0)-> 181]
Node num: 3
(insn 184 181 234 13 (set (reg/v:SI 209 [ accum ])
(plus:SI (mult:SI (sign_extend:SI (subreg/s/u:HI (reg:SI 212
[ D.2013 ]) 0))
(sign_extend:SI (subreg/s/u:HI (reg:SI 216 [ D.2019 ]) 0)))
(reg/v:SI 209 [ accum ]))) tmp.c:7 64 {maddhisi4}
(expr_list:REG_DEAD (reg:SI 216 [ D.2019 ])
(expr_list:REG_DEAD (reg:SI 212 [ D.2013 ])
(nil))))
== Last Week ==
* Worked to improve libunwind test suite results using new ARM-specific
unwinding. Managed to reduce the number of failures, but some tests
still fail.
* Tested latest ltrace upstream tree and determined that It Is Good.
Hopefully, this marks the cusp of a new release that includes my work.
== This Week ==
* Finish up work with libunwind. Update wiki brain dump.
--
Zach Welch
CodeSourcery
zwelch(a)codesourcery.com
(650) 331-3385 x743
== Last week ==
* Some discussion about libffi variadic function support. The current
proposed design is an additional API function to prepare a new call
interface (CIF) structure with the argument number settings for each
variadic call site. This IMHO, is slightly less flexible that doing a
call-time placement style interface, but should be faster and easier to
implement.
* Collecting a few cases of ARM/Thumb-2 integration in the GCC ARM backend.
* Prepared and traveled to Dallas for the Linaro sprint.
== This week ==
* The Linaro sprint.
== Linaro GCC ==
* Continued hacking on (cleaning up, testing) NEON
element/structure load/store (etc.) intrinsics-improvement patch. This
now passes all the auto-generated neon.exp tests (admittedly only a
very weak test for correctness), and the small number of hand-written
tests I threw at it, so hopefully we can declare victory soon. (This
patch provides huge improvements to generated code for the "fancy" NEON
load/store intrinsics and vtbl/vtbx lookup instructions, and hopefully
also provides part of the solution for allowing the vectorizer to use
the fancy loads/stores also.)
== Misc ==
* Public holiday Monday.
== GCC ==
Pulled down new commits from upstream GCC. My test build failed due to a
new cross-build configure problem. Found the problem in GCC Bugzilla,
and rolled back to the revision before the problem one. Those sources
built ok, so I've pushed the changes to Linaro GCC 4.6 branch. It it now
updated to 31st December.
Discussed cross-build problems with Alexander Sack on IRC.
Merged, tested and pushed all the outstanding Launchpad merge requests
into GCC 4.4 and 4.5.
Merged, tested and pushed all new CS SG++ patches into Linaro GCC 4.5.
Merged, tested and pushed GCC 4.5.2 from upstream into Linaro 4.5.
Spun releases for both Linaro GCC 4.4 and 4.5.
Brought the Linaro patch tracker up to date.
== Other ==
Catch up with lots of email following my two week holiday.
Updated some of the CS patch tracker.
Travel to Dallas for the Linaro sprint.
-------
Next week (10th-14th Jan):
Linaro sprint in Dallas
Following week (17th-21st Jan):
CS annual meeting in Phoenix
Got h264ref working in SPEC; this was another signed-char issue (very
difficult one to find - it didn't crash, it just came out with subtly
the wrong result);
compiling that and sphinx3 with -fsigned-char and they seem happy.
With Richard's fix for gromacs that leaves just the zeusmp binary
that's too large to run on silverberry; it seems to startup on canis1
(that has more RAM).
So that should be a full set; I just need to get the fortran stuff
going on canis1.
Kicked off discussion on libffi-discuss about variadic calls; people
seem OK with the idea of adding it (although unsure exactly how many
things really use it
- even though there are examples in Python documentation). Also
kicked off some of the internal paperwork to contribute code for it.
Dave
== This week ==
* Away Monday, and a fair bit of time on non-Linaro duties.
* Looked at Dave's gromacs bug (693502). Turned out to be a reload
inheritance problem. Tested a patch. Spent some time coming up with
a brute-force testcase that I can submit with the patch.
* Found a bug in the x86 and x86_64 ifunc support that would affect
ARM too if we weren't careful. Came up with an example testcase
and filed the bug upstream:
http://sourceware.org/bugzilla/show_bug.cgi?id=12366
It has been fixed by H.J. Lu. I'm wondering about taking a
slightly different approach for ARM.
* More ifunc work.
* Tried again to reproduce the chrome failure, making sure to use
DEB_BUILD_HARDENING=1. (I hadn't realised first time round that,
in reaction to this bug, debian/rules specifically excluded armel
from the automatic DEB_BUILD_HARDENING=1 setting.) The build
takes a couple of days on my BeagleBoard.
Heh, and just as I wrote that, the build failed with the reported
link error. Neat.
== Next week ==
* Finish testing the patch for 693502 and submit it upstream.
Backport the patch to our tree once accepted.
* Look at the chromium problem.
* More ifunc.
Richard
RAG:
Red:
Amber:
Green: qemu git pull request accepted, patches seem to be flowing
into qemu upstream more freely now
Milestones:
| Planned | Estimate | Actual |
finish virtio-system | 2010-08-27 | postponed | |
finish testing PCI patches | 2010-10-01 | 2010-10-22 | 2010-10-18 |
successful ARM qemu pull req | 2010-12-16 | 2010-12-16 | 2010-12-16 |
finish qemu-cont-integration | 2010-01-25 | 2010-01-25 | |
Bonus extended holiday edition:
This report includes a number of things that happened over
the Christmas holidays as well as this week (which is
a short one, only 3 days).
Progress:
* merge-correctness-fixes:
** my git pull request for various ARM qemu patches was merged!
** a number of other patches were merged to qemu master:
+ implement correct NaN propagation rules
+ rename softfloat float*_is_nan() functions
+ fix UMAAL (Aurelien's patch, reviewed by me)
+ VQSHL (reg) patchset
** diagnosed the segfault in
https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/604872
and wrote a patchset which fixes it (and related problems):
http://patchwork.ozlabs.org/patch/77887/ (n/7)
** wrote and posted a patchset which implements save/restore
for the versatile platform (so I didn't have to wait 10
minutes for the test case to reach the segfault)
http://patchwork.ozlabs.org/patch/76529/
** finished and posted a patchset implementing flushing of
denormals to zero on input:
http://patchwork.ozlabs.org/patch/77798/ (n/3), now committed
** reviewed/tested Aurelien's SMMLA/SMMLS patch (now committed)
** wrote and posted patches which implement the FS_IOC_FIEMAP
ioctl (http://patchwork.ozlabs.org/patch/77725/) and
the file_sync_range{,2} syscalls
(http://patchwork.ozlabs.org/patch/77723/) -- these are used
by apt, and so linaro-media-create was generating a lot of
warnings from qemu about their lack of implementation
** posted patch to clean up NaN handling in linux-user NWFPE
emulation (follow-on from earlier NaN cleanups):
http://patchwork.ozlabs.org/patch/77795/
* maintain-beagle-models:
** implemented minimal ARM cp14 debug registers
and a random register in the TWL4030 emulation (both needed
to get recent Linaro kernels to boot on the beagle model)
Submitted merge request upstream:
http://meego.gitorious.org/qemu-maemo/qemu/merge_requests/3
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Meetings: toolchain standup, pdsw doughnuts
Absences:
2011: Dallas Linaro sprint 9-15 Jan. Holiday 21 Jan, 22 Apr - 2 May.
Hi,
* implemented reduction support in SLP, I'll check if it helps
DenBench next week
* helping Sebastian Pop with if-conversion for vectorization
improvements (BTW, Sebastian's goal is to vectorize kernels from
ffmpeg)
* fixed GCC PR47139
Ira
Hi All,
Thanks for attending the call. I think we had some interesting discussions.
I've posted the minutes from the call on the same page as before:
https://wiki.linaro.org/AndrewStubbs/Sandbox/GCCoptimizations
I'll try to get the audio posted somewhere for anybody that's interested.
Andrew
You may have noticed that I have created a new BZR/Launchpad branch for
Linaro GCC 4.6:
lp:gcc-linaro/4.6
https://code.launchpad.net/~linaro-toolchain-dev/gcc-linaro/4.6
Up until now, this has not been buildable due to unfixed bugs. However,
upstream GCC have now straightened out the problems, so I have pushed a
buildable version into the branch.
I shall attempt to keep this branch as up-to-date as I can (at least, I
will once the holiday season and January travel are over), but I'll only
push updates if they build for me, so hopefully the branch should remain
fairly stable, at least for our purposes.
Note that so far I've only tested build-ability. Right now I'm not
making any promises about the quality of the compiler.
At some point, we'll want to use this branch to hold our own patches
(both those that will never go upstream, and those that are queued for
GCC 4.7), so it will diverge from upstream 4.6 a bit. For the moment,
it's merely a mirror.
Andrew
== Last Week ==
* Got a new ARM-specific unwind test case working, so its integration
with libunwind stands a marginally better chance of working (once it's
finally finished).
* Sent a ping to binutils mailing list about a patch to improve readelf
that I had sent in at the beginning of December. Still no response.
* Holidays and Vacation
== This Week ==
* Try to finish libunwind integration, as my time with Linaro is nearly
over. Update documentation on wiki to reflect current status.
* Ping ltrace list about a new release. It seemed so close, then the
list went abruptly silent.
--
Zach Welch
CodeSourcery
zwelch(a)codesourcery.com
(650) 331-3385 x743
== Linaro GCC ==
* Continued looking at element/structure load/store intrinsics
improvements. Some good initial results: it looks like the plan of
using "extra-wide" vectors for returning struct results works fine (at
least to a first approximation). Sent off WIP patch (internally to CS
only, so far).
Incidentally this looks like it'll be a good stepping-stone for the
"RTL half" (vs. the "tree half") of the representation of
element/structure loads/stores (e.g. vld2/vst2) also. The return type
(for loads) and argument type (for stores) of the RTL patterns for such
instructions is changed from the current wide-integer representation
(OImode, etc.) to a suitable wide vector instead (e.g. V16QImode). This
change might help lead to a more meaningful mapping from an equivalent
tree form -- though we haven't quite got the whole picture yet, as the
middle-end won't want to know about the ARM-specific and
non-standard-named patterns for the element/structure loads/stores.
(One might imagine a new standard-named RTL expander taking care of that
though.)
== Vacation ==
* Vacation Dec 20th-Jan 4th.
== GCC issues ==
* PR44557, Thumb-1 ICE, looked at the ARM specific secondary reload
parts, as well as some general reload internals context. Concluded that
concerns on Thumb-2 about my submitted patch should be unneeded, as the
reload_in/out patterns should never be used for Thumb-2. Also looked a
bit on how we should upgrade ARM to use TARGET_SECONDARY_RELOAD.
* PR45416, ARM code regression. Started working on this again, cleaning
up patch to submit.
* Had some email discussion with Revital Eres on Swing Modulo Scheduling
(SMS) for ARM issues, mainly on how the doloop_end pattern should be
done on ARM.
* Submitted and committed an obvious small patch for a VFP testsuite case.
== This week ==
* Continue on GCC issues.
* Flying to Dallas on Sunday, prepare for trip.
Hi,
* continued with my attempts to vectorize Viterbi:
- finished implementation of conditional store sinking in cselim
pass (I did only limited testing).
- reconsidered the idea of safe load if-conversion if an adjacent
field of the same structure is accessed unconditionally - this may be
incorrect. Instead I tried the last, not yet committed, patch by
Sebastian Pop that implements if-conversion for such cases of not-safe
data accesses. His patch if-converts the loop in Viterbi, however, it
also makes the loop not vectorizable - additional work should be done
in the data-refs analysis and the vectorizer to make it work.
Sebastian is working on the first part, and I'll help him with the
vectorizer part if necessary.
* analyzed EEMBC DenBench, couldn't find any action items for now. But
vld/vst support of strided data accesses should be very useful for
these benchmarks.
* fixed GCC PR testsuite/47057
* looking into SLP of reduction as in PR 41881. I saw similar patterns
several times in DenBench, but I'm not sure that SLP of reduction is
enough to vectorize all of these cases.
Happy New Year,
Ira
== Last Week ==
* Continue with libunwind. Wrote a new unit test for ARM-specific
unwinding code to help debug that new code's problems. Almost got it
working, which I hope means its integration with libunwind may be
nearing completion.
== This Week ==
* Try to finish ARM-specific improvements to libunwind. Famous Last Words.
--
Zach Welch
CodeSourcery
zwelch(a)codesourcery.com
(650) 331-3385 x743
== GCC related ==
* Launchpad #693686, GCC ARM segfault ICE when building Chromium in V8.
Spent some time reproducing; this ICE seems to be in the maverick
gcc-4.5, at the vectorizer phase. As the ICE happens in
tree-vect-stmts.c:supportable_widening_operation(), I'm suspecting
(without further verification yet) this might be due to vmovn not
backported? (Linaro 4.5 does has this ported I think)
* PR44557, Thumb-1 ICE. Looking further after seeing Richard Earnshaw's
comment on my patch. It would be nice if we could upgrade the entire
secondary reload bits, looking into this.
== This week ==
* Look into more GCC issues.
* Get some backports done.
== Linaro GDB ==
* LP:615972
Get patch approved upstreams. Committed to FSF tree. Propose merge
request to Linaro GDB tree.
* LP:616003 gdb.mi/mi-var-display.exp failure
Discussed in upstreams on how to handle fp in ARM/Thumb mode. Finally
work out a one-line patch. Approved and committed to FSF tree. Propose
merge request to Linaro GDB tree.
Draft another patch to clean up ARM register alias. Pending on upstreams.
* LP:616000 Handle -fstack-protector prologue code
Revise patch per Joel's comments. Approved, and committed to FSF tree.
Draft two patches to handle -fstack-protector prologue code on i386.
Sent them out for review. Due to lack of knowledge on i386 prologue
generate, not very confident on one of these patches.
* LP:615980 Support displaced stepping on Thumb
Get my test case to arm displaced stepping approved, and committed to
FSF tree.
A patch about supporting displace ARM insn in Thumb area is pending
upstreams. Tried the 2nd approach since the 1st approach is not
acceptable to upstreams reviewers. Without this patch, ARM displaced
stepping doesn't work on Linaro.
Support another three PC-related 16-bit Thumb insns (adr, ldr, and
cbz), and add test cases for them accordingly.
Spend some time splitting my big patch into three relatively small
patches in order to make them easier to be reviewed. Patches on
supporting Thumb 16-bit displaced stepping are sent out upstreams for
review.
== This Week ==
* Work from Mon. to Wed.
** Backport some approved upstreams patches to Linaro GDB
** Anything I should do for my pending patches.
* Vacation on Thu. and Fri. 3rd Jan. is China public holiday. Back to
work on 4th. Jan.
--
Yao (齐尧)
Khem Raj <raj.khem(a)gmail.com> wrote:
> The bug http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46883 files
> against GCC trunk also happens with linaro gcc 4.5
> My guess is that there is a backported patch from trunk into linaro
> 4.5 tree thats causing this ICE
>
> This ICE does not happen on upstream gcc-4.5 branch
Thanks for the bug report!
> I havent figured out the commit yet.
It looks like the regression was introduced by Bernd Schmidt's
patch to improve zero-/sign-extensions (PR 42172), which we
did indeed backport to Linaro GCC 4.5. (I've updated the
PR 46883 bugzilla with more details.)
> Should you need a bug in linaro
> bug tracker I will be happy to file one
Yes, please do so; this makes it easier to track the problem
on the Linaro side. Thanks!
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
== GCC ==
* Checked in mainline fix for #617384 and submitted backport merge
requests (.debug_line is wrong with -fpic)
* Submitted backport merge requests for the fix for #662324
(Pointer type information lost in 4.5 debuginfo)
* Checked in mainline fix for #693425 and submitted backport merge
request (SPU back-end incompatible with extension elimination pass)
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
Hi,
I was on vacation on Sunday and starting from Tuesday stayed home with
a sick child, so I only had a couple of days to work.
* vectorization of Viterbi:
- continued implementing conditional store sinking in cselim pass
- made if-conversion to work on loads of structure fields if other
field from the same structure is accessed unconditionally
* fixed GCC PR 47001.
Ira