Hi there. The 2011.08 release has been spun and is testing up well.
The 4.5 and 4.6 branches are now open so feel free to commit any
approved patches.
-- Michael
> Would you be interested in adding a Firefox-based benchmark? As a large
> application it is a good testbed for LTO, FDO and other aggressive
> optimizations.
Sorry about the delayed response. I did notice your mail last week but
I was busy with our conference and then the first couple of days this
week have just disappeared with some internal training.
I would be interested in hearing how you get on with LTO and FDO on
ARM. Listening to Honza talking at the GCC unconference in London
about the memory usage for full LTO with trunk, I did wonder what we
would get if we tried it on the ARM target, but I never managed to
get around to trying anything there :). We did look at getting FDO
working with Linaro GCC last cycle, but there are still a couple of
issues with PGO in Linaro GCC 4.5.
With respect to LTO, the one problem we have currently is that the
Neon intrinsics aren't streamed out and streamed back in. So you might
have a few issues if your code uses arm_neon.h.
https://bugs.launchpad.net/gcc-linaro/+bug/823548 is an example of
this problem. This was fixed upstream and we probably just need to
backport that into our 4.6 tree. I've tried a backport this morning
and I think I have this right finally.
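For anyone wanting to check whether their own code is likely to be
affected, here is a minimal sketch of the kind of source file involved.
The function name add_f32 is made up for illustration. The intrinsics
are the standard ones from arm_neon.h, and the scalar path is only
there so that the file also builds on targets without Neon. It is
building code like this with -flto on an ARM target that exercises the
streaming of the intrinsics.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for n elements. */
void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Four lanes at a time using intrinsics from arm_neon.h. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(dst + i, vaddq_f32(va, vb));
    }
#endif
    /* Scalar tail, and the whole loop on targets without Neon. */
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
```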
If you could do a build and a Firefox benchmark run in about 30-60
minutes, by all means please do let us know how you get on and what
you find. We've been steadily trying to improve the performance of the
ARM toolchain. The biggest improvements you'll notice will be with the
vectorizer, but there will be other small improvements in other
general areas of code generation. We would be interested in feedback
about what can be done, to add to our queue of things to look at and
improve for the ARM port of GCC.
With respect to the images, Kiko's probably answered that bit.
cheers
Ramana
* GCC
Continued tracking down problems in my various broken patches. Fixed one
bug, investigated two more. Re-submitted the widening multiplies for
testing, and this time it returned with no problems. Yay, I can now
check it in next week.
Merged from upstream GCC 4.5. The launchpad import bug still exists
(although should not for much longer) so I had to ask on #launchpad to
get the imports done. Submitted the merged branch for testing.
Tried to merge GCC 4.6 similarly, but failed. Bzr just refused to play
ball, which was very frustrating. Michael Hope has now done the merge
instead.
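As a rough source-level illustration of what "widening multiplies"
refers to in the GCC notes above, here is a generic sketch. It is not
taken from the patch or its testcases, and the function names are
hypothetical. The operands are narrower than the result, so on ARM a
compiler can use a single SMULL or SMLAL instruction instead of a full
64x64-bit multiply.

```c
#include <stdint.h>

/* 32x32 -> 64 bit signed multiply. Both operands are sign-extended
   before the multiplication, so no bits of the product are lost. */
int64_t widening_mul(int32_t a, int32_t b)
{
    return (int64_t)a * (int64_t)b;
}

/* Multiply-accumulate form of the same pattern. */
int64_t widening_mla(int64_t acc, int32_t a, int32_t b)
{
    return acc + (int64_t)a * (int64_t)b;
}
```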
* Other
On leave Wednesday and Friday.
* libquantum - running the SMSed version on an ARM machine did not show
significant improvement. Discussed it with Richard Sandiford.
Apparently in the SMS phase the instructions are in DI mode because
the loop contains 64-bit operations, while they are later generated
as 32-bit operations. This makes SMS less accurate, and I'm now
looking into a version which disables DI mode operations.
* Started to look at the potential of SMS on libav. Initial runs of
Richard's microbenchmarks with SMS show some regressions as well as
improvements that I'm looking at.
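To make the DI mode point concrete, here is a sketch of the sort of
loop involved. It is written for illustration only and is not the
libquantum source; the function and parameter names are made up. On
32-bit ARM every uint64_t operation below is eventually carried out on
two 32-bit halves, so a scheduler that sees one 64-bit (DI mode)
instruction is modelling something different from the code that is
finally generated. SMS itself is enabled in GCC with -fmodulo-sched.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel: flip the target bit of every 64-bit state
   word that has the control bit set. */
void controlled_flip(uint64_t *state, size_t n, int control, int target)
{
    const uint64_t cmask = (uint64_t)1 << control;
    const uint64_t tmask = (uint64_t)1 << target;
    size_t i;

    for (i = 0; i < n; i++) {
        if (state[i] & cmask)      /* 64-bit AND and test */
            state[i] ^= tmask;     /* 64-bit XOR */
    }
}
```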
Hi there. I've written up the standard configurations that we use to
build and test Linaro GCC:
https://wiki.linaro.org/WorkingGroups/ToolChain/Configurations/GCC
It includes such things as flags, libraries, and sysroots. You might
find it useful to see what we're testing or, if new to compilers, what
a good starting point is.
-- Michael
== QEMU ==
* Finished off a first cut of the 64bit helper patch to QEMU
- Gave it to Peter and have reworked most of the things he commented on
* This also led into a bit of a rabbit hole of finding various
generic QEMU threading issues
* Tested Peter's 11.08 QEMU release
(I used linaro-fetch-image-ui for the first time to grab the
release images; quite nice, hit a couple of issues but much nicer
than crawling around the site to find where the hwpacks are).
== Other ==
* Pinged gcc patches list for more comments on 64bit atomic patch
I'm on holiday the week of 22nd (i.e. the week after next).
Dave
== GDB ==
* Re-tested Linaro GDB 7.3 on Versatile Express (native
& remote testing).
* Committed patch to re-enable remote thread test cases
(#804401) to mainline and Linaro GDB 7.3.
* Reviewed Yao's latest Thumb-2 displaced stepping patch.
== GCC ==
* Patch review week.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
RAG:
Red:
Amber: OMAP3 patch upstreaming is progressing more slowly than hoped
Green:
Current Milestones:
|| || Planned || Estimate || Actual ||
||qemu-linaro 2011-08 || 2011-08-18 || 2011-08-18 || ||
Historical Milestones:
||qemu-linaro 2011-04 || 2011-04-21 || 2011-04-21 || 2011-04-21 ||
||qemu-linaro 2011-05 || 2011-05-19 || 2011-05-19 || n/a ||
||close out 1105 blueprints || 2011-05-28 || 2011-05-28 || 2011-05-19 ||
||complete 1111 planning || 2011-05-28 || 2011-05-28 || 2011-05-27 ||
||qemu-linaro-2011-06 || 2011-06-16 || 2011-06-16 || 2011-06-16 ||
||qemu-linaro-2011-07 || 2011-07-21 || 2011-07-21 || 2011-07-21 ||
== linaro-qemu-11.11 ==
* put together release candidate tarball for 2011.08 release, tested
* added a workaround for omap kernel bug LP:727781 which had been fixed
in 2.6.x but has resurfaced in 3.0
* tarball now ready and only needs releasing next week
== 64-bit-sync-primitives ==
* reviewed David Gilbert's qemu patches to support 64 bit sync primitives
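For context, this is a minimal sketch of the kind of guest code that
depends on 64 bit sync primitives. It uses GCC's __sync builtins on a
64-bit value; the wrapper names are hypothetical, and how the builtins
are implemented (inline instructions or a helper call) depends on the
target and compiler version.

```c
#include <stdint.h>

/* Atomically add delta to *counter and return the previous value. */
uint64_t counter_add(uint64_t *counter, uint64_t delta)
{
    return __sync_fetch_and_add(counter, delta);
}

/* Atomically replace *counter with desired if it still equals
   expected. Returns nonzero on success. */
int counter_cas(uint64_t *counter, uint64_t expected, uint64_t desired)
{
    return __sync_bool_compare_and_swap(counter, expected, desired);
}
```

The interesting case is an update that carries across the 32-bit
boundary, since both halves have to change as one atomic operation.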
== upstream-omap3-patches ==
* testing/reading Avi's memory API patches to see how they fit in or
clash with the qdevification and other omap3 patches
== other ==
* more investigation/thought about LP:823902 -- qemu bug running
multithreaded programs in linux-user mode
* Manned Linaro demo stand at ARM Partner Meeting (Tue, Wed)
* Meetings: GSoC student x2, toolchain, toolchain standup, 1-2-1
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
15-19 August: KVM Forum and LinuxCon NA, Vancouver
== GCC ==
=== Progress ===
* Linaro sprint last week - one day of fun with broken laptop.
* Looked at how we could get BUILTIN_VECTORIZE_CONVERT to work to allow
vectorizing some of the floating point conversions.
* Fixed PR50022. Took a couple of iterations.
* Internal training for 2 days.
* Dusted off a couple of my old patches and sent them out after testing.
* Next to get back to old VFP and ivopts patch.
* Looked at a test failure with -mvectorize-with-neon-quads with Ira.
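As an illustration of the floating point conversion item in the list
above, this is a sketch of a loop that a vectorizer would need to
handle. The function name is made up and the example is not taken from
any testcase. Neon has a vector form of the int-to-float conversion
(VCVT.F32.S32 on quad registers converts four lanes at once), so a
loop like this is a natural candidate when building with
-ftree-vectorize.

```c
#include <stddef.h>

/* Hypothetical example: convert n signed integers to single
   precision floats, one element per iteration. */
void int_to_float(float *dst, const int *src, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = (float)src[i];
}
```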
=== Plans ===
* Continue to look at the test failure with -mvectorize-with-neon-quads
* Finish off optimize_size patch based on comments.
* Finish off case for handling tbh instructions.
* Commit fix for PR50022
* Look at some of the issues with VFP moves and try and get forward with it.
* Look at BRANCH_COST results.
Meetings:
* 1-1s
* TCWG calls
* GNU Toolchain planning meeting.
* Some patch review and bugzilla triaging.
Absences:
* 1st Aug - 5th August - Linaro sprint.
* 8th - 9th August - Internal training.
* 29th Aug - Sept. 2 - Holiday booked and approved.
* 31st Oct - 4th Nov - Linaro Summit Orlando - Travel to be booked.
== This week ==
* Looked at a bug report that the fix for LP #736007 had caused regressions
on powerpc-darwin. It turned out to be a target-specific bug; the
backend has the same const_vector code as i386 and spu, but the fix for
PR34856 was never applied there. I'll submit the patch (and backport to
Linaro 4.6) once the bug submitter has had a chance to test it.
* Experimented with -falign-loops. Found that it triggered a bug in the
ARM minipool layout code. Posted patch upstream and committed.
Backported to 4.6.
* Committed patch to allow globs in define_bypass.
* Updated auto inc/dec patch after comments from Bernd and Stephen.
I'm pretty happy with it now, but there are a couple of prerequisite
patches I need to sort out first.
* Started getting those prerequisites ready.
* Decided that we needed something a bit more subtle than my original
insn_rtx_cost patch: at the moment, we simply don't use rtx costs
for lvalues. Wrote a series of patches to "improve" the rtx_cost
interface, including providing the outer operand number and an
indication of whether the rtx is an lvalue or an rvalue.
* Upgraded my laptop. This turned out to be more eventful than
anticipated, and ended up taking a whole day.
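For readers not familiar with the auto inc/dec work mentioned in the
list above, here is a generic sketch of the kind of source loop where
it matters. It is not taken from the patch, and the function name is
hypothetical. Each pointer is used and then advanced by one element,
which ARM can express as a single post-indexed load or store (for
example ldr r3, [r1], #4) rather than a separate access and add.

```c
#include <stddef.h>

/* Hypothetical copy loop: each access is followed by a pointer
   increment of one element. */
void copy_words(int *dst, const int *src, size_t n)
{
    while (n--)
        *dst++ = *src++;
}
```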
== Next week ==
* Post auto inc/dec preparatory patches for review. Hopefully post
an RFA for the pass itself.
Richard