== This week ==
* Applied patch for doing NEON high/low extraction using subregs.
Ramana pointed out that we do the same thing for insertion,
so I wrote a patch to handle that too. Both now merged into
Linaro sources.
* Looked at ARM bootstrap problem on trunk. Turned out to be
an aliasing problem. Submitted and applied patch.
* Reworked part of my SMS register-scheduling patch after feedback
from Ayal. Submitted new version upstream.
* Got SPEC2006 running on the powerpc boxes and tested one part
of my -fsched-pressure patch. Bit of a mixed bag. h264ref was
one of the worst sufferers, which was a bit worrying. I think
I'll need to make a third change too.
To recap, there are two pieces now:
1) Make -fsched-pressure honour the DFA
2) Make -fsched-pressure allow values that are live across a
loop to be spilled.
I naively hoped that (1) would be OK on its own, but h264 shows
that the current -fsched-pressure code is very conservative
when it comes to large blocks. It only considers register
deaths once there is a single remaining use; if there are two
unscheduled uses, it assumes that the register remains live
for the rest of the block.
So the problem that (1) was fixing was that -fsched-pressure was too
optimistic in terms of what it could schedule in a cycle. But with
that fixed, we seem to have too many sources of pessimism...
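
The conservative behaviour described above can be pictured with a toy model. This is only a sketch of the rule as stated in this report, not GCC's actual -fsched-pressure code; the instruction representation and the names remaining_uses and pressure_delta are made up for illustration.

```python
from collections import Counter


def remaining_uses(unscheduled):
    """Count how many unscheduled instructions still read each register."""
    counts = Counter()
    for insn in unscheduled:
        for reg in set(insn["uses"]):
            counts[reg] += 1
    return counts


def pressure_delta(insn, unscheduled):
    """Estimated pressure change from scheduling `insn` next.

    `insn` must be a member of `unscheduled`. Under the conservative rule,
    a register is only credited as dying when `insn` is its single
    remaining unscheduled use.
    """
    counts = remaining_uses(unscheduled)
    births = len(set(insn["defs"]))
    deaths = sum(1 for reg in set(insn["uses"]) if counts[reg] == 1)
    return births - deaths


# Hypothetical block: two instructions read r1, each defining a new register.
a = {"uses": ["r1"], "defs": ["r2"]}
b = {"uses": ["r1"], "defs": ["r3"]}

first = pressure_delta(a, [a, b])  # r1 still has two uses: no death credited
second = pressure_delta(b, [b])    # r1 is down to its last use: death credited
```

In a large block where many instructions read the same register, every candidate except the last reader looks like a pure pressure increase under this rule, which is one way the estimate can become pessimistic.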
Richard
== GDB ==
* Worked on support for cross-platform core file generation.
* Followed up on patch to support disabling address space
randomization in gdbserver.
== GCC ==
* Followed up on patch for PR 50305.
Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martin Jetter | Management: Dirk
Wittkopp
Registered office: Böblingen | Court of registration: Amtsgericht
Stuttgart, HRB 243294
* Working on cross-compiling Firefox. Getting dependencies in place and
setting up the configuration file (.mozconfig). My strategy has been to
fix one dependency at a time, picking prebuilt packages or building them myself.
Michael told me at yesterday's meeting about multistrap, which could possibly
be used to fix all dependencies at once. I will look into that next
week.
* During this process I have also spent some time reading up on cross
compilation in general and also on autoconf and the GNU build system.
Best Regards
Åsa
== String routines ==
* Got eglibc testing setup happy at last
- Note that -O3 builds generally seem to give a few more errors
that are probably worth looking at
- -march=armv6 -mthumb hit some non-thumb1 instructions (normally
non-lo registers), again worth looking at
- Cross testing to Qemu user mode often stalls, mostly on nptl
tests that abort/fail when run in system/natively
* Sent new version of eglibc/memchr patch upstream
* Now have working newlib test setup and reference set
- next step is to try adding my memchr there
== Other ==
* Testing a QEMU patch with Peter
* Looking at bug 861296 (difference in mmap layouts)
* Adding a few suggestions to the set of cpu hotplug tests.
* Dealing with the Manchester lab cold.
Short week; back on Monday
Dave
== GDB ==
* Committed hardware watchpoint support for gdbserver to mainline,
including two minor changes resulting from review comments;
backported those fixes to Linaro GDB as well.
* Implemented and tested support for disabling address space
randomization in gdbserver; patch posted for review.
* Investigated support for cross-platform core file generation.
== GCC ==
* Patch review week.
* Posted updated patch for PR 50305.
Best Regards
Ulrich Weigand
Arnd Bergmann <arnd(a)arndb.de> wrote on 08/26/2011 04:44:26 PM:
> On Thursday 25 August 2011, Russell King - ARM Linux wrote:
> >
> > Arnd, can you test this to make sure your gdb test case still works, and
> > Mark, can you test this to make sure it fixes your problem please?
>
> Hi Russell,
>
> The patch in question was not actually from me but from Ulrich Weigand,
> so he's probably the right person to test your patch.
> I'm forwarding it in full to Uli for reference.
Hi Arnd, hi Russell,
sorry for the late reply, I've just returned from vacation today ...
I've not yet run the test, but just from reading through the patch
it seems that this will at least partially re-introduce the problem
my original patch was trying to fix.
The situation here is about GDB performing an "inferior function
call", e.g. via the GDB "call" command. To do so, GDB will:
0. [ Have gotten control of the target process via some ptrace
intercept previously, and then ... ]
1. Save the register state
2. Create a dummy frame on the stack and set up registers (PC, SP,
argument registers, ...) as appropriate for a function call
3. Restart via PTRACE_CONTINUE
[ ... at this point, the target process runs the function until
it returns to a breakpoint instruction and GDB gets control
again via another ptrace intercept ... ]
4. Restore the register state saved in [1.]
5. At some later point, continue the target process [at its
original location] with PTRACE_CONTINUE
The problem now occurs if at point [0.] the target process just
happened to be blocked in a restartable system call. For this
sequence to then work as expected, two things have to happen:
- at point [3.], the kernel must *not* attempt to restart a
system call, even though it thinks we're stopped in a
restartable system call
- at point [5.], the kernel now *must* restart the originally
interrupted system call, even though it thinks we're stopped
at some breakpoint, and not within a system call
My patch achieved both these goals, while it would seem your
patch only solves the first issue, not the second one. In
fact, since any interaction with ptrace will always cause the
TIF_SYS_RESTART flag to be *reset*, and there is no way at all
to *set* it, there doesn't appear to be any way for GDB to
achieve that second goal.
[ With my patch, that second goal was implicitly achieved by
the fact that at [1.] GDB would save a register state that
already corresponds to the way things should be for restarting
the system call. When that register set is then restored in [4.],
restart just happens automatically without any further kernel
intervention. ]
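
The sequence [0.] to [5.] and the two requirements can be illustrated with a toy model. This is a sketch under stated assumptions, not kernel or GDB code: the ToyTarget class, the addresses, and the reduction of the register set to a single pc are all hypothetical. The "flag" variant follows the behaviour described in this mail (hidden restart state that any ptrace access resets); the "registers" variant follows the description of the original patch (the stopped register state already encodes the restart).

```python
from dataclasses import dataclass, field

SVC_PC = 0x1000        # hypothetical address of the system call instruction
AFTER_SVC_PC = 0x1004  # hypothetical address just after it
FUNC_PC = 0x2000       # hypothetical address of the function GDB calls


@dataclass
class ToyTarget:
    restart_in_regs: bool            # True: restart state lives in registers
    pc: int = 0
    restart_flag: bool = False       # hidden kernel-side state (flag variant)
    log: list = field(default_factory=list)

    def stop_in_restartable_syscall(self):
        if self.restart_in_regs:
            self.pc = SVC_PC         # registers already set up for restart
        else:
            self.pc = AFTER_SVC_PC
            self.restart_flag = True

    def get_regs(self):
        self.restart_flag = False    # any ptrace access resets the flag
        return {"pc": self.pc}

    def set_regs(self, regs):
        self.restart_flag = False    # any ptrace access resets the flag
        self.pc = regs["pc"]

    def cont(self):
        if self.restart_flag:        # flag variant: restart decided on resume
            self.pc = SVC_PC
            self.restart_flag = False
        if self.pc == SVC_PC:
            self.log.append("syscall restarted")
        elif self.pc == FUNC_PC:
            self.log.append("function called")
        else:
            self.log.append("resumed without restart")


def inferior_call(target):
    target.stop_in_restartable_syscall()  # 0. stopped in a restartable syscall
    saved = target.get_regs()             # 1. save the register state
    target.set_regs({"pc": FUNC_PC})      # 2. set up registers for the call
    target.cont()                         # 3. run the function
    target.set_regs(saved)                # 4. restore the saved state
    target.cont()                         # 5. continue at the original location
    return target.log


flag_based = inferior_call(ToyTarget(restart_in_regs=False))
regs_based = inferior_call(ToyTarget(restart_in_regs=True))
```

In this model both variants avoid a spurious restart at [3.], but only the register-based variant restarts the interrupted system call at [5.], because the flag has been reset and cannot be set again from the debugger side.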
One way to fix this might be to make the TIF_SYS_RESTART flag
itself visible to ptrace, so that GDB could save/restore it
along with the rest of the register set; this would be similar
to how that problem is handled on other platforms. However,
there doesn't appear to be an obvious place for the flag in
the ptrace register set ...
Bye,
Ulrich
== String routines ==
* Having got agreement on ignoring the triplet for picking the
routine, I'm just testing a patch, but fighting a qemu setup.
* Found the binfmt binding for armeb was wrong (runs the le
version); filed bug with fix in
Dave
==GCC==
Combined report for last 2 weeks -
===Progress===
* Committed conditional compares patch to Linaro GCC 4.6
* Looking at modelling auto-inc-decs better.
* Tried patch for PR19599 and that broke bootstrap with a segfault.
Needs some re-engineering.
* Looked at the latest bootstrap failure on trunk. Still narrowing down.
* Some work on some administrative stuff
* Bit of patch review.
* Attended the LLVM dev meeting.
* Release week had a few issues; helped dry-run the cbuild spawning of
jobs and think I now know how to do that.
=== Plans ===
* finish looking at bootstrap failure.
* Finish auto-inc-dec patch.
* some more patch review.
* Send out LLVM dev meeting report.
Absences.
* 5th October - Out of office.
* 13th -14th October - Internal ARM training.
* 31st Oct - 4th Nov - Linaro Summit Orlando
* 08 Nov - 11 Nov - Tentatively booked
* Dec 19 - 31st Dec - Tentatively booked
(short week: 4 days)
RAG:
Red:
Amber:
Green:
Current Milestones:
|| || Planned || Estimate || Actual ||
||add-omap3-networking || 2011-10-13 || 2011-10-13 || ||
||a15-systemmode-planning || 2011-10-13 || 2011-10-13 || 2011-09-22 ||
||a15-usermode-support || 2011-11-10 || 2011-11-10 || ||
||upstream-omap3-cleanup || 2011-11-10 || 2011-11-10 || ||
Historical Milestones:
||qemu-linaro 2011-04 || 2011-04-21 || 2011-04-21 || 2011-04-21 ||
||qemu-linaro 2011-05 || 2011-05-19 || 2011-05-19 || n/a ||
||close out 1105 blueprints || 2011-05-28 || 2011-05-28 || 2011-05-19 ||
||complete 1111 planning || 2011-05-28 || 2011-05-28 || 2011-05-27 ||
||qemu-linaro-2011-06 || 2011-06-16 || 2011-06-16 || 2011-06-16 ||
||qemu-linaro-2011-07 || 2011-07-21 || 2011-07-21 || 2011-07-21 ||
||qemu-linaro 2011-08 || 2011-08-18 || 2011-08-18 || 2011-08-18 ||
||qemu-linaro 2011-09 || 2011-09-15 || 2011-09-15 || 2011-09-15 ||
== a15-system-mode-planning ==
* now complete: we have generated blueprints/roadmap cards for the TSC
for the various options
== a15-usermode-support ==
* tested udiv/sdiv implementation
* fused mac: rough idea of what needs to be done, need to get all
the fiddly details right
== omap3 upstreaming ==
* rebased and sent pullreq for various outstanding patches
== other ==
* code/design walkthrough for upstream's new memoryregion API
* working on lightning talk for pdsw doughnut session next week
* investigated compile failure building QEMU in thumb mode with debug
enabled (we're trying to use the Thumb framepointer register as a
temporary...)
* meetings: toolchain, standup, 1-2-1
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences (to end of year):
Sep 29-Oct 07, Oct 17, Nov 21, Dec 15-Jan 03: leave
Oct 30-Nov 04: Linaro Connect Q4.11
== This week ==
* Submitted a fix for the performance regression caused by my
arm_comparison_operator patch. Applied upstream after approval
from Ramana (thanks). Will backport to Linaro towards the end
of next week if there are no reported problems.
* Went back to looking at -fsched-pressure. To recap, a colleague
ran SPEC for s390 comparing:
(a) normal -O3 based flags
(b) (a) + -fsched-pressure without my patch
(c) (a) + -fsched-pressure with my patch
(c) got the best geomean result, but there were some individual
tests for which (b) was significantly worse than (a), and for
which (c) only partly closed the gap.
Found one problem. It looks like -fsched-pressure only really
operates on the issue rate and instruction latencies; it doesn't
seem to use the DFA. This seems to be unintentional, and fixing
it showed some nice results.
Also, the -fsched-pressure patch that I wrote at Connect set the
starting pressure based on the set of registers that are both live
on entry to the block _and_ used within the enclosing loop.
This still seems to be a bit too conservative, in that it makes
the scheduler go out of its way to preserve loop invariants,
even if there are too many of them. Experimented with changing
"used" to "defined". This too seemed to be a win.
* Got access to some PowerPC GNU/Linux machines that are suitable
for running SPEC. Set up my account there and got SPEC building.
The idea is to use this to get more cross-target evidence for the
-fsched-pressure submission(s).
* Discussion about the SMS register-scheduling patches after great
feedback from Ayal. While drafting a still-unsent reply justifying
the main part of the patch, I found I was also explaining why another
part of the patch (specifically the prologue/epilogue part) was wrong.
Thought about that a bit today.
* Submitted fix for LP 641126.
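
The starting-pressure experiment from the -fsched-pressure item above (changing "used" to "defined") can be pictured with plain sets. This is only a sketch of the idea as described in this report, not the patch itself; the register names and the helper starting_pressure_regs are hypothetical.

```python
def starting_pressure_regs(live_in, loop_regs):
    """Registers counted towards a block's starting pressure: those that are
    live on entry to the block and also in the chosen per-loop set."""
    return live_in & loop_regs


# Hypothetical loop: r2 and r3 are loop invariants (read in the loop but
# never written there); r1 is updated inside the loop; r4 is merely live
# across the loop without being touched by it.
live_in = {"r1", "r2", "r3", "r4"}
used_in_loop = {"r1", "r2", "r3"}
defined_in_loop = {"r1"}

with_used = starting_pressure_regs(live_in, used_in_loop)
with_defined = starting_pressure_regs(live_in, defined_in_loop)
```

With the "used" rule the invariants r2 and r3 count towards the starting pressure, so the scheduler works to keep them in registers; with the "defined" rule they drop out of the starting set.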
== Next week ==
* More SMS register scheduling.
* More -fsched-pressure.
* Hopefully remerge the arm_comparison_operator patch with this week's fix.
Richard
* Working on getting everything in place for cross-compiling Firefox for
ARM. Trying to understand how the configuration script and makefile work.
* Working on a test that will run Sunspider and extract the results. The
challenging part is that results are only presented on the page, not e.g.
written to stdout or to file. My approach is to create an HTML file, embed the
page with the test in an iframe, and read out the results when the test is
done.
* Running SPEC2K on the Snowball board. An updated kernel solved the issue
with great variations in the test results. Some test results look a bit
strange, so I will look at what those tests do to see what part of the
system is stressed.
Best Regards
Åsa
Hi,
* widening shifts patch - submitted upstream
* change default vector size patch - submitted to linaro-gcc
* automatic choice of vector size for basic block vectorization - testing
* vectorizer bug fixes
Next week we have New Year holiday on Wednesday (half day) and Thursday.
Ira
The Linaro Toolchain Working Group is pleased to announce the 2011.09
release of both Linaro GCC 4.6 and Linaro GCC 4.5.
Linaro GCC 4.6 2011.09-1 is the seventh release in the 4.6 series. Based
off the latest GCC 4.6.1+svn178681, it contains a range of vectoriser
and core performance improvements as well as fixing a number of
bugs.
Interesting changes include:
* Updates to 4.6.1+svn178681
* Improves performance by making better use of conditional compares
* Improves performance by properly scheduling widening multiplies
* Improves size and speed by improving constant generation in Thumb-2
* Implements support for widening multiplies in the core
* Improves vectorised code by reducing the over-promotion of intermediates
* Improves performance by reducing redundant moves between VFP and ARM
* Finishes off supporting the Android team in integrating Linaro GCC
Fixes:
* LP: #823548 Can't use -flto with skia
* LP: #823711 libvirt version 0.9.2-4ubuntu8 failed to build on armel
* LP: #827990 internal compiler error: in decode_addr_const, at varasm.c:2632
* LP: #836401 ICE on a | (b << negative-constant)
* LP: #838994 ICE building perl w/ -marm
* LP: #843775 ICE optimizing widening multiply-and-accumulate
Linaro GCC 4.5 2011.09 is the fourteenth release in the 4.5
series. Based off the latest GCC 4.5.3+svn178560, this is a
maintenance focused release.
Interesting changes in 4.5 include:
* Updates to 4.5.3+svn178560
Fixes:
* LP: #823711 libvirt version 0.9.2-4ubuntu8 failed to build on armel
The source tarballs are available from:
https://launchpad.net/gcc-linaro/+milestone/4.6-2011.09
https://launchpad.net/gcc-linaro/+milestone/4.5-2011.09
Downloads are available from the Linaro GCC page on Launchpad:
https://launchpad.net/gcc-linaro
More information on the features and issues are available from the
release page:
https://launchpad.net/gcc-linaro/4.6/4.6-2011.09
https://launchpad.net/gcc-linaro/4.5/4.5-2011.09
Mailing list: http://lists.linaro.org/mailman/listinfo/linaro-toolchain
Bugs: https://bugs.launchpad.net/gcc-linaro/
Questions? https://ask.linaro.org/
Interested in commercial support? inquire at support(a)linaro.org
-- Michael
I tried to bootstrap current GCC trunk and our latest gcc-linaro-4.6
in profile guided, link time optimisation, and SMS modes. The results
are here:
https://wiki.linaro.org/MichaelHope/Sandbox/PGOLTOSMSStatus1
Short story: you can't bootstrap in LTO or PGO on ARM as they run out
of memory. i686 LTO is broken on trunk and gcc-linaro-4.6. SMS is
fine in general.
I'll run these once a week and keep an eye on them. Using -fwhopr instead
of -flto may help on ARM. I don't know why the PGO build runs out of
memory.
-- Michael
Måns pointed me at the IDCT throughput test that's included with
libav. I've written up a page on how to build and run it at:
https://wiki.linaro.org/MichaelHope/Sandbox/LibAvDCT
Included are results with and without the vectoriser. In all cases
the vectoriser improves things, including increasing the SIMPLE-C
version by 11 % and the peak by 17 %.
The coefficient of variation is low so the results are consistent. I
haven't investigated the benchmark itself to see if it's valid - we
could be vectorising the loop overhead instead of the IDCT itself.
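
For reference, the coefficient of variation is the standard deviation divided by the mean. A minimal sketch of the calculation follows; the sample timings are made-up numbers for illustration and are not results from the wiki page.

```python
import statistics


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the mean (dimensionless)."""
    return statistics.stdev(samples) / statistics.mean(samples)


# Made-up throughput figures from five hypothetical runs of one test.
runs = [100.0, 102.0, 98.0, 101.0, 99.0]
cv = coefficient_of_variation(runs)
```

A small value relative to the improvement being measured suggests run-to-run noise is not what is driving the difference.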
-- Michael
Please coordinate with Jon Masters at RedHat/Fedora and Adam Conrad at
Ubuntu/Debian on this. (Cc'ing the cross-distro list, through which the
recent ARM summit at Linux Plumbers was organized.)
Cheers,
- Michael
On Sep 16, 2011 8:41 AM, "David Gilbert" <david.gilbert(a)linaro.org> wrote:
> OK, so we seem to have agreement here that what we want is autodetect
> for eglibc and
> forget about the triplet; well technically that probably makes my life
> easier, and I don't
> think it's too hard a sell.
>
> Dave
* Linaro GCC
Spun 4.5 and 4.6 2011.09 GCC release tarballs. Uploaded them to
Michael's server, and kicked off the tests.
Continued work on my new constant optimization experiments. I now have
it tracking all the constants and am looking at how to detect the
optimization opportunities. So far it only calculates how expensive it
would be to generate a value by adding to an existing constant, which is
a start at least. I'm having difficulties detecting whether changing an
insn will make its parent (dependency-wise) obsolete or not, and
therefore whether to count its costs. There's no problem for
instructions that overwrite an entire register, but ones that write to
portions of registers (such as MOVT) make more complex dependency
chains, and the def-use chains don't seem to be sorted into the order of
use.
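
As an illustration of the "adding to an existing constant" idea, here is a rough sketch. It is not the experimental GCC code: it assumes ARM-mode data-processing immediates (an 8-bit value rotated right by an even amount), ignores Thumb-2 encodings, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def is_arm_immediate(value):
    """True if `value` is an 8-bit constant rotated right by an even number
    of bits within 32 bits (ARM-mode data-processing immediate)."""
    value &= MASK32
    for rot in range(0, 32, 2):
        # Rotating left by `rot` undoes a rotate-right by `rot`.
        rotated = ((value << rot) | (value >> (32 - rot))) & MASK32
        if rotated < 0x100:
            return True
    return False


def derive_from_existing(target, known_constants):
    """Find a known constant from which `target` can be made with a single
    ADD or SUB of an immediate. Returns (base, op, immediate) or None."""
    for base in known_constants:
        delta = (target - base) & MASK32
        if is_arm_immediate(delta):
            return base, "add", delta
        delta = (base - target) & MASK32
        if is_arm_immediate(delta):
            return base, "sub", delta
    return None


near = derive_from_existing(0x12345678, [0x12345600])
far = derive_from_existing(0x12345678, [0xDEADBEEF])
```

Here `near` finds a one-instruction derivation, while `far` finds none, in which case the value would have to be built some other way.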
* Other
Half day vacation on Thursday.
* Added testcases to Richard's micro benchmarks taken from libav.
* Discussed with Ayal the new version of the patch to support
instructions with REG_INC_NOTE in SMS, which causes a bootstrap failure.
I intend to debug the bootstrap failure in order to find its cause.
(http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01216.html)
== String routines ==
* Tidying up bits of cortex strings for the release process
* Nailing down the behaviour of config.sub and the config systems in
gcc, binutils and eglibc
== Other ==
* A discussion on synchronisation primitives on various CPUs that
started on the gcc list
- looking at http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
- pointing out the 64bit instructions
- asking why they used isb's when neither the kernel nor gcc use
them (answer: the DMBs should be fine as well, but there is some
debate over which is quicker; also, DMBs are converted to slower
dsb's on most A9s due to an erratum).
* Looking for docs on the non-core bits of current SoCs
* Extracting some denbench stats from a few months back for Ramana
About a day of non-Linaro IBM stuff.
Dave