Hi,
== pandaboard ==
* noticed that hw perf events are not working on 2.6.38-1001-linaro-omap
* it seems that the omap kernel has not configured its PMU properly
* perf_event_open syscall returns ENODEV
* started discussion with agreen (#744458)
* noticed that natty puts its glibc into a multilib path
* prevents linaro gcc (and upstream) from being built
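For anyone wanting to reproduce the ENODEV result from the perf item above, a minimal probe might look like the following. This is a sketch, not the test that was actually run: the function names `probe_hw_cycles` and `describe_probe` are made up for illustration, and the choice of the CPU cycles counter is an assumption.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/* Try to open a hardware cycle counter for the calling thread.
 * Returns 0 on success, otherwise the errno left by perf_event_open(). */
int probe_hw_cycles(void)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_HARDWARE;          /* ask for a PMU-backed event */
    attr.config = PERF_COUNT_HW_CPU_CYCLES;  /* assumption: cycles counter */
    attr.disabled = 1;
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;

    /* glibc provides no wrapper, so the call goes through syscall().
     * Arguments: attr, pid 0 (this thread), any cpu, no group, no flags. */
    long fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0UL);
    if (fd < 0)
        return errno;

    close((int)fd);
    return 0;
}

/* Map the probe result to a short human-readable diagnosis. */
const char *describe_probe(int err)
{
    switch (err) {
    case 0:      return "hardware counters available";
    case ENODEV: return "no usable PMU (ENODEV)";
    case ENOENT: return "event type not supported (ENOENT)";
    case EACCES:
    case EPERM:  return "blocked by permissions";
    default:     return "other error";
    }
}
```

Calling `describe_probe(probe_hw_cycles())` on an affected kernel would be expected to report the ENODEV case.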
== libunwind ==
* created a generic and local variant of the extbl parser
* ran the test suite a few times using different unwind methods
* started to look into the test suite failures
* started to fix a couple of the failures on ARM
Regards
Ken
RAG:
Red:
Amber:
Green: the aircon has been fixed; blessed quiet again
Current Milestones:
                             | Planned    | Estimate   | Actual     |
qemu-linaro 2011-04          | 2011-04-21 | 2011-04-21 |            |
Historical Milestones:
finish qemu-cont-integration | 2011-01-25 | 2011-01-25 | handed off |
first qemu-linaro release    | 2011-02-08 | 2011-02-08 | 2011-02-08 |
qemu-linaro 2011-03          | 2011-03-08 | 2011-03-08 | 2011-03-08 |
== maintain-beagle-models ==
* the board-ram-limits patchset has been expanded significantly to
address upstream suggestions; it now includes a lot of refactoring
of sun4m (sparc) board code to use the new generic max-ram
functionality instead of a sun4m-specific bit of code. Unfortunately
there is still some pushback upstream on the grounds that a simple
max-ram limit doesn't cater for complicated NUMA situations :-(
== merge-correctness-fixes ==
* working on moving implementation of VLD/VST "multiple structures" forms
into qemu helper functions; the current implementation is correct but
can expand to hundreds of TCG ops which is well beyond the maximum
permitted value, so could potentially overrun a TCG buffer
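To illustrate why a helper function suits these forms: the data movement of a single "multiple structures" load is a short loop in C, whereas generating it inline costs TCG ops for every element. The sketch below models only the semantics of VLD3.8 into three D registers. The type `vld3_regs` and the function `model_vld3_8` are hypothetical names; this is not QEMU's helper interface or CPU state layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for three 64-bit D registers, 8 byte lanes each. */
typedef struct {
    uint8_t d[3][8];
} vld3_regs;

/* Model of VLD3.8 {d0, d1, d2}, [mem]: read 24 consecutive bytes and
 * de-interleave them, so that element k of each 3-byte structure
 * ends up in register k. */
void model_vld3_8(vld3_regs *regs, const uint8_t *mem)
{
    for (size_t lane = 0; lane < 8; lane++) {
        for (size_t reg = 0; reg < 3; reg++) {
            regs->d[reg][lane] = mem[3 * lane + reg];
        }
    }
}
```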
== other ==
* wrote up some technical/engineering input into what we ought to be
doing with qemu next cycle
* review of a patch by Dmitry Eremin-Solenikov adding ARMv4/v4T support
* some review of s390 TCG patches (not because we have a direct interest
in s390 but as part of being a good citizen upstream)
* sent a pull request for some neon patches that had been on the list
a few weeks; hopefully this will help drain the patch pipeline
* meetings: toolchain, standup, pdsw-tools
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
Holiday: 22 Apr - 2 May
9-13 May: UDS, Budapest
(maybe) ~17-19 August: QEMU/KVM strand at LinuxCon NA, Vancouver
Hello,
* Submitted merge requests for SMS patch to gcc-linaro and gcc-linaro/4.6.
* Testing SMS patch which extends the current implementation to consider
  loops that contain instructions with REG_INC_NOTE.
* Filed PRs 48336 and 48380 for recent failures of trunk on ARM.
* Had a chat with Ramana about the DENbench benchmarks, directions and findings.
* Filed PR 745743 in linaro gcc-bugzilla
Thanks,
Revital
Hi,
* continued bringing patches upstream
- auto-detection of vector size - committed
- changing default vector size to 128 - submitted and testing the
final version
- if-conversion improvement - submitted and now testing the final version
* gcc-linaro-4.6
- submitted a merge request for store sink patch (this patch is
already upstream)
Ira
For reference. We know that the NEON intrinsics in GCC have issues.
I came across this page:
http://hilbert-space.de/?p=22
which has a colour to greyscale conversion done using intrinsics.
gcc-linaro-4.5-2011.03-0 does poorly because it saves intermediate
values on the stack. The core of the loop is:
.L3:
mov ip, r4
vld3.8 {d16-d18}, [r6]
vstmia r4, {d16-d18}
ldmia ip!, {r0, r1, r2, r3}
mov sl, r9
adds r7, r7, #1
adds r6, r6, #24
stmia sl!, {r0, r1, r2, r3}
fldd d16, [sp, #24]
fldd d18, [sp, #32]
ldmia ip, {r0, r1}
vmull.u8 q8, d16, d19
stmia sl, {r0, r1}
vmlal.u8 q8, d18, d20
fldd d18, [sp, #40]
vmlal.u8 q8, d18, d21
vshrn.i16 d16, q8, #8
vst1.8 {d16}, [r5]
adds r5, r5, #8
cmp r8, r7
bgt .L3
llvm-2.9~svn128540 does much better:
vld3.8 {d20, d21, d22}, [r1]!
add r3, r3, #1
cmp r3, r2
vmull.u8 q12, d21, d16
vmlal.u8 q12, d20, d17
vmlal.u8 q12, d22, d18
vshrn.i16 d19, q12, #8
vst1.8 {d19}, [r0]!
blt .LBB0_1
and may actually be better than the hand-written assembler on Nils's
page due to scheduling the loop comparison earlier.
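For readers without the page to hand, source of roughly the following shape would produce the vld3/vmull/vmlal/vshrn/vst1 sequence seen in both listings. This is a sketch rather than a copy of the code on that page: the function names are made up, and the weights 77/151/28 (which sum to 256) are an assumption. The scalar version is the portable reference; the intrinsics version is only compiled when NEON is available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Assumed channel weights, scaled by 256 so the result is acc >> 8. */
enum { W_R = 77, W_G = 151, W_B = 28 };

/* Portable reference: one grey byte per interleaved RGB triple. */
void grey_scalar(uint8_t *dst, const uint8_t *rgb, size_t pixels)
{
    for (size_t i = 0; i < pixels; i++) {
        /* Maximum is 255 * 256 = 65280, so 16 bits are enough. */
        uint16_t acc = (uint16_t)(rgb[3 * i] * W_R +
                                  rgb[3 * i + 1] * W_G +
                                  rgb[3 * i + 2] * W_B);
        dst[i] = (uint8_t)(acc >> 8);
    }
}

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
/* Intrinsics version: 8 pixels per iteration; pixels must be a
 * multiple of 8. */
void grey_neon(uint8_t *dst, const uint8_t *rgb, size_t pixels)
{
    const uint8x8_t wr = vdup_n_u8(W_R);
    const uint8x8_t wg = vdup_n_u8(W_G);
    const uint8x8_t wb = vdup_n_u8(W_B);

    assert(pixels % 8 == 0);
    for (size_t i = 0; i < pixels; i += 8) {
        uint8x8x3_t px = vld3_u8(rgb + 3 * i);      /* vld3.8  */
        uint16x8_t acc = vmull_u8(px.val[0], wr);   /* vmull.u8 */
        acc = vmlal_u8(acc, px.val[1], wg);         /* vmlal.u8 */
        acc = vmlal_u8(acc, px.val[2], wb);         /* vmlal.u8 */
        vst1_u8(dst + i, vshrn_n_u16(acc, 8));      /* vshrn + vst1 */
    }
}
#endif
```

Nothing in the loop body needs more than the three loaded D registers, the three weights and one Q accumulator, so no stack traffic should be required.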
Richard S, were you looking into this?
-- Michael
Hi there. A reminder that today's call has shifted due to the
European daylight savings change. It's now at 0800 UTC which is 9 am
in the UK, 10 am in central Europe, and 10 am in Israel.
-- Michael
== Last week ==
* PR46934: Thumb-1 ICE, small fix in the "casesi" jump-table expand
code. Quickly approved and committed upstream.
* Enhance XOR patch for gcc/simplify-rtx.c. Updated comments and
committed upstream.
* PR48250 / CS Issue #9845 / Launchpad #723185. Unaligned DImode reload
under NEON. Submitted patch upstream, but still need to do some more
verification that older pre-ARMv5TE cases are safe. Should complete this
week.
* Working on a type of ICE seen currently on upstream trunk, a few
testcases failing under '-O3 -g'. It seems VTA related, but also might
have something to do with register elimination not fully done for
(var_location (entry_value ...)) expressions, leaving [afp+#num] memory
addresses existing in debug insns after reload. Still investigating.
* Launchpad #689887, ICE in get_arm_condition_code(). Pushed a merge
request to Linaro 4.5 for this patch. LP #742961 has also appeared as
another case of this ICE.
* Still working on (what I think should be) the last of the CoreMark
ARMv6 regressions. The problem is to combine uxtb+cmp into ands #255.
This could be done by adding (set (cc) (compare (zero_extend...)))
patterns, implemented by ands assembly, but I am still looking at whether
this can be done (probably more elegantly) by something like
CANONICALIZE_COMPARISON (replacing compare operands) in the ARM backend.
* Launchpad #736007, ICE immed_double_const under -mfpu=neon -g. Some
discussion on gcc-patches about this, still unclear on what should be
done...
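To make the uxtb+cmp item above concrete, the comparison in question has roughly the shape below. The actual CoreMark code is not reproduced here, so both functions are illustrative assumptions; they compute the same result, and the second form is the one that maps directly onto a flag-setting AND with #255.

```c
#include <stdint.h>

/* Low-byte test written as a narrowing cast: a compiler may emit
 * uxtb to zero-extend the byte, followed by a separate cmp #0. */
int low_byte_is_zero_cast(uint32_t x)
{
    return (uint8_t)x == 0;
}

/* The same test written as a mask: a single "ands rX, rY, #255"
 * both extracts the byte and sets the condition flags. */
int low_byte_is_zero_mask(uint32_t x)
{
    return (x & 255u) == 0;
}
```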
== This week ==
* Push forward on above issues.
Committed Dan's RVCT interoperation patch, both upstream and to Linaro
GCC 4.6.
Adjusted Bernd's "Discourage NEON on Cortex-A8" patch following Richard
Earnshaw's comments, and reposted upstream. The new version was
approved, and committed. I've also submitted a merge proposal to Linaro
GCC 4.6.
Dropped Tom's patch for marking small strings read-only. This
optimization seems to have no visible effect for ARM in GCC 4.6. I'll
leave it to Tom to forward-port, if it's still meaningful for MIPS.
Julian has committed the patch for lp:675347, so I've submitted merge
requests to both Linaro GCC 4.5 and 4.6.
Bernd has posted the shrink wrapping patches upstream. I've posted this
info in all the relevant Linaro tracking tickets.
Talked Revital Eres through the Bazaar/Launchpad merge request system.
Tried to understand why GCC 4.6 does not use multiply-and-accumulate
efficiently, when used with 64-bit values. It seems that the compiler
sometimes uses (subreg:SI (reg:DI ...)) and sometimes just uses a plain
(reg:SI ..) and those don't combine to give useful patterns, but I
haven't got to the bottom of it yet.
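For reference, the kind of source involved looks like the sketch below: a 32x32->64 signed multiply accumulated into a 64-bit value, which is the operation the ARM SMLAL instruction performs. The exact testcase is not given above, so the function names and the dot-product loop are assumptions for illustration.

```c
#include <stdint.h>

/* 64-bit accumulate of a widening 32x32 multiply. The cast on one
 * operand makes the multiplication itself 64-bit. */
int64_t mac64(int64_t acc, int32_t a, int32_t b)
{
    return acc + (int64_t)a * b;
}

/* A typical use: a dot product with a 64-bit running sum. */
int64_t dot64(const int32_t *x, const int32_t *y, int n)
{
    int64_t sum = 0;

    for (int i = 0; i < n; i++) {
        sum = mac64(sum, x[i], y[i]);
    }
    return sum;
}
```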
Tested an FSF GCC 4.6 snapshot from the 23rd. All well, so I've merged
it to the Linaro GCC 4.6 branch.
* Future Absence
Away Monday 28th to Friday 1st April.
----
Upstream patches requiring review:
* Thumb2 constants:
http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html
* ARM EABI half-precision functions
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00874.html
* ARM Thumb2 Spill Likely tweak
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00880.html
* NEON scheduling patch
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html