Hi there. We've had an earthquake. Family and friends are fine but I'll be
unavailable for a few days. Services on ex.seabright.co.nz are down. I'll
cancel Wednesday's standup call.
See you soon,
-- Michael
== GDB ==
* Working with Will Deacon, identified the root cause of GDB
problems running on Versatile Express in SMP mode, and
verified that the errata workaround fixes the problem
* Finished testing GDB HW watchpoints patch on vexpress,
submitted complete patch set for mainline inclusion
* Reviewed Yao's mainline patch to enable displaced
stepping in Thumb mode
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martin Jetter | Management: Dirk
Wittkopp
Registered office: Böblingen | Court of registration: Amtsgericht
Stuttgart, HRB 243294
== Last week ==
* PR46178, PR46002: both upstream issues related to the priority
coloring mode of IRA. Both patches submitted, the first already approved
and committed. Vladimir M. did mention that the priority algorithm
would be removed once his newer "cover class-less" patches go in
during stage1. Anyway, I got more familiar with IRA during the process,
and the patches will still be applicable to 4.5/4.6.
* PR43872: incorrectly aligned VLAs under ARM. This turned out to be a
one-liner fix. Submitted upstream, awaiting approval.
* Discussed on email/IRC with Revital Eres on SMS and ARM doloop pattern
issues.
* Launchpad #721021: Linaro GCC ICE under -mtune=xscale. Investigated a
bit; did not see an ICE immediately, but GCC went into an infinite loop (Khem
Raj, the reporter, says it runs for a while then ICEs).
* Coremark ARMv5TE vs ARMv7-A performance regression: reproduced
consistently using our own Tegra boards. Investigated and seem to have
found something; will post more detailed findings later.
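To illustrate the VLA alignment item (PR43872) above, here is a minimal C
sketch of the kind of check one could run. It is not the testcase from the
PR, and the function name vla_is_aligned is hypothetical. It allocates a
variable-length array of doubles and reports whether the start address meets
the alignment the compiler itself states for double.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the PR: returns 1 if a VLA of
 * n doubles (n > 0) starts at an address suitably aligned for double. */
int vla_is_aligned(size_t n)
{
    volatile double vla[n];   /* volatile keeps the array from being optimized away */

    vla[0] = 0.0;
    return ((uintptr_t)&vla[0] % _Alignof(double)) == 0;
}
```

A correct compiler should make this return 1 for every n > 0.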
== This week ==
* Coremark investigation.
* More GCC issues.
== GCC ==
Posted two of our 4.5 patches upstream.
My latest 4.6 build and test completed, so I've pushed an update to the
bzr branch. The branch is now up to mainline state as of the 12th.
Merged three 4.5 patches into Linaro GCC 4.6. Upstream review isn't
happening, so I've decided to commit them anyway. The last upload (FSF
mainline as of 12th Feb) will therefore become the baseline I'm going to
use for Linaro GCC 4.6.
Began benchmarking the questionable patches before forward porting them,
using EEMBC. Michael Hope has given me access to one of his A9 Panda
boards in New Zealand. This ought to have been straightforward, but of
course it wasn't. It took me a while to convince myself I was getting
meaningful results and testing the right thing. Also the A9 seemed to be
able to complete the configured iterations in 'zero' time, which fooled
me for a while. I think I now have a setup that works. It seems to run
very slowly sometimes though - something to do with SSH?
----
Upstream patches requiring review:
* Thumb2 constants:
http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html
* Kazu's VFP testcases:
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00128.html
* Jie's thumb2 testcase fix:
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00670.html
* ARM EABI half-precision functions:
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00874.html
* ARM Thumb2 Spill Likely tweak:
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg00880.html
RAG:
Red:
Amber:
Green: DATE/QEMU conference place confirmed, travel booked
Current Milestones:
                             | Planned    | Estimate   | Actual     |
qemu-linaro 2011-03          | 2011-03-08 | 2011-03-08 |            |
Historical Milestones:
finish virtio-system         | 2010-08-27 | postponed  |            |
finish testing PCI patches   | 2010-10-01 | 2010-10-22 | 2010-10-18 |
successful ARM qemu pull req | 2010-12-16 | 2010-12-16 | 2010-12-16 |
finish qemu-cont-integration | 2011-01-25 | 2011-01-25 | handed off |
first qemu-linaro release    | 2011-02-08 | 2011-02-08 | 2011-02-08 |
* maintain-beagle-models:
+ implemented missing epoll syscalls for qemu usermode,
submitted upstream
https://bugs.launchpad.net/qemu-linaro/+bug/644961
+ tracked down the problem causing serial console to break:
the new Linux driver uses some extra features of the UART
which we weren't modelling
https://bugs.launchpad.net/qemu-linaro/+bug/714600
* merge-correctness-fixes:
+ reworked VZIP/VUZP patch as per review comments, resubmitted
+ reviewed CL's latest shift patches, added fixes of my own for
large shift counts and overlapping src/dest regs, submitted
a 10 patch rolled up series
+ reviewed a patch for adding cp15 VA-PA translation ops
+ reviewed various versions of vrecpe/vsqrte patches from CL
* versatile-express model:
B Labs kindly made available their Versatile Express board model:
https://github.com/bbalban/qemu/commits/universal-branch
and I've spent a few days getting it to boot a Linaro kernel,
fixing a few bugs and cleaning up the patchset in preparation
for upstreaming it.
This included discovering a bug in qemu's SD card model which
was causing Linux not to be able to detect cards on PL181,
and resulting in spurious qemu warnings on omap3:
https://bugs.launchpad.net/qemu-linaro/+bug/714606
* other:
+ ARM architecture Q&A for modelling engineers
+ booked travel/hotel for QEMU conference
* meetings: toolchain, PDSW-tools, PD comms, Linaro-in-ARM network
infrastructure, pdsw-doughnuts and 1st birthday celebration
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
17/18 March: QEMU Users Forum, Grenoble
Holiday: 22 Apr - 2 May
9-13 May: UDS, Budapest
(maybe) ~17-19 August: QEMU/KVM strand at LinuxCon NA, Vancouver
Hi,
* continued to look into latrace and found an issue in case a dynamic
library gets unloaded. Otherwise latrace looks quite good on ARM.
https://wiki.linaro.org/KenWerner/Sandbox/latrace
* chasing bugs:
- After a lot of testing Andy Green has made a big step forward in
finding the root cause for the shut-down issue of my PandaBoard.
The PMIC is seeing an overcurrent and issues an interrupt that gets
ignored by current kernels. Then the PMIC shuts the board down for
safety reasons. As a workaround Andy has made a kernel patch for the
twl6030 driver that enables all interrupt sources. The kernel will
acknowledge the overcurrent reported by the PMIC and the board survives.
A patched kernel binary can be found at:
https://wiki.linaro.org/KenWerner/Sandbox/708883
- While testing Andy's patches on the Linaro natty kernels I ran into
https://bugs.launchpad.net/bugs/720055
- The flash-kernel utility doesn't work on the PandaBoard because the
subarch check expects omap4 instead of omap:
https://bugs.launchpad.net/bugs/721147
- Looked into the apr failure (process-shared mutexes fail on armel v7).
Their mutex functionality can be mapped to various methods, but only
pthread is of interest here. The code relies on pthread_mutex_lock and
pthread_mutex_trylock, which are implemented by (e)glibc. The C library
uses GCC's __sync primitives if eglibc >= 2.12.1-0ubuntu11 and GCC >= 4.5.
The testprocmutex testcase passes now.
https://bugs.launchpad.net/bugs/604753
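For reference, the pthread method mentioned in the apr item comes down to a
mutex marked PTHREAD_PROCESS_SHARED and placed in memory that both processes
map. The following is a minimal sketch of that pattern in plain POSIX C. It
is not apr's code or its testprocmutex test, and the names shared_state and
run_shared_counter are hypothetical.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std= modes */
#include <pthread.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical layout: a mutex and the counter it protects, both living
 * in memory shared between parent and child. */
struct shared_state {
    pthread_mutex_t lock;
    long counter;
};

/* Parent and child each add `iterations` to the counter under the lock.
 * Returns the final counter value, or -1 on any failure. */
long run_shared_counter(long iterations)
{
    struct shared_state *st = mmap(NULL, sizeof *st, PROT_READ | PROT_WRITE,
                                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (st == MAP_FAILED)
        return -1;

    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0) {
        munmap(st, sizeof *st);
        return -1;
    }
    /* Without PTHREAD_PROCESS_SHARED the mutex is only valid within one process. */
    if (pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) != 0 ||
        pthread_mutex_init(&st->lock, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        munmap(st, sizeof *st);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);
    st->counter = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(st, sizeof *st);
        return -1;
    }

    /* Both processes run this loop against the same mutex and counter. */
    for (long i = 0; i < iterations; i++) {
        pthread_mutex_lock(&st->lock);
        st->counter++;
        pthread_mutex_unlock(&st->lock);
    }

    if (pid == 0)
        _exit(0);                /* child is done; skip atexit handlers */

    int status;
    long result = -1;
    if (waitpid(pid, &status, 0) == pid)
        result = st->counter;

    pthread_mutex_destroy(&st->lock);
    munmap(st, sizeof *st);
    return result;
}
```

If the locking works across processes, run_shared_counter(n) returns 2 * n;
a smaller value would indicate lost updates. Older toolchains may need
-pthread when linking.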
Regards
Ken
"Will Deacon" <will.deacon(a)arm.com> wrote on 02/16/2011 01:07:09 PM:
> > I've now built a kernel with CONFIG_ARM_ERRATA_720789 enabled, and the
> > symptoms indeed seem to have disappeared completely ...
>
> Yup - that's because without it, invalidating a TLB entry for a particular
> process isn't broadcast correctly, so you can end up using the old (pre-COW)
> mappings if you're running on a different core.
OK. So I guess the only remaining question is: if this hardware needs the
errata fix to work properly, shouldn't it be automatically selected by the
kernel configure logic? Note that this appears to happen for certain OMAP
boards, see arch/arm/mach-omap2/Kconfig:
config ARCH_OMAP4
        bool "TI OMAP4"
        default y
        depends on ARCH_OMAP2PLUS
        select CPU_V7
        select ARM_GIC
        select PL310_ERRATA_588369
        select ARM_ERRATA_720789 <<=====
        select USB_ARCH_HAS_EHCI
But this does not happen for the vexpress; arch/arm/mach-vexpress/Kconfig
has only:
config ARCH_VEXPRESS_CA9X4
        bool "Versatile Express Cortex-A9x4 tile"
        select CPU_V7
        select ARM_GIC
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
Hello,
* Continued looking into DENBench benchmarks.
* While testing SMS I realized that my current implementation of the doloop
pattern for ARM does not follow SMS's requirement that the doloop
instructions be decoupled from the loop's other instructions. This happens
because doloop uses the CC register, which might be used elsewhere in the loop.
I am looking into a solution for that.
Thanks,
Revital
Hi,
This week I looked into DENBench:
* sad8_c (hot function from mp4encode) needs SLP reduction, but it
also contains a cond_expr, which cannot be vectorized as a reduction, so I
don't think there is anything I can do here
* fdct_int32 (another hot function from mp4encode) now gets vectorized
with vzip/vuzp patch, but the vectorization causes performance
degradation here because of multiple register spills. I also noticed
that vectorizer costs are not set for NEON, i.e., it uses the default
costs. So, I am now working on costs for NEON and adding register
considerations to the vectorizer's cost model.
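To make the sad8_c item above concrete, here is a rough C sketch of the
shape of computation involved: a sum of absolute differences over an 8x8
block, where each term added to the accumulator is a conditional
expression. This is not the DENBench source. The name sad8_sketch and the
loop structure are assumptions for illustration only.

```c
#include <stdint.h>

/* Hypothetical stand-in for an 8x8 SAD kernel: rows are `stride` bytes
 * apart, and the result is the sum of |cur - ref| over all 64 pixels. */
uint32_t sad8_sketch(const uint8_t *cur, const uint8_t *ref, uint32_t stride)
{
    uint32_t sad = 0;

    for (uint32_t row = 0; row < 8; row++) {
        for (uint32_t col = 0; col < 8; col++) {
            int d = cur[col] - ref[col];
            /* The conditional sits inside the reduction on `sad`. */
            sad += (uint32_t)(d < 0 ? -d : d);
        }
        cur += stride;
        ref += stride;
    }
    return sad;
}
```

For two tightly packed 8x8 blocks (stride 8) filled with the constants 10
and 7, every pixel differs by 3, so the result is 64 * 3 = 192.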
I also did some general vectorization research, checking opportunities
for collaboration with the GRAPHITE pass and auto-parallelization.
Ira