Summary:
* Patch linaro crosstool-ng.
* Windows install package
Details:
* Patch linaro crosstool-ng:
* Back port upstream patches.
* Check-in the zlib/libiconv/expat/ncurses related patches to linaro branch.
* Create reference windows install package for linaro toolchain from
installjammer. The install process works well on Win7.
Plans:
* Investigate test on Windows.
Best regards!
-Zhenqiang
Hi,
OpenEmbedded:
* started on creating a recipe to compile the "core-image-minimal"
using an external prebuilt toolchain (csl arm-2011.03)
* there are still a lot of warnings at the do_package/do_package_qa task
* the good news is that the build process finishes and kernel plus root
file system image gets created
* the bad news is that the rootfs lacks some important libs like libc
and therefore won't run under qemu-system-arm
(since init, busybox, etc. are dynamically linked)
* currently a 3-line hack on oe-core is required to be able to
override a task of the generic glibc recipe; all other files could go
into a separate layer
Linaro Android:
* had a quick look into the EABI attribute tag issue
Regards
Ken
== String routines ==
* Sent updated memchr to the eglibc list
== 64 bit atomics ==
* Ran a set of timing consistency tests that a colleague had sent me
while I was off; Panda passed those, so time doesn't appear to be
going backwards or anything, so that's not the problem with membase.
* Pushed the code into linaro-gcc.
== QEmu ==
* Tested Peter's prerelease - all good.
* Started looking at the issues for running in TCG mode on ARM
== Other ==
* Read through the ARMv8 instruction docs that landed on arm.com;
quite interesting. Note that multiple-instruction IT blocks are
listed as deprecated for 32-bit mode on v8 (they will still work,
but the core can be put in a mode that faults on them, to make it
easy to find the uses).
* Some debugging of the odd Panda timing issue with Paul McKenney.
Dave
RAG:
Red:
Amber:
Green:
Current Milestones:
|| || Planned || Estimate || Actual ||
||upstream-omap3-cleanup || 2011-11-10 || 2011-12-15 || ||
||cp15-rework || 2012-01-06 || 2012-01-06 || ||
||initial-a15-system-model || 2012-01-27 || 2012-01-27 || ||
||qemu-kvm-getting-started || 2012-03-04?|| 2012-03-04?|| ||
(for blueprint definitions: https://wiki.linaro.org/PeterMaydell/QemuKVM)
Historical Milestones:
||add-omap3-networking || 2011-10-13 || 2011-10-13 || 2011-10-13 ||
||a15-systemmode-planning || 2011-10-13 || 2011-10-13 || 2011-09-22 ||
||a15-usermode-support || 2011-11-10 || 2011-11-10 || 2011-10-27 ||
== qemu-kvm-getting-started ==
* now reasonably set up to run KVM under Fast Model; howto is here:
https://wiki.linaro.org/PeterMaydell/A15OnFastModels
* rebased kvm patches into qemu-linaro
* fixed bug where we weren't passing cpu number to kvm properly
when delivering an interrupt
* sent some minor patches to upstream qemu that will be needed for
kvm (eg configure script tweaks)
== initial-a15-system-model ==
* started on cleaning up a9/11mpcore private peripheral implementation;
now mostly done and looking much better as a base for a15
== other ==
* preparation for qemu-linaro release (rolled tarball, tested)
* submitted patch to fix buffer overrun in GIC model
* discussion: linux-user mode race conditions, and in particular
how we should handle signals that arrive during syscall emulation
* upstream patch review: imx31 round 3
-- PMM
== GDB ==
* Completed new set of patches to support both "info proc" and
core file generation across the remote protocol, and posted
them to the mailing list for review.
* Tested GDB trunk in preparation for 7.4 release branch point
on multiple platforms; analyzed and fixed a couple of problems,
some also present on ARM in remote testing. Patches checked
in to mainline.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martin Jetter | Management: Dirk
Wittkopp
Registered office: Böblingen | Court of registration: Amtsgericht
Stuttgart, HRB 243294
== This week ==
* More on -fsched-pressure. Testing on POWER7 showed a degenerate case
that I'd failed to handle well. Fixed that. Saw that part of the
problem on POWER7 was that IRA was using a combination of GENERAL_REGS
and CR_REGS as a single pressure class, so there appeared to be 39
registers available for storing integers. Fixed (or worked around) that.
Tweaked a few other things too. The only denbench result that I
wasn't happy with was RSA, where both forms of -fsched-pressure are
significantly worse than -fno-sched-pressure. Tracked down the cause
of that. We had a block BB1:
A: (set (reg:DI X) Y)
B: (clobber (reg:DI Z))
C: (set (subreg:SI (reg:DI Z) 0) (... X ...))
D: (set (subreg:SI (reg:DI Z) 4) (...))
where B makes sure that Z is treated as dead before C. Interblock
motion causes B to be scheduled in an earlier block, but none of
the other instructions can be. This means that, when we schedule BB1,
it still contains A, C and D, and Z now appears to be live on entry to
the block. C therefore appears to reduce register pressure, because
it contains the last use of X, and appears to leave Z's liveness
unaffected. In reality it should be treated as increasing register
pressure by 1 (-1 for the death of X, +2 for the birth of Z).
I "fixed" this by moving C's dependencies to B, a bit like we do
for scheduling groups (although none of the other handling of
scheduling groups should apply). This made a big difference,
so that the new code is a win on RSA.
There's still one SPEC2006 degradation on POWER7 that I want
to look at.
* Caught up on a lot of mail. gcc-patches backlog has gone down
from ~4900 when I got back to ~500.
* Briefly looked at x86's drap support, to see what would be needed
for ARM. Didn't look for long though: the overhead seems excessive
for optional alignment, and the agreement seemed to be that 128-bit
alignment wouldn't really make much of a difference anyway.
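The register-pressure accounting for instruction C in the RSA case above can be written out as a small worked example. This is only a sketch of the arithmetic in the report, not GCC's scheduler code: `pressure_delta` is a hypothetical helper, and the weights (1 for X, 2 for Z) are simply the figures quoted in the report.

```python
def pressure_delta(deaths, births):
    """Net pressure change of one instruction: registers born minus registers dying.

    Both arguments map a pseudo register name to the number of hard
    registers it is counted as (weights taken from the report, not computed).
    """
    return sum(births.values()) - sum(deaths.values())


# View of C once the clobber B has moved to an earlier block:
# Z looks live on entry to BB1, so C appears to give birth to nothing.
apparent = pressure_delta(deaths={"X": 1}, births={})

# Intended view: C holds the last use of X and is the real birth of Z.
actual = pressure_delta(deaths={"X": 1}, births={"Z": 2})

print(apparent, actual)
```

The gap between the two values is what made C look attractive to schedule early when it was in fact increasing pressure.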
Richard
Hi!
* Continued with running eembc, coremark, denbench and spec2k on the ursas
with the latest of the Linaro and FSF series. The variants used were
o3-neon and o3-neon-novect. Something went wrong with the variants the
first time, so I had to rerun the tests once.
Discussed the draft report with Michael; next week I will share it with the
rest of the team.
* Reran the SPEC2K runs with the "train" and "ref" data sets. I did -O2 and
-O3 runs on a panda with the two data sets. Asked for a sanity check of the
numbers.
* Prepared and held a presentation about the TCWG internally.
* Will be tied up with internal work for most of w49.
Best regards
Åsa
Hi,
- Ran eon with gcc 4.7: there are many more loops similar to the one
in lp#831094 that get vectorized (due to some data ref analysis
improvement), so the impact of disabling peeling for such loops (i.e.
loops with low loop bound) is even bigger than for 4.6, and
vectorization improves the performance by 2.5%.
I prefer to understand the peeling/alignment situation better and not
just commit this patch (and I spent some time trying to do that).
- Fixed PR 51301 - a bug in over-promotion pattern. Proposed for merge
to gcc-linaro-4.6.
- Merged the last SLP patch to gcc-linaro-4.6.
Ira
This email is just a quick summary of what we (Linaro) are
planning in the way of QEMU work to support KVM on ARM Cortex-A15.
The idea is to let people know what's coming up, find out if we've
forgotten anything, and avoid people duplicating work unnecessarily.
Most of this is based on a useful session at the recent 'ARM server
mini-summit' in Orlando (UDS/Linaro Connect) at the beginning of
this month.
The work we're currently proposing to do falls into three parts:
* refactor QEMU's cp15 register handling
At the moment QEMU handles cp15 accesses by calling out to a single
helper function which is an enormous set of nested switch statements
to handle the different coprocessor registers. Access permissions are
checked separately at translate time. This design makes specifying
board-dependent or cpu-dependent registers somewhat painful; it's also
easy for the access permission checks to be out of sync. There is no
support for banked cp15 registers either (needed for trustzone and
virtualisation). We need a better design which lets a board or core
register handler routines for cp15 registers. This will make the code
cleaner and more maintainable as a base for new features.
This isn't strictly a requirement for KVM, but we're going to want
KVM to be able to hand off cp15 accesses to QEMU, and I don't think
that's going to be maintainable or reliable without this refactoring.
(https://blueprints.launchpad.net/qemu-linaro/+spec/cp15-rework)
* A15 system model
Basically a QEMU model of a Versatile-Express with a Cortex-A15
minus the virtualization and LPAE extensions. This needs the
A15 private peripherals (just the GIC in the right place in
the memory map, really; generic timer not required) and the
new memory map version of the vexpress board model, plus some
new cp15 registers. (Bill Carson has already done some patches
in this area but they need a little rework and may have minor
missing pieces.)
https://blueprints.launchpad.net/qemu-linaro/+spec/initial-a15-system-model
* miscellaneous integration work
We're aiming for a reasonable working prototype of A15 guest on
an A15 Fast Model host here; we need to fix at least some of
the bugs which currently mean upstream QEMU doesn't work on ARM hosts,
sort out which kernel and qemu trees we are developing from, and
get things running in our validation lab's continuous integration
setup.
https://blueprints.launchpad.net/qemu-linaro/+spec/qemu-kvm-getting-started
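To make the cp15 refactoring idea above a little more concrete, here is a minimal sketch of a table-driven design in which a board or core registers handler routines per register, and the access permission lives next to the handler so the two cannot drift apart. It is an illustration only, not QEMU code: `Cp15Table`, `Cp15Reg`, `UndefinedAccess`, the privilege encoding and the two example registers are all hypothetical.

```python
from typing import Callable, Dict, NamedTuple, Optional, Tuple

# A cp15 register is addressed by (crn, opc1, crm, opc2).
Key = Tuple[int, int, int, int]


class UndefinedAccess(Exception):
    """Raised where a real model would take an undefined-instruction trap."""


class Cp15Reg(NamedTuple):
    name: str
    min_privilege: int  # hypothetical encoding: 0 = user, 1 = privileged
    read: Callable[[dict], int]
    write: Optional[Callable[[dict, int], None]] = None  # None means read-only


class Cp15Table:
    def __init__(self) -> None:
        self._regs: Dict[Key, Cp15Reg] = {}

    def register(self, key: Key, reg: Cp15Reg) -> None:
        # Boards and cores call this instead of editing one big switch.
        if key in self._regs:
            raise ValueError(f"cp15 register {key} already registered")
        self._regs[key] = reg

    def _lookup(self, key: Key, privilege: int) -> Cp15Reg:
        reg = self._regs.get(key)
        if reg is None or privilege < reg.min_privilege:
            raise UndefinedAccess(key)
        return reg

    def read(self, state: dict, key: Key, privilege: int) -> int:
        return self._lookup(key, privilege).read(state)

    def write(self, state: dict, key: Key, privilege: int, value: int) -> None:
        reg = self._lookup(key, privilege)
        if reg.write is None:
            raise UndefinedAccess(key)
        reg.write(state, value)


# Example registrations with placeholder values.
state = {"id": 0x1234, "scratch": 0}
table = Cp15Table()
table.register((0, 0, 0, 0), Cp15Reg("ID", 1, lambda s: s["id"]))
table.register(
    (13, 0, 0, 2),
    Cp15Reg("SCRATCH", 0, lambda s: s["scratch"], lambda s, v: s.update(scratch=v)),
)
```

With this shape, banked registers could be handled by having the read and write callbacks select a bank from `state`, though that is beyond what the sketch shows.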
Also on the radar is a fourth piece of work:
* QEMU virtio-mmio support
This is adding support for the 'mmio' virtio transport, which will
allow virtio support in a versatile-express model. We're going to
need this at some point but the current thought is that we want
to do the above listed more important bits of work first...
(The exception would probably be if it turned out that this was
sufficiently useful for making early KVM development easier)
https://blueprints.launchpad.net/qemu-linaro/+spec/add-amba-virtio-support
So, questions:
(1) did we forget something important?
(2) is anybody else already planning to do any of this (or would
like to start)? if so we should coordinate...
(3) is there anything that the kernel folk need/want earlier
rather than later?
thanks
-- PMM
Hi,
Now that upstream trunk is in stage3 and we have a few patches that
won't really make it upstream until stage1 reopens, is it
worthwhile having a new status for merge requests that moves them
into a to_upstream state? The other option is to have a common
spreadsheet that we keep updating with links to merge requests that
need to be upstreamed.
Thoughts?
Ramana
PS - Any clue on what's happening with the branch diff bug that's been
open in launchpad forever now?
Hi,
* Worked on peeling problem in eon (#831094). Wrote a patch that
checks if the number of vector iterations is going to be more than 2,
and disables peeling otherwise. With this patch I see about 1.5%
regression with vectorization (and about 7% without it).
* I am thinking of extending the patch to handle an unknown number of
iterations by creating a run-time check. The threshold could be set by a
param. Another option could be doing it through the cost model, but it's
hard to evaluate costs when misalignments are unknown (and, I think,
the cost model handles known misalignment properly).
* Disabling peeling for low loop bounds also helps with one of EEMBC
benchmarks, for which vectorization with double-words is more
beneficial than with quad-words. It turns out that we are able to
force the alignment for double-words (and, therefore, avoid peeling),
because we check that the required alignment (64 in this case) is less
than or equal to BIGGEST_ALIGNMENT, where
arm.h:#define BIGGEST_ALIGNMENT (ARM_DOUBLEWORD_ALIGN ? DOUBLEWORD_ALIGNMENT : 32)
and
arm.h:#define DOUBLEWORD_ALIGNMENT 64
So, we can never force alignment for 128 bits on ARM. I wonder if
that's a real limitation.
* Proposed three SLP patches to gcc-linaro, and merged two of them.
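The two checks discussed in this report (the peeling threshold and the alignment limit) can be sketched as follows. This is a simplified model rather than the vectorizer's code: the function names are hypothetical, the value 64 assumes ARM_DOUBLEWORD_ALIGN is set, and the iteration count ignores the scalar iterations that peeling itself would consume.

```python
# DOUBLEWORD_ALIGNMENT from arm.h; assumes ARM_DOUBLEWORD_ALIGN is true.
BIGGEST_ALIGNMENT_BITS = 64


def can_force_alignment(required_bits, biggest_bits=BIGGEST_ALIGNMENT_BITS):
    """Alignment can be forced only if it does not exceed BIGGEST_ALIGNMENT."""
    return required_bits <= biggest_bits


def should_peel(scalar_iterations, vector_factor, min_vector_iterations=2):
    """Allow peeling only when more than `min_vector_iterations` vector
    iterations are expected; otherwise peeling is disabled."""
    return scalar_iterations // vector_factor > min_vector_iterations


print(can_force_alignment(64))    # double-word vectors
print(can_force_alignment(128))   # quad-word vectors
print(should_peel(8, 4))          # low loop bound
print(should_peel(100, 4))
```

For an unknown iteration count, the same comparison would have to be emitted as a run-time check, with `min_vector_iterations` playing the role of the param mentioned above.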
Ira
Addressing the comments received from Richard and Ayal regarding the
patch to estimate register pressure.
Testing the patch on eembc and libav micro benchmarks.
Looking at the regressions seen with SMS.
== GDB ==
* Ongoing work on support for cross-platform core file generation.
Posted a new design proposal to the mailing list to include not
only "info proc mappings", but *all* "info proc" commands. This
would involve a remote protocol command to read arbitrary proc
files, instead of a specific command to retrieve the memory map.
* Investigated Launchpad bug:
#891970 msp430-gdb segmentation fault with target remote
== GCC ==
* Patch review week.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
Worked on adding support for 64-bit NEON integer shifts. I have this
working now, although I'm still not very happy about how the register
allocator chooses which mode to use - it prefers core-registers if the
values start or end in core-regs, even though moving the values to NEON
registers might be more efficient (general 64-bit shifts in core
registers require several instructions). I've also had to mark the CC
register clobbered in all cases, even though it only gets clobbered in
some of them, which might be necessary, but isn't very satisfactory.
The NEON shifts work showed that 32->64 bit extends could be done better
also. This hasn't been a great problem up to now, but the shift amount
(in particular) is typically a 32-bit value and yet needs to be
zero-extended to 64-bit for NEON's purposes. Right now, GCC prefers to
extend the value in core-registers, and then copy it to NEON. This
works, but burns another core-register - a scarce commodity - so I think
it would be better to copy it first, and then extend it after. NEON has
instructions for this, so I'm investigating how to get the compiler to
do it (this is all strictly post-combine, so the usual options are out,
and the register allocator has to be allowed to do it the old way in the
case where core-regs really are the best option, so it's tricky).
Summary:
* Upstream crosstool-ng patches.
* Create windows install package from installjammer.
* Investigate link issues.
Details:
* crosstool-ng patches.
* Patches for newlib extra config, gdb extra config, pch, nls option
are committed to crosstool-NG upstream.
* The dependent library patches are in discussion.
* Learn installjammer and integrate it into the scripts to create the windows
install package.
* Investigate warning message from link when linking the prebuilt zlib
for mingw32 host.
It might be OK with static link, but might fail with dynamic link on windows.
For i586-mingw32[msvc] host, lots of messages like
libtool: link: Could not determine host path corresponding to ...
For i386-mingw32 host: In addition to the message in i586-mingw32
build, output the following message
*** Warning: linker path does not have real file for library -lz. ...
Plans:
* Build and test.
Absences:
* Nov 29, 30: Trainings.
Thanks!
-Zhenqiang
Hi,
Good news -- I just built a version of ICS with the current version of
linaro-gcc.
Panda build here:
http://people.linaro.org/~bernhardrosenkranzer/boot.tar.bz2
http://people.linaro.org/~bernhardrosenkranzer/system.tar.bz2
http://people.linaro.org/~bernhardrosenkranzer/userdata.tar.bz2
Use linaro-android-media-create as usual to install.
This is not yet a build that we can reproduce inside android-build
because I've had to cheat by swapping out linkers in a couple of
places (just using current binutils the way we normally do produces a
build that doesn't boot, using binutils built from the AOSP source
release works, but the prehistoric linker doesn't know about "dmb st",
can't link u-boot, can't link the kernel, and strangely enough can't
link some components of ICS - apparently the binaries they ship have
some extra patches in).
But the good news is that every part is built with our compiler -
there's nothing in the way of using that (aside from the code
insanities I've already fixed).
I'll work on sorting out binutils now...
ttyl
bero
Hi,
I've spent most of my time digging into OE. First I started with OE
(classic); then realized that OE-core is where the future happens and
switched to it. I've set up a build system and got an ARM minimal image
to build that boots in QEMU *yay*. In parallel I've been reading the
manual and looked into the recipes to find out what toolchain they are
using (gcc-4_6-branch plus patches). The next step is to get OE-core
built using the Linaro-GCC.
Regards
Ken
== This week ==
* Looked at the MIPS _unpack_d bug. libgcc.a did have a definition,
and Michael couldn't reproduce with his build, so the bug report
is now marked as Incomplete.
* Backported patch for PR 48190 to upstream 4.6 and 4.5.
* Reviewed Revital's SMS register-pressure patch.
* More on -fsched-pressure. I now have a version that I'm happy with
as far as ARM goes, in that it usually seems to produce code that is
no worse than the better of current -fsched-pressure and current
-fno-sched-pressure. (I'm sure there's a better way of saying that.)
In some cases it is better than both.
* Continued trying to catch up on mail.
== Next week ==
* Clean up the -fsched-pressure code (it's still in its "experimental mess"
state). Try it on Power.
* Resurrect vzip and vunzp patch after Richard E said he wouldn't object.
Richard
Hi!
* Ran eembc, coremark, denbench and spec2k on the ursas with the latest of
the Linaro and FSF series. The variants used were o3-neon and
o3-neon-novect.
I first got a c++ related build error when using 4.4.x compilers; the
error was caused by symbol versioning. Michael's explanation: "We want to use
the gcc-4.4.5 libstdc++ when building and running. However, when running
c++ itself, it links in /usr/lib/libppl_c.so, which was built with the host
4.5 compiler, which needs the 4.5 libstdc++!"
The workaround is to remove the LD_LIBRARY_PATH from build.mk (the
gcc-%/benchmarks.stamp target) and run the C only tests.
* Continued documentation of running benchmarks:
https://wiki.linaro.org/AsaSandahl/Sandbox/RunningBenchmark. Tips on more
efficient ways of doing things are always welcome.
* Collected the results for SPEC2K runs with "train" and "ref" data sets. I
did -O2 and -O3 runs on a panda with the two data sets. The results for -O2
and -O3 look almost the same though. I will double check the "*build.txt"
files from the benchmark runs, and if needed do a complementary run.
Best regards
Åsa
Dear All,
I am using arch/arm/configs/vexpress_defconfig to configure and build Linux
Kernel 3.1.1
http://launchpad.net/linux-linaro/3.1/3.1-2011.11/+download/linux-linaro-3.…
and then if I boot the zImage created on Linaro QEMU
http://launchpad.net/qemu-linaro/trunk/2011.10/+download/qemu-linaro-0.15.5…
, it works properly.
But if I enable the LPAE support in the config file, the kernel builds, but
when I boot the kernel image on QEMU, it just prints:
Uncompressing Linux... done, booting the kernel.
And then it hangs. Can anyone please tell me how to fix this issue?
Looking forward to your reply.
Thanks and Regards,
Jubi
I discovered some excessive memory usage in gas recently when
defining macros. It turns out that this is a weird implementation
feature rather than a bug.
This patch has a possible fix for the issue, but I'd be interested
in people's views before I go so far as cleaning it up and
discussing it upstream.
Cheers
---Dave
Dave Martin (1):
gas: Allow for a more sensible number of macro arguments
gas/as.c | 17 +++++++++++++++++
gas/doc/as.texinfo | 9 ++++++++-
gas/hash.c | 5 +++--
gas/hash.h | 1 +
gas/macro.c | 22 +++++++++++++++++++++-
gas/macro.h | 1 +
6 files changed, 51 insertions(+), 4 deletions(-)
--
1.7.4.1
[Jubi, I'm afraid this is the second copy of this you'll see, because
you accidentally sent your reply to linaro-toolchain-request rather
than to the actual mailing list, and so my first reply was misdirected.
This reply is to the correct list address...]
On 22 November 2011 13:28, Jubi Taneja <jubitaneja(a)gmail.com> wrote:
> Thanks for your reply. Please find the response inline ..
> On Tue, Nov 22, 2011 at 6:44 PM, Peter Maydell <peter.maydell(a)linaro.org>
> wrote:
>> On 22 November 2011 13:06, Jubi Taneja <jubitaneja(a)gmail.com> wrote:
>> > But if i enable the LPAE support in the config file, the kernel builds
>> > and
>> > when I boot the kernel image on QEMU, it just prints the output as :
>> >
>> > Uncompressing Linux... done, booting the kernel.
>>
>> Does your kernel boot OK on real hardware?
>>
>> (ie, is a kernel with LPAE support expected to boot on a CPU like the
>> A9 which doesn't have LPAE?)
>
> Yes, it is expected to boot ARM Cortex A15 CPU.
The A9 and the A15 are different CPUs. QEMU currently supports
only the A9. This is why I asked if this kernel boots OK on real
Versatile Express A9 hardware.
>> Also if your config/kernel command line don't turn on earlyprintk it's
>> worth enabling this as it usually gets you better diagnostic messages
>> for early kernel boot failures.
>
> Ok, I will try to check this. But, unfortunately now I again tried enabling
> LPAE in config file and the current status is that when I boot the kernel
> image on Qemu. it simply hangs. It now don't show that message of
> Uncompressing kernel.. I am trying to debug it using gdb, but could not find
> much. Please guide me how shall I proceed ahead.
If you've turned on kernel support for the Versatile Express A15
rather than the Versatile Express A9 then this is expected behaviour:
the VE-A15 has a different memory layout and in particular the serial
ports are in a different place. So if you try to boot the kernel on
a VE-A9 system (which is what QEMU is modelling) then it will display
nothing because the kernel is trying to write to UARTs which aren't
there.
What are you actually trying to achieve here?
-- PMM
Continued looking at constant reuse optimizations, as a background task.
I've fiddled with the costs a bit more to remove false positives.
Continued benchmarking different generic tuning ideas. With each test
run taking most of a day this is slow going.
Took Michael's rootfs that is used for all the toolchain testing and
benchmarking, unpacked it, and repacked it so that it is compatible with
"linaro-media-create", then tested that I could use it to run tests on
LAVA successfully. I was hoping to use this for extra benchmarking
bandwidth, but there's a permissions problem in the LAVA website
software that means it's not yet possible to post private results to the
system, so no proprietary benchmarks yet. I can still continue
pipe-cleaning my process, and maybe run some benchmarks without actually
reporting the results (or perhaps posting them somewhere write-only).
Began work on adding GCC support for 64-bit shifts with NEON. This is
not quite as simple as it ought to be because a) it's inefficient to
move a value to NEON registers just to do a shift, so it needs to detect
where the value is, and b) right shifts are encoded as left shift by a
negative amount, and negative shift amounts are normally considered
undefined behaviour.
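Point (b) can be illustrated with a small model of a 64-bit unsigned shift whose count is signed. This is a sketch of the behaviour as described in the report, not of GCC's patterns or of the full instruction semantics: `vshl_u64` and `lshr64` are hypothetical names, and the count is assumed to fit in a signed byte.

```python
MASK64 = (1 << 64) - 1


def vshl_u64(value, shift):
    """Shift a 64-bit unsigned value left by a signed count.

    A negative count shifts right instead, which is how a right shift
    is expressed when only a left-shift form is available.
    """
    if not -128 <= shift <= 127:
        raise ValueError("shift count assumed to fit in a signed byte")
    value &= MASK64
    if shift >= 0:
        return (value << shift) & MASK64
    return value >> -shift


def lshr64(value, amount):
    """Logical right shift built from the left-shift primitive by negating the count."""
    return vshl_u64(value, -amount)


print(hex(vshl_u64(1, 4)))
print(hex(lshr64(0x100, 4)))
```

The compiler-side difficulty is that the negation has to be introduced deliberately, because a generic shift with a negative amount would otherwise be treated as undefined.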