== GDB ==
* Russell King now wants to revert my kernel patch that
fixed #615974; discussed alternative options.
== GCC ==
* Patch review week.
* Analyzed root cause of ICE when building Linux kernel
with mainline GCC (reported by Arnd).
Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
== QEmu ==
* Sent 64bit atomic helper fix upstream
* Basic boot time and simple benchmarks vs. the Panda board
* Tested prebuilt images and Peter's latest post-merge QEmu tree
- The full Ubuntu desktop on an emulated Overo is a bit slow -
it's rather short on RAM
- The full Ubuntu desktop on an emulated VExpress isn't bad; it's
got the full 1G (mounting the vexpress images took a particularly
grim line of awk, based on Peter's suggestion of the use of 'file')
== String routines ==
* Pushed memcpy and memset up to cortex-strings bzr
* Working through memset issue with Michael
- Made my code a little less sensitive to initial alignment
== Hard float ==
* Testing libffi 3.0.11rc1 - still hasn't got variadic patch in, but
hoping it will land later in the cycle.
== Other ==
* Excavating inbox after week off.
* Built LMbench and kicked run off on Panda. (Got stuck in some
heuristics under emulation)
Dave
== This week ==
* Looked at the get_arm_condition_code ICE. Seems to be a popular bug:
was reported as #589887 #823708 and #809761 in Launchpad and as PR49030
in bugzilla. Sent a patch upstream.
* Submitted SMS register-dependency patch upstream.
* Reviewed Bernd's new shrink-wrap patch.
* Tried to clean up my microbenchmarks. Found that preloading the caches
at the start of the benchmark fixed the variations I was seeing on a
Beagleboard. (As Dave says, it seems that there's no allocation
on write.) Added code to check the results of each loop. Packaged
it up and pushed into bzr.
* An IBM colleague kindly tried my -fsched-pressure patch/hack
on s390. Although it performed the best of the three runs
(trunk, -fsched-pressure, patch+-fsched-pressure), there were
some disappointing outliers. -fsched-pressure still introduces a 7%
regression in one test (down from a 14% regression without the patch).
Another test benefited from -fsched-pressure without my patch but
regressed with it.
== Next week ==
* Look more at SMS.
* Look more at the sched-pressure thing (if I get time).
Richard
* SPEC2K week. Experimented with building and running and finally did a full
run on the Panda board.
* Running SPEC2K on the Snowball board as well. It is troublesome to work
with the board because of the ethernet problem that makes the board freeze
after some time. I have to use the SD-card for file transfer. It seems to be
a known issue on the Snowball V3. Tested a V5 board which was much more
stable. Have asked for a V5 board from the ST-E Linaro internal project manager.
(Do not know when I will get it.)
* Preliminary travel booking done for Linaro Connect in Orlando.
Best Regards
Åsa
I've tried to clean up the libav microbenchmarks that I did for the strided
load/store stuff. They're on Launchpad at:
lp:~rsandifo/+junk/loop-microbenchmarks
The main changes are that the benchmarks now preload the caches (for CPUs
that don't allocate on write) and that they now check the optimised loop
against an unoptimised one.
The usual big caveat applies: these loops were chosen because they were
affected by strided load/stores. They aren't necessarily interesting
for any other reason, and some were even explicitly marked as cold.
I'm going to add some of the video decode routines from Michael's
benchmark soon. These microbenchmarks aren't supposed to be
libav-specific though, so if you have other interesting ones,
please do add them.
Richard
Trip report: KVM Forum/LinuxCon NA 2011
KVM Forum is an annual conference; this year it was colocated with
LinuxCon NA in Vancouver. There were about 150 attendees; many of them
are simply users of KVM and so many of the talks are aimed at KVM
users. However it's also an opportunity for the KVM and QEMU developer
community to get together, with a number of informal BoF sessions and
an all-day hackathon later in the week.
The talk schedule is here, together with the slides for all talks:
http://www.linux-kvm.org/page/KVM_Forum_2011
Some brief highlights:
* Keynotes
ARM/Linaro got positive mentions in both keynotes; Avi Kivity said of
the ARM/A15 KVM work that he had "every reason to expect it to be very
successful". Anthony Liguori's keynote summarising the year in QEMU
development included some statistics about commits: Linaro came third
in the list of "companies with most commits", behind only Red Hat and
IBM; I came top of the "individual authors with most commits" list,
being apparently responsible for 7% of all QEMU patches this year :-)
* KVM on POWER/PPC
There were several talks about KVM on PPC architectures; interestingly
this is seeing use not just on the server end but also in the
embedded/realtime space (including a talk from Freescale where they
said they are working on KVM on embedded PPC because of customer
demand for KVM). It was also reassuring to see that another
architecture has preceded us in shaking out x86-isms from KVM.
* KVM Tool
This has got headlines recently as a potential replacement for the
userspace launcher/device model role which QEMU currently plays when
starting guest OSes under KVM. It's intended to be minimal and
lightweight and only to run Linux guests (with paravirtualised devices
for most purposes). The general reaction seemed to be that although
the implementation is currently minimal it will become larger and
bloatier as they add features off their wishlist. There's also some
ill-feeling about the effective namespace grab of calling the
userspace binary "kvm". From an ARM-centric point of view we can just
wait and see whether it gets much traction. Possibly it may turn into
a testbed for technology which is easier to develop on than the
'mature' QEMU which has to deal with backwards compatibility and
supporting users.
* QEMU Object Model and Device Model issues and redesign
For me one of the most important strands of conversation at the
conference was replacing QEMU's device model abstraction with
something better. QEMU's current device model abstraction is "qdev";
this is the (vaguely object-oriented) framework which lets you create
devices, configure them and connect them together. It models the world
as a tree: a root device exposes a bus, to which child devices can
connect; those child devices may expose further buses, and so on.
This works quite well in the PC world where mostly you're interested
in plugging in USB devices, PCI cards, etc; it is rather less well
matched to the embedded board models where things are much less
hierarchical. qdev's major flaws include:
+ insists on bus hierarchy, but not everything is a bus, and in any
case there are often several trees (memory transactions, clock,
interrupts) which don't necessarily coincide
+ no support for composition ("device foo is actually devices bar
and baz glued into one box")
+ just barely supports having devices expose signal (gpio/irq) lines
and memory regions (typically registers), but doesn't let you give
them useful names, so you have to access them by index number
We spent just about all of Thursday's hacking session going through
this. I felt we got good agreement on the problems, and perhaps
80-90% agreement on Anthony Liguori's proposed new QEMU Object Model
as a solution to them; some loose ends still need to be worked
through.
* LinuxCon NA
KVM Forum was colocated with LinuxCon NA this year. My opinion (which
seemed to be shared with the other Linaro attendees I talked to about it)
was that LinuxCon NA suffered from being not very technical and not
very focused -- it wasn't clear to me who they thought their target
audience was. A few points of interest:
+ Linus Torvalds' keynote was reported in some places as more
complaints about ARM hardware but I actually thought it was pretty
positive about the progress we're making in sorting out the issues
+ Matthew Garrett's talk about x86 platform drivers (those things that
deal with LEDs, funny keys, batteries and other odd laptop hardware)
revealed that actually PC hardware manufacturers do just as much
random non-standard undocumented silliness, it's just that accident
of history has limited them to only doing so in the minor bits at
the edges...
-- PMM
Hello Ulrich (or anyone else acquainted with gdb),
Could the gdb test suite be run on a kernel with the below patch applied
please? A confirmation that this patch doesn't regress gdb is required
before this can move ahead. Quick feedback would be greatly
appreciated.
Thanks.
---------- Forwarded message ----------
Date: Thu, 25 Aug 2011 15:55:58 +0100
From: Russell King - ARM Linux <linux(a)arm.linux.org.uk>
To: Tejun Heo <tj(a)kernel.org>, Arnd Bergmann <arnd(a)arndb.de>,
Mark Brown <broonie(a)opensource.wolfsonmicro.com>
Cc: Rafael J. Wysocki <rjw(a)sisk.pl>, linux-kernel(a)vger.kernel.org,
linux-arm-kernel(a)lists.infradead.org
Subject: Re: try_to_freeze() called with IRQs disabled on ARM
On Thu, Aug 25, 2011 at 03:09:07PM +0200, Tejun Heo wrote:
> Hey, Russell.
>
> If you can fix it properly without going through temporary step,
> that's awesome. Let's put the arguments behind, okay?
Here's the patch. As the kernel I've run this against doesn't have the
change to try_to_freeze(), I added a might_sleep() in do_signal() during
my testing to verify that it fixes Mark's problem (which it does.)
I've tested functions returning -ERESTARTSYS, -ERESTARTNOHAND and
-ERESTART_RESTARTBLOCK, all of which seem to behave as expected with
signals such as SIGCONT (without handler) and SIGALRM (with handler).
I haven't tested -ERESTARTNOINTR.
I don't have a test case for the race condition I mentioned (which is
admittedly pretty difficult to construct, requiring an explicit
signal, schedule, signal sequence) but this should plug that too.
How do we achieve this? Effectively the steps in this patch are:
1. Undo Arnd's fixups to the syscall restart processing (but don't worry,
we restore it in step 3).
2. Introduce TIF_SYS_RESTART, which is set when we enter signal handling
and the syscall has returned one of the restart codes. This is used
as a flag to indicate that we have some syscall restart processing to
do at some point.
3. Clear TIF_SYS_RESTART whenever ptrace is used to set the GP registers
(thereby restoring Arnd's fixup for his gdb testsuite problem - it
would be good if Arnd could reconfirm that.)
4. When we setup a user handler to run, check TIF_SYS_RESTART and clear it.
If it was set, we need to set things up to return -EINTR or restart the
syscall as appropriate. As we've cleared it, no further restart
processing will occur.
5. Once we've run all work (signal delivery, and rescheduling events), and
we're about to return to userspace, make a final check for TIF_SYS_RESTART.
If it's still set, then we're returning to userspace having not setup
any user handlers, and we need to restart the syscall. This is mostly
trivial, except for OABI restartblock which requires the user stack to
be written. We have to re-enable IRQs for this write, which means we
have to manually re-check for rescheduling events, abort the restart,
and try again later.
One of the side effects of reverting Arnd's patch is that we restore the
strace behaviour which we've had for years on ARM, and can still be seen
on x86: strace can see the -ERESTART return codes from the kernel syscalls,
rather than what seems to be the signal number:
Before:
rt_sigsuspend([] <unfinished ...>
--- SIGIO (I/O possible) ---
<... rt_sigsuspend resumed> ) = 29
sigreturn() = ? (mask now [])
vs:
rt_sigsuspend([]) = ? ERESTARTNOHAND (To be restarted)
--- SIGIO (I/O possible) @ 0 (0) ---
sigreturn() = ? (mask now [])
x86:
rt_sigsuspend([]) = ? ERESTARTNOHAND (To be restarted)
--- {si_signo=SIGIO, si_code=SI_USER} (I/O possible) ---
sigreturn() = ? (mask now [])
So, this patch should fix:
1. The race which I identified in the signal handling code (I think x86
and other architectures can suffer from it too.)
2. The warning from try_to_freeze.
3. The unanticipated change to strace output.
Arnd, can you test this to make sure your gdb test case still works, and
Mark, can you test this to make sure it fixes your problem please?
Thanks.
arch/arm/include/asm/thread_info.h | 3 +
arch/arm/kernel/entry-common.S | 11 ++
arch/arm/kernel/ptrace.c | 2 +
arch/arm/kernel/signal.c | 209 ++++++++++++++++++++++++------------
4 files changed, 155 insertions(+), 70 deletions(-)
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 7b5cc8d..40df533 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -129,6 +129,7 @@ extern void vfp_flush_hwstate(struct thread_info *);
/*
* thread information flags:
* TIF_SYSCALL_TRACE - syscall trace active
+ * TIF_SYS_RESTART - syscall restart processing
* TIF_SIGPENDING - signal pending
* TIF_NEED_RESCHED - rescheduling necessary
* TIF_NOTIFY_RESUME - callback before returning to user
@@ -139,6 +140,7 @@ extern void vfp_flush_hwstate(struct thread_info *);
#define TIF_NEED_RESCHED 1
#define TIF_NOTIFY_RESUME 2 /* callback before returning to user */
#define TIF_SYSCALL_TRACE 8
+#define TIF_SYS_RESTART 9
#define TIF_POLLING_NRFLAG 16
#define TIF_USING_IWMMXT 17
#define TIF_MEMDIE 18 /* is terminating due to OOM killer */
@@ -147,6 +149,7 @@ extern void vfp_flush_hwstate(struct thread_info *);
#define TIF_SECCOMP 21
#define _TIF_SIGPENDING (1 << TIF_SIGPENDING)
+#define _TIF_SYS_RESTART (1 << TIF_SYS_RESTART)
#define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
#define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
#define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index b2a27b6..e922b85 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -45,6 +45,7 @@ ret_fast_syscall:
fast_work_pending:
str r0, [sp, #S_R0+S_OFF]! @ returned r0
work_pending:
+ enable_irq
tst r1, #_TIF_NEED_RESCHED
bne work_resched
tst r1, #_TIF_SIGPENDING|_TIF_NOTIFY_RESUME
@@ -56,6 +57,13 @@ work_pending:
bl do_notify_resume
b ret_slow_syscall @ Check work again
+work_syscall_restart:
+ mov r0, sp @ 'regs'
+ bl syscall_restart @ process system call restart
+ teq r0, #0 @ if ret=0 -> success, so
+ beq ret_restart @ return to userspace directly
+ b ret_slow_syscall @ otherwise, we have a segfault
+
work_resched:
bl schedule
/*
@@ -69,6 +77,9 @@ ENTRY(ret_to_user_from_irq)
tst r1, #_TIF_WORK_MASK
bne work_pending
no_work_pending:
+ tst r1, #_TIF_SYS_RESTART
+ bne work_syscall_restart
+ret_restart:
#if defined(CONFIG_IRQSOFF_TRACER)
asm_trace_hardirqs_on
#endif
diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c
index 2491f3b..ac8c34e 100644
--- a/arch/arm/kernel/ptrace.c
+++ b/arch/arm/kernel/ptrace.c
@@ -177,6 +177,7 @@ put_user_reg(struct task_struct *task, int offset, long data)
if (valid_user_regs(&newregs)) {
regs->uregs[offset] = data;
+ clear_ti_thread_flag(task_thread_info(task), TIF_SYS_RESTART);
ret = 0;
}
@@ -604,6 +605,7 @@ static int gpr_set(struct task_struct *target,
return -EINVAL;
*task_pt_regs(target) = newregs;
+ clear_ti_thread_flag(task_thread_info(target), TIF_SYS_RESTART);
return 0;
}
diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index 0340224..42a1521 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -649,6 +649,135 @@ handle_signal(unsigned long sig, struct k_sigaction *ka,
}
/*
+ * Syscall restarting codes
+ *
+ * -ERESTARTSYS: restart system call if no handler, or if there is a
+ * handler but it's marked SA_RESTART. Otherwise return -EINTR.
+ * -ERESTARTNOINTR: always restart system call
+ * -ERESTARTNOHAND: restart system call only if no handler, otherwise
+ * return -EINTR if invoking a user signal handler.
+ * -ERESTART_RESTARTBLOCK: call restart syscall if no handler, otherwise
+ * return -EINTR if invoking a user signal handler.
+ */
+static void setup_syscall_restart(struct pt_regs *regs)
+{
+ regs->ARM_r0 = regs->ARM_ORIG_r0;
+ regs->ARM_pc -= thumb_mode(regs) ? 2 : 4;
+}
+
+/*
+ * Depending on the signal settings we may need to revert the decision
+ * to restart the system call. But skip this if a debugger has chosen
+ * to restart at a different PC.
+ */
+static void syscall_restart_handler(struct pt_regs *regs, struct k_sigaction *ka)
+{
+ if (test_and_clear_thread_flag(TIF_SYS_RESTART)) {
+ long r0 = regs->ARM_r0;
+
+ /*
+ * By default, return -EINTR to the user process for any
+ * syscall which would otherwise be restarted.
+ */
+ regs->ARM_r0 = -EINTR;
+
+ if (r0 == -ERESTARTNOINTR ||
+ (r0 == -ERESTARTSYS && !(ka->sa.sa_flags & SA_RESTART)))
+ setup_syscall_restart(regs);
+ }
+}
+
+/*
+ * Handle syscall restarting when there is no user handler in place for
+ * a delivered signal. Rather than doing this as part of the normal
+ * signal processing, we do this on the final return to userspace, after
+ * we've finished handling signals and checking for schedule events.
+ *
+ * This avoids bad behaviour such as:
+ * - syscall returns -ERESTARTNOHAND
+ * - signal with no handler (so we set things up to restart the syscall)
+ * - schedule
+ * - signal with handler (eg, SIGALRM)
+ * - we call the handler and then restart the syscall
+ *
+ * In order to avoid races with TIF_NEED_RESCHED, IRQs must be disabled
+ * when this function is called and remain disabled until we exit to
+ * userspace.
+ */
+asmlinkage int syscall_restart(struct pt_regs *regs)
+{
+ struct thread_info *thread = current_thread_info();
+
+ clear_ti_thread_flag(thread, TIF_SYS_RESTART);
+
+ /*
+ * Restart the system call. We haven't setup a signal handler
+ * to invoke, and the regset hasn't been usurped by ptrace.
+ */
+ if (regs->ARM_r0 == -ERESTART_RESTARTBLOCK) {
+ if (thumb_mode(regs)) {
+ regs->ARM_r7 = __NR_restart_syscall - __NR_SYSCALL_BASE;
+ regs->ARM_pc -= 2;
+ } else {
+#if defined(CONFIG_AEABI) && !defined(CONFIG_OABI_COMPAT)
+ regs->ARM_r7 = __NR_restart_syscall;
+ regs->ARM_pc -= 4;
+#else
+ u32 sp = regs->ARM_sp - 4;
+ u32 __user *usp = (u32 __user *)sp;
+ int ret;
+
+ /*
+ * For OABI, we need to play some extra games, because
+ * we need to write to the users stack, which we can't
+ * do reliably from IRQs-disabled context. Temporarily
+ * re-enable IRQs, perform the store, and then plug
+ * the resulting race afterwards.
+ */
+ local_irq_enable();
+ ret = put_user(regs->ARM_pc, usp);
+ local_irq_disable();
+
+ /*
+ * Plug the reschedule race - if we need to reschedule,
+ * abort the syscall restarting. We haven't modified
+ * anything other than the attempted write to the stack
+ * so we can merely retry later.
+ */
+ if (need_resched()) {
+ set_ti_thread_flag(thread, TIF_SYS_RESTART);
+ return -EINTR;
+ }
+
+ /*
+ * We failed (for some reason) to write to the stack.
+ * Terminate the task.
+ */
+ if (ret) {
+ force_sigsegv(0, current);
+ return -EFAULT;
+ }
+
+ /*
+ * Success, update the stack pointer and point the
+ * PC at the restarting code.
+ */
+ regs->ARM_sp = sp;
+ regs->ARM_pc = KERN_RESTART_CODE;
+#endif
+ }
+ } else {
+ /*
+ * Simple restart - just back up and re-execute the last
+ * instruction.
+ */
+ setup_syscall_restart(regs);
+ }
+
+ return 0;
+}
+
+/*
* Note that 'init' is a special process: it doesn't get signals it doesn't
* want to handle. Thus you cannot kill init even with a SIGKILL even by
* mistake.
@@ -659,7 +788,6 @@ handle_signal(unsigned long sig, struct k_sigaction *ka,
*/
static void do_signal(struct pt_regs *regs, int syscall)
{
- unsigned int retval = 0, continue_addr = 0, restart_addr = 0;
struct k_sigaction ka;
siginfo_t info;
int signr;
@@ -674,32 +802,16 @@ static void do_signal(struct pt_regs *regs, int syscall)
return;
/*
- * If we were from a system call, check for system call restarting...
+ * Set the SYS_RESTART flag to indicate that we have some
+ * cleanup of the restart state to perform when returning to
+ * userspace.
*/
- if (syscall) {
- continue_addr = regs->ARM_pc;
- restart_addr = continue_addr - (thumb_mode(regs) ? 2 : 4);
- retval = regs->ARM_r0;
-
- /*
- * Prepare for system call restart. We do this here so that a
- * debugger will see the already changed PSW.
- */
- switch (retval) {
- case -ERESTARTNOHAND:
- case -ERESTARTSYS:
- case -ERESTARTNOINTR:
- regs->ARM_r0 = regs->ARM_ORIG_r0;
- regs->ARM_pc = restart_addr;
- break;
- case -ERESTART_RESTARTBLOCK:
- regs->ARM_r0 = -EINTR;
- break;
- }
- }
-
- if (try_to_freeze())
- goto no_signal;
+ if (syscall &&
+ (regs->ARM_r0 == -ERESTARTSYS ||
+ regs->ARM_r0 == -ERESTARTNOINTR ||
+ regs->ARM_r0 == -ERESTARTNOHAND ||
+ regs->ARM_r0 == -ERESTART_RESTARTBLOCK))
+ set_thread_flag(TIF_SYS_RESTART);
/*
* Get the signal to deliver. When running under ptrace, at this
@@ -709,19 +821,7 @@ static void do_signal(struct pt_regs *regs, int syscall)
if (signr > 0) {
sigset_t *oldset;
- /*
- * Depending on the signal settings we may need to revert the
- * decision to restart the system call. But skip this if a
- * debugger has chosen to restart at a different PC.
- */
- if (regs->ARM_pc == restart_addr) {
- if (retval == -ERESTARTNOHAND
- || (retval == -ERESTARTSYS
- && !(ka.sa.sa_flags & SA_RESTART))) {
- regs->ARM_r0 = -EINTR;
- regs->ARM_pc = continue_addr;
- }
- }
+ syscall_restart_handler(regs, &ka);
if (test_thread_flag(TIF_RESTORE_SIGMASK))
oldset = &current->saved_sigmask;
@@ -740,38 +840,7 @@ static void do_signal(struct pt_regs *regs, int syscall)
return;
}
- no_signal:
if (syscall) {
- /*
- * Handle restarting a different system call. As above,
- * if a debugger has chosen to restart at a different PC,
- * ignore the restart.
- */
- if (retval == -ERESTART_RESTARTBLOCK
- && regs->ARM_pc == continue_addr) {
- if (thumb_mode(regs)) {
- regs->ARM_r7 = __NR_restart_syscall - __NR_SYSCALL_BASE;
- regs->ARM_pc -= 2;
- } else {
-#if defined(CONFIG_AEABI) && !defined(CONFIG_OABI_COMPAT)
- regs->ARM_r7 = __NR_restart_syscall;
- regs->ARM_pc -= 4;
-#else
- u32 __user *usp;
-
- regs->ARM_sp -= 4;
- usp = (u32 __user *)regs->ARM_sp;
-
- if (put_user(regs->ARM_pc, usp) == 0) {
- regs->ARM_pc = KERN_RESTART_CODE;
- } else {
- regs->ARM_sp += 4;
- force_sigsegv(0, current);
- }
-#endif
- }
- }
-
/* If there's no signal to deliver, we just put the saved sigmask
* back.
*/
Hi,
the current gcc-4.6 packages build for both softfp and hard, so that the armel
and (not yet existing) armhf packages can be installed together in the system.
To enable multilib, I currently use the rather complicated arm-multilib.diff,
which works, but doesn't seem to be correct. With the much simpler arm-ml2.diff,
the directory for the default multilib is not resolved to . (as done for e.g.
amd64).
[amd64] $ gcc -print-multi-directory
.
[armel] $ gcc -print-multi-directory
sf
If I understand the code correctly, this comes from the hard setting of
MULTILIB_DEFAULTS in the arm target. If you look at mips, you see
#ifndef MULTILIB_DEFAULTS
#define MULTILIB_DEFAULTS \
{ MULTILIB_ENDIAN_DEFAULT, MULTILIB_ISA_DEFAULT, MULTILIB_ABI_DEFAULT }
#endif
which records the proper selected defaults.
Should something similar be done for arm?
Matthias
Hi; I've just completed a tricky rebase of qemu-linaro on upstream;
there were several invasive upstream changes which have landed
recently and which meant that I had to tweak a lot of the qemu-linaro
patches as I did the rebase. I've tested the results but it's possible
that some breakage may have slipped through...
So if you're a regular user of qemu-linaro's system mode and feel
like checking the sources out of git:
git://git.linaro.org/qemu/qemu-linaro.git
building them and testing that the things you regularly do with it
haven't regressed, then I'd appreciate it, and you can help us avoid
any nasty surprises in the next (2011.09) release.
Thanks!
-- Peter Maydell
See:
http://builds.linaro.org/toolchain/gcc-4.7~svn178154
The problem is -Werror triggering on:
../../../gcc-4.7~/gcc/config/arm/arm.c: In function 'int optimal_immediate_sequence_1(rtx_code, long long unsigned int, four_ints*, int)':
../../../gcc-4.7~/gcc/config/arm/arm.c:2690:46: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
../../../gcc-4.7~/gcc/config/arm/arm.c:2690:60: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
../../../gcc-4.7~/gcc/config/arm/arm.c:2691:20: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
../../../gcc-4.7~/gcc/config/arm/arm.c:2691:34: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
../../../gcc-4.7~/gcc/config/arm/arm.c:2701:16: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
../../../gcc-4.7~/gcc/config/arm/arm.c:2702:18: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
../../../gcc-4.7~/gcc/config/arm/arm.c:2703:18: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
-- Michael
Booked hotel and travel for Linaro Connect in Orlando.
Fixed a couple of bugs in my thumb2 constants patch and retested. The
test results came back clean, so I've committed it upstream.
Bernd claimed he has found some test failures that might be caused by my
patch, but I couldn't reproduce them at first. I've now got the failure,
but I've not yet investigated the cause. Next week ...
Committed my widening multiplies patches to Linaro GCC, after first
convincing Richard Sandiford that it wasn't totally bonkers.
Started work on ARM GCC tuning options:
* Submitted a patch for -m{arch,cpu,tune}=native
http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg14225.html
* Submitted a patch for -m{arch,cpu,tune}=generic-armv7-a
http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg14231.html
Joseph found an issue with those patches, but that was easily resolved
and I've reposted both.
RAG:
Red:
Amber: OMAP3 patch upstreaming is (still) slower progress than hoped
Green:
Current Milestones:
|| || Planned || Estimate || Actual ||
||qemu-linaro 2011-09 || 2011-09-15 || 2011-09-15 || ||
Historical Milestones:
||qemu-linaro 2011-04 || 2011-04-21 || 2011-04-21 || 2011-04-21 ||
||qemu-linaro 2011-05 || 2011-05-19 || 2011-05-19 || n/a ||
||close out 1105 blueprints || 2011-05-28 || 2011-05-28 || 2011-05-19 ||
||complete 1111 planning || 2011-05-28 || 2011-05-28 || 2011-05-27 ||
||qemu-linaro-2011-06 || 2011-06-16 || 2011-06-16 || 2011-06-16 ||
||qemu-linaro-2011-07 || 2011-07-21 || 2011-07-21 || 2011-07-21 ||
||qemu-linaro 2011-08 || 2011-08-18 || 2011-08-18 || 2011-08-18 ||
== upstream-omap3-patches ==
* finished the fairly nasty rebase of qemu-linaro onto upstream master
(several invasive changes went into master that meant a number of
our local patches needed updating)
* omap_gpmc changes cleaned up, updated to use MemoryRegions, and
submitted to upstream (17 patch patchset)
== gsoc-support ==
* final meeting/evaluation writeup now the GSoC project has ended
* we now have a first pass at what some upstream-acceptable versions
of the Android goldfish platform devices might look like, and a
much better idea of the degree of difference between the android
and upstream qemu trees, and where the pitfalls/issues lie
== other ==
* fixed some breakage upstream in n810 and integratorcp models caused
by landing of MemoryRegion changes
* trying to write up what my preferred model of device connections
would look like in concrete C implementation terms
* interesting discussion on boot-architecture list about how boot
loaders should start hypervisor-aware software (xen, kvm kernel):
http://www.mail-archive.com/boot-architecture@lists.linaro.org/msg00053.html
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
Oct 30-Nov 04: Linaro Connect Q4.11
== This week ==
* Wrote some patches to make SMS schedule register moves. They made a
significant difference to some libav loops. I'm running a regression
test on powerpc-ibm-aix5.3.0 and will submit upstream next week if
all goes OK.
* Looked at why mjpegenc was so much worse with SMS. Turned out to be
a register spilling problem. Found that -fira-algorithm=priority
avoids the regression and makes several other tests better too.
(I just tested that to see whether there was a feasible register
allocation for these cases; -fira-algorithm=priority isn't the
way to go.)
* Saw that the register allocator seemed to be tripping over the
XImode "structure" values, and that we still had one vector move
per structure element by the time we get to the scheduling passes.
Eliminated those with a combination of one fix and one hack.
It seemed to avoid the allocation problems.
* Patch review (Linaro and upstream).
* Backported libgcc visibility fix to 4.6 and 4.5.
== Next week ==
* Submit register-scheduling patch.
* Submit memory cost patch (from auto-inc-dec changes)
* Possibly submit the auto-inc-dec changes themselves, depending on
how the rtx cost discussion goes.
Richard
==GCC==
===Progress===
* Looked at the vectorize_with_neon_quad failure again and decided
that I would have to handle another case, but I was not convinced that
the extra stall we'd get in this case was worth it. In any case it would
only have been a workaround; Richard Sandiford fixed this properly by
getting df to do the right thing.
* Backported tbh patch.
* Backported conditional execution improvements patch from Jiangning
to Linaro 4.6 branch.
* Committed the LTO + Neon / Android intrinsics patch.
* Panda seems more reliable this week but I suspect that's down to the
room being cooled more.
* Broke up a few blueprints and marked some as done.
* BRANCH_COST results do not show a huge variation in SPEC, and some
of the results are inconsistent. Need to run a few benchmarks
again. Sigh :(
* Finished the A9 scheduler patch for smull and friends and committed
upstream and into Linaro 4.6.
* Briefly reviewed the shrink-wrapping patch and the widening
multiplies patch.
* Looked at the failures in the "popular embedded benchmark" for
some time with Åsa.
* Tried one of the ICE patches and that seemed to work just fine with
bootstrap on FSF trunk. Need to figure out why this was breaking in
the Linaro 4.6 tree. https://bugs.launchpad.net/gcc-linaro/+bug/689887
=== Plans ===
Next Week - Holiday :) Feet not up but walking in what looks like
typical bank holiday weather ... Might check email later in the week.
Meetings:
* 1-1s
* TCWG calls
* Thumb2 performance call.
Absences.
* 29th Aug - Sept. 2 - Holiday booked and approved.
* 31st Oct - 4th Nov - Linaro Summit Orlando - Travel booked - hotel
to be booked.
* Investigated the errors in the automotive test and concluded that they are
CRC-errors, but not depending on the test case result (non intrusive crc
check). We decided these errors need to be cleared out once and for all.
Michael and Ramana helping out with continued investigation.
* EEMBC run on both Panda and Snowball with gcc4.5.2. Results look
reasonable, but Michael will also have a look. I will spend a little more
time comparing the results from the two boards.
* Started to run SPEC2K on the Panda board.
Best Regards
Åsa
Following on from yesterday's call about what it would take to enable
SMS by default: one of the problems I was seeing with the SMS+IV patch
was that we ended up with excessive moves. E.g. a loop such as:
void
foo (int *__restrict a, int n)
{
  int i;
  for (i = 0; i < n; i += 2)
    a[i] = a[i] * a[i + 1];
}
would end up being scheduled with an ii of 3, which means that in the
ideal case, each loop iteration would take 3 cycles. However, we then
added ~8 register moves to the loop in order to satisfy dependencies.
Obviously those 8 moves add considerably to the iteration time.
I played around with a heuristic to see whether there were enough
free slots in the original schedule to accommodate the moves.
That avoided the problem, but it was a hack: the moves weren't
actually scheduled in those slots. (In current trunk, the moves
generated for an instruction are inserted immediately before that
instruction.)
I mentioned this to Revital, who told me that Mustafa Hagog had
tried a more complete approach that really did schedule the moves.
That patch was quite old, so I ended up reimplementing the same kind
of idea in a slightly different way. (The main functional changes
from Mustafa's version were to schedule from the end of the window
rather than the start, and to use a cyclic window. E.g. moves for
an instruction in row 0 column 0 should be scheduled starting at
row ii-1 downwards.)
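
To make the cyclic-window idea concrete, here is a minimal Python sketch.
It is not the GCC code: candidate_rows is a hypothetical helper, and the
assumption that every row except the instruction's own row is a candidate
is mine. It only shows the order in which rows would be tried when
scheduling from the end of the window, wrapping modulo ii.

```python
def candidate_rows(insn_row, ii):
    """Rows to try for a register move, closest to the instruction first.

    Illustrative assumption: the window covers every row of the kernel
    except the instruction's own row, and wraps around modulo ii.
    """
    if ii <= 0:
        raise ValueError("ii must be positive")
    if not 0 <= insn_row < ii:
        raise ValueError("insn_row must be in [0, ii)")
    # Step backwards one row at a time, wrapping from row 0 to row ii-1.
    return [(insn_row - step) % ii for step in range(1, ii)]


# An instruction in row 0 of an ii=3 kernel: try row 2 first, then row 1.
print(candidate_rows(0, 3))
```

Trying the latest row first keeps each move as close as possible to the
instruction that consumes it.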
The effect on my flawed libav microbenchmarks was much greater
than I imagined. I used the options:
-mcpu=cortex-a8 -mfpu=neon -mfloat-abi=softfp -mvectorize-with-neon-quad
-fmodulo-sched -fmodulo-sched-allow-regmoves -fno-auto-inc-dec
The "before" code was from trunk, the "after" code was trunk + the
register scheduling patch alone (not the IV patch). Only the tests
that have different "before" and "after" code are run. The results were:
a3dec
before: 500000 runs take 4.68384s
after: 500000 runs take 4.61395s
speedup: x1.02
aes
before: 500000 runs take 20.0523s
after: 500000 runs take 16.9722s
speedup: x1.18
avs
before: 1000000 runs take 15.4698s
after: 1000000 runs take 2.23676s
speedup: x6.92
dxa
before: 2000000 runs take 18.5848s
after: 2000000 runs take 4.40607s
speedup: x4.22
mjpegenc
before: 500000 runs take 28.6987s
after: 500000 runs take 7.31342s
speedup: x3.92
resample
before: 1000000 runs take 10.418s
after: 1000000 runs take 1.91016s
speedup: x5.45
rgb2rgb-rgb24tobgr16
before: 1000000 runs take 1.60513s
after: 1000000 runs take 1.15643s
speedup: x1.39
rgb2rgb-yv12touyvy
before: 1500000 runs take 3.50122s
after: 1500000 runs take 3.49887s
speedup: x1
twinvq
before: 500000 runs take 0.452423s
after: 500000 runs take 0.452454s
speedup: x1
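
For reference, the speedup figures are simply before/after time ratios
shown to three significant digits. A small Python sketch that reproduces
a few of them from the timings above (the dictionary layout and the
function name are only illustrative):

```python
# (before, after) wall-clock times in seconds, copied from the results above.
TIMINGS = {
    "aes": (20.0523, 16.9722),
    "avs": (15.4698, 2.23676),
    "resample": (10.418, 1.91016),
    "twinvq": (0.452423, 0.452454),
}


def speedup(before, after):
    """Format the before/after ratio to three significant digits."""
    return f"x{before / after:.3g}"


for name, (before, after) in TIMINGS.items():
    print(name, speedup(before, after))
```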
Taking resample as an example: before the patch we had an ii of 27,
stage count of 6, and 12 vector moves. Vector moves can't be dual
issued, and there was only one free slot, so even in theory, this loop
takes 27 + 12 - 1 = 38 cycles. Unfortunately, there were so many new
registers that we spilled quite a few.
After the patch we have an ii of 28, a stage count of 3, and no moves,
so in theory, one iteration should take 28 cycles. We also don't spill.
So I think the difference really is genuine. (The large difference
in moves between ii=27 and ii=28 is because in the ii=27 schedule,
a lot of A--(T,N,0)-->B (intra-cycle true) dependencies were scheduled
with time(B) == time(A) + ii + 1.)
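
The back-of-the-envelope cycle estimate for resample can be written as a
short Python sketch. It assumes, as in the text, that a vector move
cannot be dual issued, so every move that does not fit into a free slot
costs one extra cycle; estimated_cycles is just an illustrative name.

```python
def estimated_cycles(ii, moves, free_slots):
    """Rough cycles per iteration: ii plus the moves with no free slot."""
    extra_moves = max(0, moves - free_slots)
    return ii + extra_moves


before = estimated_cycles(ii=27, moves=12, free_slots=1)
after = estimated_cycles(ii=28, moves=0, free_slots=0)
print(before, after)
```

On this estimate alone the gain would be 38/28, roughly x1.36. The
measured x5.45 is much larger, which fits the observation that the
"before" code also spilled.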
I also saw benefits in one test in a "real" benchmark, which I can't
post here.
Richard
Hello,
Following today's performance call
(https://wiki.linaro.org/WorkingGroups/ToolChain/Meetings/2011-08-23)
here are some points raised regarding the steps towards enabling SMS by default:
* Benchmarks testing:
-- Running benchmarks such as EEMBC and SPEC2006 with SMS enabled is
crucial to expose loops where SMS degrades performance. Those
loops need to be analysed to construct a cost model.
-- SMS increases code size by introducing a prologue and epilogue to the
loop kernel. This should also be measured.
-- Measure the increase in compile time: on a native or cross build?
Currently SMS fails to bootstrap trunk on an ARM machine. This should
also be taken into account when considering enabling it by default.
Should it be turned on with -O2 or -O3?
SMS flags to use for testing:
-O3 -fmodulo-sched-allow-regmoves -fmodulo-sched
-funsafe-loop-optimizations -fno-auto-inc-dec
Thanks,
Revital
Hi
Some time ago we agreed that not everyone here uses the Ubuntu distribution
and decided to provide a so-called 'generic linux' cross toolchain.
Recently I managed to get it done and now need brave testers to tell me
whether it works or not.
Get it here: http://people.linaro.org/~hrw/generic-linux/ (64bit only)
The files needed are toolchain-11.07.tar.xz and the init.sh script. Unpack
the tarball from / so that /opt/linaro/11.07/ is populated, and put init.sh
anywhere you want (it will be integrated into the tarball later).
How to use:
$ source init.sh
This adds the cross toolchain to PATH and also sets LD_LIBRARY_PATH to
two directories:
- one with the binutils libraries
- a second with all the extra libraries which may be needed
Feel free to experiment with the second directory by removing files from
it and checking whether the system-provided libraries work too.
So far I have checked this toolchain under a few distributions:
- Ubuntu 10.04 'lucid' LTS
- Ubuntu 11.04 'natty'
- Fedora 14
- OpenSUSE 11.4
- CentOS 5.6
It failed only under CentOS (which was expected due to its age).
How did I check? So far, compilation of 'gpm' and 'zlib' has been tested.
==GCC==
===Progress===
* Continued to look at the test failure with -mvectorize-with-neon-quad.
Should be able to commit the backend workaround on Monday.
* Having some problems getting my Panda board working reliably. I'm
not sure if it's the temperature or something else, but when it gets hot
in the office, as it did on Tuesday, keeping it working reliably is hard.
The board locks up and then crashes quite often.
* Looked at VFP moves again for some more time.
* Committed tbh range change.
* Committed fixes for PR50022
=== Plans ===
* Finish off VFP moves patch.
* Look at BRANCH_COST results.
* Breakdown the T2 performance blueprints into smaller blueprints.
* Backport tbh range changes to Linaro 4.6
* Test the intrinsics patch once with some more intrinsics tests and
then merge it in to Linaro gcc 4.6
Meetings:
* 1-1s
* TCWG calls
Absences.
* 29th Aug - Sept. 2 - Holiday booked and approved.
* 31st Oct - 4th Nov - Linaro Summit Orlando - Travel booked - hotel
to be booked.
Hi all,
I'm having real trouble here :(
I just can't seem to get bzr to work! I've tried to branch
gcc-linaro/4.6 again and again, and it just won't. My other machine
refuses to do the merge from lp:gcc/4.6, presumably because the bzr on
there is too old.
I'm stuck. Can anybody else do the merge from upstream?
I'm going to keep trying.
Andrew
* GCC
Completed merging GCC 4.5 from FSF to Linaro.
Spun release tarballs for Linaro GCC 4.5 and 4.6. Uploaded them to
Michael's server, and kicked off the test builds remotely.
Submitted expenses for Linaro Connect.
Finally (!) committed my widening multiplies patches to FSF. :)
Continued trying to figure out what's wrong with my thumb2 constants
patches. I think I have identified a possible flaw, but I'm having
trouble reproducing the problem as I have been unable to pin down a
specific constant/expression combination that makes it through all the
other optimizations intact, and triggers the problem. I've not run out
of ideas yet though ...
* Other
On leave all day Wednesday.
Prepared for the big CodeSourcery to Mentor switch-over by moving all my
work-in-progress data over to the new servers.
== String routines ==
* Working through updating my eglibc patch for memchr. I think I'm
nearly there; it took way too long to persuade an eglibc make check
to work cross (can't get the native build happy).
== QEMU ==
* Sent a new version of my QEMU patch for the atomic helpers to Peter.
* Tested the Android beagle image on a real beagle - it fails in
pretty much the same way as the QEMU run.
== Other ==
* Had a brief look at bug 825711 - scribus FTBFS on ARM. Qt is built
to define qreal as float on ARM, whereas it's double on most other
platforms. scribus has a qreal variable and something it has defined
as a double, and passes both to a template that requires two
arguments of the same type. Not really sure which one is to blame here!
I'm on holiday next week.
Dave
== GDB ==
* Created and published Linaro GDB 7.3 2011-08 release.
* Analyzed --with-sysroot=remote: testsuite failures,
and opened bug LP #829595.
* Reviewed Yao's latest Thumb-2 displaced stepping patch.
== Schedule ==
* I'll be on vacation 08/23 through 08/31.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
Hi,
* continued to work on getting libunwind support for remote unwinding
upstream
* reworked some of the code to address concerns from the ml
* now upstream!
* made smaller fixes to have another libunwind testcase passing
* interfaced with the Linaro Android group to solve an issue where a
compile was failing when using -O3
* turned out that the Linaro GCC vectorizes a loop by generating some
neon instructions
* unfortunately the gas of the 2.20.1 binutils (that is currently
used by the Linaro Android toolchain) doesn't properly understand the
alignment restrictions of the generated asm code and throws an error
* this has been fixed upstream and using a gas from recent binutils
fixes the issue
* Bernhard is already working on getting newer binutils in their
Android toolchain build system
* continued the work to get libunwind building on Linaro Android
* wrote an Android.mk and got an initial libunwind.so built (ugly
hacks involved)
* next step is to modify debuggerd to make use of libunwind.so
Note: Next week I'll be on vacation.
Regards
Ken
== This week ==
* Looked at LP #823711. Turned out to be a problem with symbol
visibility in libgcc.a. Tested a fix that was accepted and applied
upstream. Will backport to upstream release branches, so we should
be able to pull the fix in that way.
* Backported the fix for BZ PR49987 to Linaro 4.6 and 4.5.
* Looked at the regrename bug that Ramana reported on gcc@.
* Looked at why libav wasn't being vectorised. Discussed with Ira.
I think we now have a Plan.
* Submitted address writeback scheduling patches upstream.
* Submitted and applied some tweaks to the rtx cost interface upstream.
* Spent a while trying to figure out what the targetm.rtx_costs
API actually is, and how rtx_cost should use it to evaluate the
cost of a SET. Discussed on gcc@.
* Found that ARM was giving SETs a base cost of 4 instructions.
Benchmarked the cost of "fixing" this. It generally seemed positive.
* Wrote a couple of other rtx cost patches.
== Next week ==
* Backport fix for #823711 to upstream branches.
* Hopefully finish off rtx costs stuff.
* Unless there's a clear outcome from the gcc@ discussion, I think
I'll abandon my idea of using insn_rtx_cost in the new auto inc/dec
patch, and simply sum the cost of every SET. Should be a small change.
Richard
* Started running EEMBC on Panda. Got three errors in the automotive test at
this point.
* Started documenting necessary steps for my start-up task:
https://wiki.linaro.org/Internal/ToolChain/Benchmarks/First%20time%20notes
* Upgraded the Snowball board to the latest version (V3). Created a
corresponding test image for Snowball (Linaro 11.06). There is a problem
with the serial console freezing after a couple of minutes without any
error, not sure if it is a complete crash or just the serial output. The
people I have talked to so far have not experienced the same problem. I will
set up the networking for the board and see where ssh gets me.
Best Regards
Åsa
Hi,
- change of default vector size for auto-vectorization on NEON -
submitted and approved
- continued working on vectorization of widening shifts
- looked into SLP vectorization for libav
- two vacation days
I'll be on vacation on Aug 22-30.
Ira
I put a build harness around libav and gathered some profiling data. See:
bzr branch lp:~linaro-toolchain-dev/+junk/libav-suite
It includes a Makefile that builds a C only, h.264 only decoder and
two Creative Commons licensed videos to use as input.
README.rst has the basic commands for running ffmpeg and initial perf
results showing the hot functions. Dave, 20 % of the time is spent in
memcpy() so you might want to have a look.
The vectoriser has no effect. GCC 4.5 is ~17 % faster than 4.6. I'll
look into extracting and harnessing the functions themselves later
this week.
-- Michael
Hi,
is the Linaro toolchain (esp. gcc) useful on x86/x86_64, or is an
attempt to use the Linaro toolchain with such a target just asking for
trouble?
(No, I'm not secretly an Intel spy ;) Just trying to have some fun
with my desktop machine ;) )
ttyl
bero
The Linaro Toolchain Working Group is pleased to announce the release
of Linaro GDB 7.3.
Linaro GDB 7.3 2011.08 is the first release in the 7.3 series. Based
off the latest GDB 7.3, it includes a number of ARM-focused bug fixes.
This release includes all bug fixes from the latest Linaro GDB 7.2
release that were not already included in FSF GDB 7.3.
In addition, this release fixes:
* LP: #804401 [remote testsuite] Thread support
* LP: #804387 [remote testsuite] Shared library test problems
* LP: #804392 [remote testsuite] Rebuilt executables not copied
* LP: #804396 [remote testsuite] Spurious failures
The source tarball is available at:
https://launchpad.net/gdb-linaro/+milestone/7.3-2011.08
More information on Linaro GDB is available at:
https://launchpad.net/gdb-linaro
The Linaro Toolchain Working Group is pleased to announce the release
of Linaro QEMU 2011.08.
Linaro QEMU 2011.08 is the latest monthly release of qemu-linaro. Based
off upstream (trunk) QEMU, it includes a number of ARM-focused bug fixes
and enhancements.
This month's release is primarily minor improvements:
- Fixes LP:816791: ARMv6 cp15 barrier instructions now work
in linux-user mode as well as system mode
- Support for ARM1176JZF-S core has been added (thanks to
Jamie Iles <jamie(a)jamieiles.com>)
- Add workaround for kernel bug LP:727781 (which has resurfaced
in 3.0) to suppress warnings about bad-width omap i2c accesses
Plus of course new upstream fixes and improvements.
Performance:
When running qemu in system mode with an SD card image we have
determined that performance is best when the image is in writeback
caching mode. This significantly increases the performance of the SD
card (by factors of 10 or more). An example command line option is:
-drive if=sd,cache=writeback,file=my-sd-card.img
Note that cache=writeback may result in data not being written to
disk if the host system powers down unexpectedly (guest crashes
or powerdowns are not a problem).
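
As a small illustration of how that option could be assembled in a
launcher script, here is a Python sketch. The helper name sd_drive_args
is hypothetical; the option string it produces is exactly the form shown
above, and the accepted cache modes are limited to three standard QEMU
ones for the sake of the example.

```python
# A subset of QEMU's -drive cache modes, enough for this sketch.
KNOWN_CACHE_MODES = {"writeback", "writethrough", "none"}


def sd_drive_args(image, cache="writeback"):
    """Return the -drive arguments for an SD card image."""
    if cache not in KNOWN_CACHE_MODES:
        raise ValueError(f"unexpected cache mode: {cache}")
    return ["-drive", f"if=sd,cache={cache},file={image}"]


print(" ".join(sd_drive_args("my-sd-card.img")))
```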
Known issues:
- The beagle and beaglexm models still do not support USB networking
- There may be some problems with running multithreaded programs in
linux-user mode (LP:823902)
The source tarball is available at:
https://launchpad.net/qemu-linaro/+milestone/2011.08
Binary builds of this qemu-linaro release are being prepared and
will be available shortly for users of Ubuntu. Packages will be in
the linaro-maintainers tools ppa:
https://launchpad.net/~linaro-maintainers/+archive/tools/
More information on Linaro QEMU is available at:
https://launchpad.net/qemu-linaro
The Linaro Toolchain Working Group is pleased to announce the 2011.08
release of both Linaro GCC 4.6 and Linaro GCC 4.5.
Linaro GCC 4.6 2011.08 is the sixth release in the 4.6 series. Based
off the latest GCC 4.6.1+svn177703, it focuses on fixing bugs found
during the Android integration and in SMS. This is a quiet release
due to Linaro Connect.
Interesting changes include:
* Updates to 4.6.1+r177703
Fixes:
* LP: #736007 ICE immed_double_const at emit-rtl.c
* LP: #809768 ICE when compiling bionic's libm
* LP: #815777 Inconsistent packaging between tarball and root
directory names
Linaro GCC 4.5 2011.08 is the thirteenth release in the 4.5
series. Based off the latest GCC 4.5.3+svn177552, the release is
focused on maintenance.
Interesting changes in 4.5 include:
* Updates to 4.5.3+r177552
* Now builds for PowerPC
Fixes:
* LP: #736007 ICE immed_double_const at emit-rtl.c
* LP: #809768 ICE when compiling bionic's libm
* LP: #815435 ICE: insn does not satisfy its constraints
The source tarballs are available from:
https://launchpad.net/gcc-linaro/+milestone/4.6-2011.08
https://launchpad.net/gcc-linaro/+milestone/4.5-2011.08
Downloads are available from the Linaro GCC page on Launchpad:
https://launchpad.net/gcc-linaro
Mailing list: http://lists.linaro.org/mailman/listinfo/linaro-toolchain
Bugs: https://bugs.launchpad.net/gcc-linaro/
Questions? https://ask.linaro.org/
Interested in commercial support? inquire at support(a)linaro.org
-- Michael
On Thu, Aug 18, 2011 at 4:16 AM, Richard Earnshaw
<Richard.Earnshaw(a)arm.com> wrote:
> I was just browsing libgmp this afternoon and noticed that it really
> could do with an overhaul to support recent ARM chips.
>
> The ARM code seems to have been written for StrongARM; which is now
> almost obsolete (for example, it loads from a cache line it is about to
> write to in order to pre-allocate the line in the cache).
>
> It doesn't support v4T interworking.
>
> It doesn't make any use of v5 or later instructions.
>
> There is some Thumb(1) code, but again it has no support for
> interworking, is pretty poor and limited in scope.
>
> I'm not sure overall how useful this is to gcc performance; the library
> is needed to build GCC, but I think it's mostly there to support libmpfr.
>
> Nevertheless, there are other apps out there that make use of this
> stuff, including some crypto code, IIRC.
I looked at using gmp as a benchmark some time ago. The assembly
version is twice as fast as the C version already, which is nice. I
assume NEON would be a big improvement as well.
I had a quick poke through the dependencies in Ubuntu and came up with
the following popular packages that use libgmp or libmpfr:
* guile
* python-crypto
* ghc (Haskell)
* maxima
* darcs
Nothing earth shattering but probably worthwhile. I've registered:
https://blueprints.launchpad.net/linaro-toolchain-misc/+spec/improve-libgmp
so that we don't lose it.
-- Michael
Hi there. The 2011.08 release has been spun and is testing up well.
The 4.5 and 4.6 branches are now open so feel free to commit any
approved patches.
-- Michael
> . Would you be interested in adding a Firefox-based benchmark? As a large
> application it is a good testbed for LTO, FDO and other aggressive
> optimizations.
Sorry about the delayed response. I did notice your mail last week but
I was busy with our conference and then the first couple of days this
week have just disappeared with some internal training.
I would be interested in hearing how you get on with LTO and FDO on
ARM. Listening to Honza talking at the GCC unconference in London
about the memory usage for full LTO with trunk I did wonder what would
happen if we tried it on the ARM target to see what we got, but I
never managed to get around to trying anything there :) . We did look
at getting FDO working with Linaro GCC last cycle but there are still
a couple of issues with PGO in Linaro GCC 4.5.
With respect to LTO, the one problem we have currently is that the
Neon intrinsics aren't streamed out and streamed back in. So you might
have a few issues if your code uses arm_neon.h .
https://bugs.launchpad.net/gcc-linaro/+bug/823548 is an example of
this problem. This was fixed upstream and we probably just need to
backport that into our 4.6 tree. I've tried a backport this morning
and I think I have this right finally.
If you could do a build and a firefox benchmark run in about 30-60
minutes by all means please do let us know how you get on and what you
find. We've been steadily trying to improve the performance of the ARM
toolchain and the biggest improvements you'll notice will be with the
vectorizer but there will be other small improvements that you'll
notice in other general areas of code generation. We would be
interested in feedback about what can be done and to add to our queue
of things to look at and improve for the ARM port of GCC.
With respect to the images, Kiko's probably answered that bit.
cheers
Ramana
* GCC
Continued tracking down problems in my various broken patches. Fixed one
bug, investigated two more. Re-submitted the widening multiplies for
testing, and this time it returned with no problems. Yay, I can now
check it in next week.
Merged from upstream GCC 4.5. The launchpad import bug still exists
(although should not for much longer) so I had to ask on #launchpad to
get the imports done. Submitted the merged branch for testing.
Tried to merge GCC 4.6 similarly, but failed. Bzr just refused to play
ball, which was very frustrating. Michael Hope has now done the merge
instead.
* Other
On leave Wednesday and Friday.
* libquantum - running the SMSed version on an ARM machine did not show
significant improvement. Discussed it with Richard Sandiford.
Apparently in the SMS phase the instructions are in DI mode because
the loop contains 64-bit operations, while they are later
generated as 32-bit operations. This makes SMS less accurate and I'm
now looking into a version which disables DI mode operations.
* Started to look at the potential of SMS on libav. Initial runs of
Richard's microbenchmarks with SMS show some regressions as well as
improvements that I'm looking at.
Hi there. I've written up the standard configurations that we use to
build and test Linaro GCC:
https://wiki.linaro.org/WorkingGroups/ToolChain/Configurations/GCC
It includes such things as flags, libraries, and sysroots. You might
find it useful to see what we're testing or, if new to compilers, what
a good starting point is.
-- Michael
== QEMU ==
* Finished off a first cut of the 64bit helper patch to QEMU
- Gave it to Peter and have reworked most of the things he commented on
* This also led into a bit of a rabbit hole of finding various
generic QEMU threading issues
* Tested Peter's 11.08 QEMU release
(I used linaro-fetch-image-ui for the first time to grab the
release images; quite nice, hit
a couple of issues but much nicer than crawling around the site
to find where the hwpacks
are).
== Other ==
* Pinged gcc patches list for more comments on 64bit atomic patch
I'm on holiday the week of 22nd (i.e. the week after next).
Dave
== GDB ==
* Re-tested Linaro GDB 7.3 on Versatile Express (native
& remote testing).
* Committed patch to re-enable remote thread test cases
(#804401) to mainline and Linaro GDB 7.3.
* Reviewed Yao's latest Thumb-2 displaced stepping patch.
== GCC ==
* Patch review week.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
RAG:
Red:
Amber: OMAP3 patch upstreaming is slower progress than hoped
Green:
Current Milestones:
|| || Planned || Estimate || Actual ||
||qemu-linaro 2011-08 || 2011-08-18 || 2011-08-18 || ||
Historical Milestones:
||qemu-linaro 2011-04 || 2011-04-21 || 2011-04-21 || 2011-04-21 ||
||qemu-linaro 2011-05 || 2011-05-19 || 2011-05-19 || n/a ||
||close out 1105 blueprints || 2011-05-28 || 2011-05-28 || 2011-05-19 ||
||complete 1111 planning || 2011-05-28 || 2011-05-28 || 2011-05-27 ||
||qemu-linaro-2011-06 || 2011-06-16 || 2011-06-16 || 2011-06-16 ||
||qemu-linaro-2011-07 || 2011-07-21 || 2011-07-21 || 2011-07-21 ||
== linaro-qemu-11.11 ==
* put together release candidate tarball for 2011.08 release, tested
* added a workaround for omap kernel bug LP:727781 which had been fixed
in 2.6.x but has resurfaced in 3.0
* tarball now ready and only needs releasing next week
== 64-bit-sync-primitives ==
* reviewed David Gilbert's qemu patches to support 64 bit sync primitives
== upstream-omap3-patches ==
* testing/reading Avi's memory API patches to see how they fit in or
clash with the qdevification and other omap3 patches
== other ==
* more investigation/thought about LP:823902 -- qemu bug running
multithreaded programs in linux-user mode
* Manned Linaro demo stand at ARM Partner Meeting (Tue, Wed)
* Meetings: GSoC student x2, toolchain, toolchain standup, 1-2-1
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
15-19 August: KVM Forum and LinuxCon NA, Vancouver
== GCC ==
=== Progress ===
* Linaro sprint last week - one day of fun with broken laptop.
* Looked at how we could get BUILTIN_VECTORIZE_CONVERT to work to allow
vectorizing some of the floating point conversions.
* Fixed PR50022. Took a couple of iterations.
* Internal training for 2 days.
* Dusted off a couple of my old patches and sent them out after testing.
* Next, to get back to the old VFP and ivopts patch.
* Looked at a test failure with -mvectorize-with-neon-quad with Ira.
=== Plans ===
* Continue to look at the test failure with -mvectorize-with-neon-quad
* Finish off optimize_size patch based on comments.
* Finish off case for handling tbh instructions.
* Commit fix for PR50022
* Look at some of the issues with VFP moves and try and get forward with it.
* Look at BRANCH_COST results.
Meetings:
* 1-1s
* TCWG calls
* GNU Toolchain planning meeting.
* Some patch review and bugzilla triaging.
Absences.
* 1st Aug - 5th August - Linaro sprint.
* 8th - 9th August - Internal training.
* 29th Aug - Sept. 2 - Holiday booked and approved.
* 31st Oct - 4th Nov - Linaro Summit Orlando - Travel to be booked.
== This week ==
* Looked at a bug report that the fix for LP #736007 had caused regressions
on powerpc-darwin. It turned out to be a target-specific bug; the
backend has the same const_vector code as i386 and spu, but the fix for
PR34856 was never applied there. I'll submit the patch (and backport to
Linaro 4.6) once the bug submitter has had a chance to test it.
* Experimented with -falign-loops. Found that it triggered a bug in the
ARM minipool layout code. Posted patch upstream and committed.
Backported to 4.6.
* Committed patch to allow globs in define_bypass.
* Updated auto inc/dec patch after comments from Bernd and Stephen.
I'm pretty happy with it now, but there are a couple of prerequisite
patches I need to sort out first.
* Started getting those prerequisites ready.
* Decided that we needed something a bit more subtle than my original
insn_rtx_cost patch: at the moment, we simply don't use rtx costs
for lvalues. Wrote a series of patches to "improve" the rtx_cost
interface, including providing the outer operand number and an
indication of whether the rtx is an lvalue or an rvalue.
* Upgraded my laptop. This turned out to be more eventful than
anticipated, and ended up taking a whole day.
== Next week ==
* Post auto inc/dec preparatory patches for review. Hopefully post
an RFA for the pass itself.
Richard
Hi,
* worked on getting the remote unwind support for ARM upstream
* noticed when building a recent android image of the
linaro_android_2.3.4 branch for the panda the init.rc attempts to mount
wrong partitions
* tracked down the commit and opened a bug
* linaro android team fixed it real quick
* started to integrate libunwind into Android
* two issues here:
- the build system requires an Android.mk (while libunwind is
autoconf+libtool based)
- libunwind uses some interfaces/headers that are not provided by the
bionic libc
Regards
Ken
On Thu, Aug 04, 2011 at 12:03:00PM -0700, Taras Glek wrote:
> Recently we have been looking at how to squeeze more performance out
> of our toolchain for building Firefox on Android. Mike Hommey
> integrated GCC 4.6 into the android NDK and has been testing
> performance (with mixed results
> http://gcc.gnu.org/ml/gcc/2011-08/msg00096.html).
You should definitely be trying to build using the Linaro 4.5 and 4.6
compiler branches; they are pretty much guaranteed to give you better
performance, and if they don't, we're on the hook to fix it quickly! All
the patches go upstream, so there is no risk of you being stuck on a
fork -- it just makes everything you need available right now.
I'm copying the linaro-toolchain list to make sure that you get the
right people's attention (though if they weren't all coming back from
Connect in Cambridge this week they would have picked the email up
already).
> I like how Linaro is doing regular arm benchmarking, ie
> https://wiki.linaro.org/Platform/Android/AndroidToolchainBenchmarking/2011-…
We do much more than that, but it's not as easy to find right now; for
instance http://ex.seabright.co.nz/helpers/benchcompare is Michael's
regular release benchmark.
> . Would you be interested in adding a Firefox-based benchmark? As a
> large application it is a good testbed for LTO, FDO and other
> aggressive optimizations.
Totally. Let's do it. Can you give me an idea of what boards you are
testing the build on today? Do you have a test suite that we could run
in a reasonable timeframe (hours, not days)?
> We are also looking at setting a developer-friendly android ROM with
> oprofile, perf, systemtap, gdb, debug symbols, etc. It might even be
> beneficial for us to use newer kernels as we explore options like
> kernel-assisted ld.so relocations, etc. That seems similar to
> what Linaro provides in the evaluation ROMS. Is there any chance of
> Linaro providing developer-friendly "evaluation" ROMs for retail
> phones like the Nexus S?
It's indeed pretty similar (we just call them LEBs), and Zach will be
really interested in working with you on this.
As for supporting actual released phones, it lies somewhat outside of
our optimal operating model, and we don't have any hardware available. I
guess we could do a spin for a specific model if we had enough of them
to use by a set of engineers in the different teams. They are so
expensive, though. Do you guys have lots of them?
--
Christian Robottom Reis, Engineering VP
Brazil (GMT-3) | [+55] 16 9112 6430 | [+1] 612 216 4935
Linaro.org: Open Source Software for ARM SoCs
Hi there. This is a heads-up that the name of the Toolchain group
releases will change slightly with next week's release. We're dropping
the respin suffix (the -0) to line up with the new whole of Linaro
naming convention.
What was:
gcc-linaro-4.6-2011.xx-0.tar.bz2
gdb-linaro-7.2-2011.xx-0.tar.bz2
qemu-linaro-0.15-2011.xx-0.tar.bz2
will now be:
gcc-linaro-4.6-2011.xx.tar.bz2
gdb-linaro-7.2-2011.xx.tar.bz2
qemu-linaro-0.15-2011.xx.tar.bz2
Earth shattering, eh? I've taken the opportunity to write up our
naming convention at the same time:
https://wiki.linaro.org/WorkingGroups/ToolChain/Naming
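
For anyone with scripts that depend on the old names, the mapping can be
sketched in a few lines of Python. The regular expression and the
function name are illustrative assumptions based only on the examples
above, with a concrete month substituted for the xx placeholder.

```python
import re

# <stem ending in YYYY.MM>-<respin><.tar.ext>, e.g. gcc-linaro-4.6-2011.08-0.tar.bz2
_RESPIN = re.compile(r"^(?P<stem>.+-\d{4}\.\d{2})-\d+(?P<ext>\.tar\.\w+)$")


def drop_respin_suffix(name):
    """Map an old-style tarball name to the new one; leave others untouched."""
    match = _RESPIN.match(name)
    if match is None:
        return name
    return match.group("stem") + match.group("ext")


print(drop_respin_suffix("gcc-linaro-4.6-2011.08-0.tar.bz2"))
```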
-- Michael
Dave Martin <dave.martin(a)linaro.org> writes:
> However, there's not really anything fundamentally
> architecture-specific about this problem, and ideally the solution and
> the directives should not be architecture-specific either.
> One option which appeals to me is to have some directives which can
> exist across all architectures, and do something analogous to what
> .set push and .set pop do on MIPS.
FWIW, this sounds like a really good idea to me. I won't argue about
the syntax (I have no particular preference).
> I feel that the environment should also include global,
> target-independent state such as the current macro mode (.altmacro
> versus .noaltmacro) and current ELF section stack state, but not
> symbols or macro definitions themselves.
Sounds reasonable. To state the obvious, we'd have to make the existing
target-dependent groupings (like .set push/pop on MIPS) work with this
new scheme, but those directives mustn't affect this extra target-independent
information. So the new directives would interact with both the
traditional .pushsection and the traditional target-dependent directives,
even though those two features would otherwise remain independent.
That is, .pushsection and .set push/pop operate on conceptually
separate stacks whose pushes and pops can be freely mixed.
But .pushsection and the new directives would need to be
strictly stacked; pops must have the same form as their
corresponding pushes. Combinations of .set push/pop and
the new directives would also need to be strictly stacked.
Nothing a bit of code can't handle though.
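To make the stacking rules concrete, here is a small Python model of them.
It is only a sketch of the rules as stated in this mail, not of any gas
code: "section" stands for .pushsection/.popsection, "target" for a
target-specific pair such as MIPS .set push/pop, and "env" for the proposed
new directives, which do not exist yet. All names are made up for
illustration.

```python
# "section" and "target" pushes may interleave freely with each other,
# but each of them must nest strictly with "env".
INDEPENDENT = {"section": "target", "target": "section"}


def check_nesting(directives):
    """Validate a sequence of ("push" | "pop", kind) pairs.

    Returns None if the sequence obeys the rules, else an error string.
    """
    stack = []  # kinds of the currently open pushes, oldest first
    for index, (action, kind) in enumerate(directives):
        if action == "push":
            stack.append(kind)
            continue
        # Walk down from the top, skipping open pushes that are allowed
        # to interleave with this kind, until a candidate match is found.
        pos = len(stack) - 1
        while pos >= 0 and stack[pos] == INDEPENDENT.get(kind):
            pos -= 1
        if pos < 0:
            return f"directive {index}: pop of '{kind}' without a matching push"
        if stack[pos] != kind:
            return (f"directive {index}: pop of '{kind}' crosses an open "
                    f"'{stack[pos]}' push")
        del stack[pos]
    if stack:
        return f"unclosed pushes at end of input: {stack}"
    return None
```

With this model, push section / push target / pop section / pop target is
accepted, while push section / push env / pop section / pop env is rejected
because the section pop crosses an open env push.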
Richard
Hi all,
On ARM, we've now hit the problem a few times of temporarily
overriding the assembler state (or rather, not being able to do this
reliably). For example, there's sometimes a need to assemble a few
instructions for a different architecture version so that we can
optionally execute or skip them at run-time; this is not really possible
at present.
This sort of feature is especially useful in macros but can be useful
elsewhere too.
There seem to be some target-specific solutions to this problem
already. MIPS has its "option stack", maintained by .set push and
.set pop directives. From the documentation, it sounds like this
saves/restores a somewhat comprehensive set of state, but doesn't make
much syntactic sense on arches which use .set to define symbols (i.e.,
most arches). PowerPC also has .machine push and .machine pop, but
those only act on one specific aspect of the assembler state, and
therefore aren't as portable a concept.
However, there's not really anything fundamentally
architecture-specific about this problem, and ideally the solution and
the directives should not be architecture-specific either.
One option which appeals to me is to have some directives which can
exist across all architectures, and do something analogous to what
.set push and .set pop do on MIPS.
My names would be .pushenv and .popenv, but obviously, they can be
named any way people like. (For now I'm stealing groff's
"environment" terminology to refer to such saved and restored state --
hence "env". Again, the nomenclature is arbitrary.)
These directives would save and restore a target-specific set of
state, with the philosophy that anything that can reasonably be
changed with a directive mid-file can also be saved and restored with
.pushenv/.popenv. Effectively, .popenv would be equivalent to issuing
the necessary set of assembler directives to restore the assembler
state to whatever it was at the last .pushenv (including the state of
the environment stack itself).
I feel that the environment should also include global,
target-independent state such as the current macro mode (.altmacro
versus .noaltmacro) and current ELF section stack state, but not
symbols or macro definitions themselves. Currently, neither the macro
mode nor the behaviour of .previous is reliably restorable after being
changed (unless I missed something). This can result in unexpected
behaviour after a macro which switches sections or changes the macro
mode. This seems unfortunate since on most arches there is no
syntactic difference between a machine instruction and a macro
invocation -- hence in the presence of macros, the only time you're
really 100% certain what .previous will do is immediately after a
.pushsection or .section directive (which obviously is not much use).
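
The .previous problem can be shown with a toy Python model of the
current/previous section pair. This is a simplified sketch, assuming only
that .section makes the old current section the "previous" one and that
.previous swaps the two; the pushenv/popenv methods model the proposed
directives, which do not exist, and the ".fixup" macro is a made-up example.

```python
class SectionState:
    """Toy model of the assembler's current/previous section pair."""

    def __init__(self, current=".text"):
        self.current = current
        self.previous = None
        self._env_stack = []  # used only by the proposed .pushenv/.popenv

    def section(self, name):
        # .section NAME: the old current section becomes "previous".
        self.previous, self.current = self.current, name

    def swap_previous(self):
        # .previous: swap the current and previous sections.
        self.current, self.previous = self.previous, self.current

    def pushenv(self):
        # Proposed .pushenv: snapshot the state that should be restorable.
        self._env_stack.append((self.current, self.previous))

    def popenv(self):
        # Proposed .popenv: restore the most recent snapshot.
        self.current, self.previous = self._env_stack.pop()


def fixup_macro(state, use_env=False):
    """Stand-in for a macro that emits something into another section."""
    if use_env:
        state.pushenv()
    state.section(".fixup")
    if use_env:
        state.popenv()
    else:
        # Gets back to the caller's section, but clobbers "previous".
        state.swap_previous()


def caller(use_env):
    """Caller switches .text -> .data, runs the macro, then uses .previous."""
    state = SectionState(".text")
    state.section(".data")
    fixup_macro(state, use_env)
    state.swap_previous()  # the caller expects to land back in .text
    return state.current
```

Here caller(False) ends up in ".fixup" rather than ".text", because the
macro's own .previous changed what "previous" means; caller(True) ends up in
".text" as intended.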
Comments are welcome -- at the moment this is just a fuzzy idea for a
feature which might prove useful.
I haven't investigated the implementation implications -- maybe it
could be built straightforwardly around the current MIPS directives.
Cheers
---Dave
Hi,
* fixed PR 50014 and 50039 - to be backported to linaro-gcc
* tested the patch to change the default vector size on NEON
* found one test that fails with quad-words -
gcc.c-torture/execute/mode-dependent-address.c. Debugging it with
Ramana.
* started looking into widening shifts
Vacation plans:
next week Monday and Wednesday
and August 22 - 30.
Ira
Hi,
ld in the current (4.6-2011.07-0-8-2011-07-25_12-42-06) Android
toolchain fails to link uboot:
arm-eabi-ld: /mnt/user/bero/android-iMX53-20110716151649/out/target/product/iMX53/obj/u-boot/lib/libgeneric.o:
Unknown mandatory EABI object attribute 44
arm-eabi-ld: failed to merge target specific data of file
/mnt/user/bero/android-iMX53-20110716151649/out/target/product/iMX53/obj/u-boot/lib/crc16.o
arm-eabi-ld: /mnt/user/bero/android-iMX53-20110716151649/out/target/product/iMX53/obj/u-boot/lib/libgeneric.o:
Unknown mandatory EABI object attribute 44
arm-eabi-ld: failed to merge target specific data of file
/mnt/user/bero/android-iMX53-20110716151649/out/target/product/iMX53/obj/u-boot/lib/crc32.o
arm-eabi-ld: /mnt/user/bero/android-iMX53-20110716151649/out/target/product/iMX53/obj/u-boot/lib/ctype.o:
Unknown mandatory EABI object attribute 44
arm-eabi-ld: failed to merge target specific data of file
/mnt/user/bero/android-iMX53-20110716151649/out/target/product/iMX53/obj/u-boot/lib/ctype.o
arm-eabi-ld: /mnt/user/bero/android-iMX53-20110716151649/out/target/product/iMX53/obj/u-boot/lib/div64.o:
Unknown mandatory EABI object attribute 44
arm-eabi-ld: failed to merge target specific data of file
/mnt/user/bero/android-iMX53-20110716151649/out/target/product/iMX53/obj/u-boot/lib/div64.o
arm-eabi-ld: /mnt/user/bero/android-iMX53-20110716151649/out/target/product/iMX53/obj/u-boot/lib/errno.o:
Unknown mandatory EABI object attribute 44
arm-eabi-ld: failed to merge target specific data of file
/mnt/user/bero/android-iMX53-20110716151649/out/target/product/iMX53/obj/u-boot/lib/errno.o
arm-eabi-ld: /mnt/user/bero/android-iMX53-20110716151649/out/target/product/iMX53/obj/u-boot/lib/ldiv.o:
Unknown mandatory EABI object attribute 44
arm-eabi-ld: failed to merge target specific data of file
/mnt/user/bero/android-iMX53-20110716151649/out/target/product/iMX53/obj/u-boot/lib/ldiv.o
I believe this is already fixed in upstream binutils (or at least in
hjl's 2.21.52.0.2 release from kernel.org /pub/linux/devel/binutils).
ttyl
bero
== Last week (Linaro Connect) ==
* Reran libav comparisons after Ira's fix for excessive promotion.
The vectorized versions are now at least as good as the non-vectorised
ones. Updated wiki page with new asm output and microbenchmark results.
* More work on SMS. I have some patches that wire up the ddg code
to IV analysis. It gave some nice benchmark improvements, but also
some regressions. Traced the regressions down to cases where the
schedule for small iis generated too many moves. E.g. in a small
microbenchmark, we were able to schedule 6 instructions with an
ii of 3 (i.e. in a loop iteration of 3 cycles), but then needed
to add ~9 moves in order to keep the dependencies correct.
We got much better code with a larger ii and fewer moves.
Wrote a patch to estimate how many moves would be added, and to try
a larger ii if the number of moves is too high. This improved the
results for one benchmark independently of the iv patch, and had no
effect on the others.
Discussed this with Revital, who said that Mustafa had tried a similar
thing but seen no benefit.
* Got powerpc-ibm-aix5.3 bootstraps working. Needs a few local fixes
due to C++ bootstrapping. Used it to test a couple of preparatory
patches for the IV work. Submitted those patches upstream.
* Ran benchmarks with -fno-schedule-insns after seeing that the first
scheduling pass was responsible for the main NEON-vs.-non-NEON
regression in EEMBC. It fixed that case, but as expected,
made others worse. Mentioned this to Ramana, who pointed me at
-fsched-pressure.
Reran the benchmarks with -fsched-pressure instead of
-fno-schedule-insns. It too fixed the main regression,
and improved a couple of other tests too. It showed a regression
in another test though. Looked at that regression. It was a case
where many registers were live across a loop, but not used in it.
This was causing the loop to have a very conservative schedule.
It would be better to spill some of the other registers instead.
Wrote a patch to take loops into account, and it seemed to do
the right thing for EEMBC. Sent it to Andreas, after Ulrich
mentioned that he had been looking at -fsched-pressure problems
on s390. Andreas is away for a while, though, so I might put this
on the back burner until he gets back.
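
For the SMS item above, the "estimate the moves, then try a larger ii" idea
can be sketched in Python as follows. This is a deliberately simplified
model, not the patch: it assumes a value that stays live for N cycles needs
about N // ii register moves, and it keeps the lifetimes fixed while ii
grows, whereas a real scheduler would reschedule for each ii. The lifetimes
and the threshold in the usage note are invented numbers, picked so that
ii = 3 needs 9 moves.

```python
def estimate_moves(lifetimes, ii):
    """Estimate the register moves needed by one modulo schedule.

    A value is redefined by the next iteration every `ii` cycles, so a
    value that is live for `lifetime` cycles needs roughly lifetime // ii
    copies to stay available (a simplification).
    """
    return sum(lifetime // ii for lifetime in lifetimes)


def choose_ii(lifetimes, min_ii, max_ii, max_moves):
    """Return the smallest ii in [min_ii, max_ii] with an acceptable estimate.

    Returns None if every candidate needs more than `max_moves` moves.
    """
    for ii in range(min_ii, max_ii + 1):
        if estimate_moves(lifetimes, ii) <= max_moves:
            return ii
    return None
```

With lifetimes [9, 7, 6, 4, 3, 1], an ii of 3 is estimated at 9 moves and an
ii of 4 at 5 moves; choose_ii(lifetimes, 3, 8, 3) settles on an ii of 5,
where the estimate drops to 3.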
== This week ==
* SMS
* auto inc/dec
* libav, perhaps
Richard
Hi,
* committed upstream a patch that reduces over-promotion of vector operations
* started to work on a new version of the patch to change the default
vector size for Neon
* attended Linaro Connect
Ira
* Committed a set of SMS patches to trunk and gcc-linaro branch.
* Implemented a hack to evaluate the potential of SMS on SPEC2006/libquantum.
* Involved in a non-Linaro issue.
== QEMU ==
* After discussion with Peter started writing QEMU fixup for 64bit
atomic helper version location.
* Sent fixes for soc-dma code to qemu list
* Trying to understand just how much of omap_dma's code is needed.
== Other ==
* Travelling to/from Connect
* Wanted to dial into some of the sessions in Corpus and Magdelen
rooms but the remote audio from them was unusable.
Dave
Hi,
Libunwind:
* finished initial ARM support for remote unwinding (libunwind-ptrace)
Android:
* took a closer look at the debuggerd
* got the perflab benchmark running on my PandaBoard using Linaro GCC
Misc:
* remotely attended some Linaro Connect Android sessions
Regards
Ken
== GDB ==
* Created Linaro GDB 7.3 branch
* Ported all remaining feature patches from Linaro GDB 7.2
* Backported mainline patches to fix remote test issues:
- Fixed #804387 Shared library test problems
- Fixed #804392 Rebuilt executables not copied
- Fixed #804396 Spurious failures
* Committed mainline patch to fix dlopen test cases
for remote testing (#804387).
* Committed mainline patches to fix misc. other remote
test problems (#804396).
== Misc ==
* Attended Linaro Connect in Cambourne.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E.
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk
Wittkopp
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
I've updated:
https://wiki.linaro.org/RichardSandiford/Sandbox/NeonLibAv
so that it gives the output for current trunk, including Ira's commit
yesterday to reduce the amount of overpromotion. I also reran the
microbenchmarks. The good news is that the vectorised code is now
better in all cases than the non-vectorised code.
The biggest winner from last time was rgb24tobgr16_C(). It used to be
much worse with vectorisation due to lots of excessive widening.
Thanks to Ira's patch, the loop now looks pretty respectable,
and is ~3.25x faster than the non-vectorised code.
As well as using a more recent compiler, the new version also uses
-mvectorize-with-neon-quad. Once again it shows a significant improvement
over the default.
Richard
Continued work on widening multiplies. I've identified another cause for
the bootstrap failure, and submitted the new version for testing.
Continued trying to find out how my thumb2 constants patches are broken.
This is taking ages due to the time it takes to turn around a bootstrap
build on my IGEP board.
Tried to get the CS Panda boards to work again. They'll do the bootstrap
builds much faster (if still not quickly), but are no longer very well.
All my attempts to bring them back up remotely have failed. I've
discovered that the device the serial console on one was connected to
has been relocated to the new Mentor Graphics board lab, so this might
explain some of it ....
Chaired the Monday and Thursday meetings in Michael's absence.
Travelled to the Linaro Connect event in Cambourne, near Cambridge.
Other:
More machine trouble. I keep thinking I have the display issues solved,
and then it starts up with all the windows displayed double sized, but
requiring mouse clicks in the correct location .... typically this
happened just when I needed access to the pin number for the Monday
meeting. This hasn't happened since Monday, so hopefully it's now ironed
out ... this sort of thing does not happen with Windows. :(
----
Upstream patches requiring review:
* NEON scheduling patch
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html
* Looking into SMS patches sent to mainline which expand SMS
functionality to avoid using doloop. The patches resolve the recent
bootstrap failure on mainline.
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01807.html
* Continue looking into 462.libquantum.
Valgrind wants a less stripped ld-2.12.1.so or it won't work. The build
process (that Michael Hope put together) just downloads the
libc6_2.12.1-0ubuntu6_armel.deb, and the ld-2.12.1.so in there is fully
stripped. I thought I'd be able to just get the
libc6-dbg_2.12.1-0ubuntu6_armel.deb instead, thinking that was just the
pre-stripped version of these libs -- but apparently it's not, because
trying to use those libs instead of the stripped ones results in undefined
symbols. For example, ld-2.12.1.so defines _rtld_global -- but
libc-2.12.1.so is looking for _rtld_global@@GLIBC_PRIVATE, so
_rtld_global@@GLIBC_PRIVATE
ends up undefined. (Ditto for __tls_get_addr, __libc_enable_secure,
_dl_argv, etc.)
I'm not sure who actually builds these packages (they're retrieved from:
http://ports.ubuntu.com/pool/main/e/eglibc/), but if anyone has any
suggestions on how to get past this, I'd be most appreciative. (I've got
angry developers trying to track down memory issues, who are about to come after
me with torches and pitchforks :P )
Thanks,
Diane
== GDB ==
* Committed second mainline patch to fix re-built executable
remote test problems (#804392).
* Prepared for rebasing Linaro GDB on top of GDB 7.3 release.
== Misc ==
* Prepared for Linaro Connect.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
Hi,
* Monday was full of IBM internal meetings
* Android
* got a self-built LEB and generic version 2.3.4 of Linaro Android
running on my PandaBoard (built with the gcc 4.6 07 release plus the
patch that Richard made)
* requires libicui18n.so (external/icu4c/i18n) to be built with -O2
* ran into a few issues (816491, 807230)
* libunwind:
* simplified the local unwinding (there is no need to touch the ARM
exidx table segment when looking it up)
* fixed a bug (corner case: the info of the IP to be unwound is
described by the last unw entry)
* made some progress on the remote unwinding via ptrace
* remotely searching for the unw entry within the exidx table segment
* next step is to remotely extract the unw insns
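
The "last unw entry" corner case in the libunwind items above is the classic
edge of a sorted-table lookup. Below is a minimal Python sketch of such a
lookup; it is not the libunwind code, and it assumes a table that stores only
function start addresses, so the last entry has no successor to bound it. The
addresses in the usage note are invented.

```python
from bisect import bisect_right


def find_entry(starts, ip):
    """Return the index of the table entry covering `ip`, or None.

    `starts` is a sorted list of function start addresses. Entry i covers
    addresses from starts[i] up to (but excluding) starts[i + 1]; the last
    entry has no successor, so here it covers everything from its start
    onwards. A real unwinder would also bound it by the end of the segment.
    """
    index = bisect_right(starts, ip) - 1
    return index if index >= 0 else None
```

With starts = [0x8000, 0x8040, 0x80a0], an IP of 0x8044 maps to entry 1, any
IP at or beyond 0x80a0 maps to the last entry (index 2), and an IP below
0x8000 has no entry.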
Regards
Ken
RAG:
Red:
Amber: OMAP3 patch upstreaming is slower progress than hoped
Green: various outstanding patches accepted upstream in time for 0.15
Current Milestones:
|| || Planned || Estimate || Actual ||
||qemu-linaro 2011-08 || 2011-08-18 || 2011-08-18 || ||
Historical Milestones:
||qemu-linaro 2011-04 || 2011-04-21 || 2011-04-21 || 2011-04-21 ||
||qemu-linaro 2011-05 || 2011-05-19 || 2011-05-19 || n/a ||
||close out 1105 blueprints || 2011-05-28 || 2011-05-28 || 2011-05-19 ||
||complete 1111 planning || 2011-05-28 || 2011-05-28 || 2011-05-27 ||
||qemu-linaro-2011-06 || 2011-06-16 || 2011-06-16 || 2011-06-16 ||
||qemu-linaro-2011-07 || 2011-07-21 || 2011-07-21 || 2011-07-21 ||
== upstream-omap3-patches ==
* omap-gpmc patches now all cleaned up; I think I need to look at
qdevifying this device before submitting patches, though
* sent patch for bug which makes n810 model crash when key is pressed
* sent a pull request collecting together the patches submitted so far
== other ==
* qemu 0.15: put together pull request for ARM patches I think should
go into this release; wrote ARM-related bits of the release notes
* helped GSoC student track down a bug causing android not to boot
* LP:816791: tracking down issues with running mono under qemu
(combination of a couple of known qemu bugs and a mono bug)
* admin/prep for upcoming travel (cambourne, vancouver, orlando)
* reviewing pl041 patches which add audio support to versatilepb
and vexpress models
* mailing list discussion of possible new qemu object model
* lots of meetings this week (toolchain, standup, doughnuts, team
comms x2)
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
1-5 August: Linaro sprint 1111
15-19 August: KVM Forum and LinuxCon NA, Vancouver
the current gcc-4.6/eglibc is now built multilib'd for -mfloat-abi=softfp|hard,
including the GCC runtime libraries. I hope that the gcc cross builds will pick
this up soonish, not needing to build the cross compiler twice for softfp and
hard float-abi.
Matthias
== 64 bit atomics ==
* Sent updated set of 64bit atomic patches to gcc list with fixes
from previous review
* Started hunting for users of 64bit atomics other than membase;
jemalloc, sdl and boost lock free look like possibilities, but I've
not looked at them hard yet
== QEmu ==
* Released fix for last SD card block access error
- Vincent Palatin released a bunch of SD card fixes a few hours
later - that included a fix to the same bug; however it does look like
he has a bunch of other stuff we should keep sync'd with.
* Changing caching mode to writeback on the block layer fixes bug
732223 (hangs on heavy IO) - goes from 130KB/s to 8MB/s on vexpress
- Asked mailing list whether that's reasonable to make as default for SD
* Looking at path from CPU->MMC/SD card - the DMA on OMAP is pretty
inefficiently emulated, but the soc_dma code has an unused special
case for dma'ing to hardware, looks promising but need to figure
out how to use it and if it works.
* Comparing Vincent's SD card patch with earlier meego patches;
partial overlap.
== Other ==
* Pinged libc-ports for comments on my optimised memchr patch
* Image testing
Next week: I intend to be in Cambourne on the afternoons of Monday,
Wednesday and Friday.
Dave
Hi,
I am checking the coverage of the NEON instructions mostly by writing
tests in C to check which instructions are generated (after
auto-vectorization) and which are not.
I put here https://wiki.linaro.org/IraRosen/Sandbox/InstructionCoverage
the list of things that I've checked till now.
Ira
Spun release tarballs for Linaro GCC 4.5 and 4.6. Sent them to Michael
Hope and Matthias Klose.
Testing for my widening multiplies patches revealed a bug when the
accumulate value had a different type. The problem is easily fixed, so
I've created a patch, submitted it, and now it's approved upstream.
Same again, this time with a bug involving constant integers. Again,
easily fixed, submitted, and approved.
Nobody had reviewed the first patch in my series - Richard Guenther had
reviewed all the others, but wasn't happy to review the expand pass. So,
I asked newly crowned RTL Maintainer Richard Sandiford to review it, but
apparently it's the wrong bit of the back-end, so I asked Bernd instead.
Bernd kindly reviewed and approved it, so now the whole series is ready
to commit if only my test comes back clean.
Continued trying to figure out why my thumb2 constants patch is broken.
So far, no further progress. It might be that Michael's build system is
confused, but it's looking likely to be a real bug.
----
Upstream patches requiring review:
* NEON scheduling patch
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html
- Opened PR49789 to record the bootstrap failure with SMS flags.
- SPEC2006/libquantum: Wrote a hack to apply SMS on the hot loop. Need
to make it more accurate.
- Pinged SMS patches in mainline.
- Looking with Ramana at the effect of the Tree reassociation
improvement patch on bwaves
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00904.html
== 64 bit atomics ==
* Updated gcc patches as per comments from Ramana and Joseph; build
currently cooking on Panda
== Qemu ==
* Testing Peter's pre-release, finding bug on beagle (that he
tracked down to x-loader change)
* Found cause of occasional SD card errors I was seeing (SD: CMD12
in a wrong state); I'll cut a patch next week, but the bug is that
writing the last sector throws an error and also leaves the card in
the wrong state
* Added a bunch of tracing code to the SD card layer
* With the tracing code and fixing the other bug I'm starting to
understand how it works - and half a dozen reasons that the emulation
is really slow; whether that's the cause of the reported recoverable
lock ups under load is an interesting question; I plan to fix the
obvious problems and see how it goes.
Dave
== GDB ==
* Committed mainline patch to fix re-built executable remote test
problems (#804392).
* Committed two more mainline patches to fix remote test issues.
== GCC ==
* Patch review.
* Determined root cause of bug #809768 (ICE in bionic libm).
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
== GCC ==
=== Progress ===
* ivopts patch to minimise the amount of VFP moves to integer
registers because of auto-inc - sent out for review. It appears to
test fine and reduces the number of FP to integer moves in certain
SPEC2k6 benchmarks by about 20%
* More cases with VFP moves identified and some more patches coming
out soonish. One case is where we have moves from VFP regs to integer
regs because we allow POST_MODIFY_DISP. Another case, from scimark,
has moves from integer registers to VFP registers because, I suspect,
the order of constraints in movdf_vfp has integer registers before FP
registers while movsf_vfp doesn't.
* Panda died a couple of times again because of power glitches in the
office - restarted runs for BRANCH_COST.
* Sorted out travel plans for UDS Orlando. Need to book tickets.
* Some time spent on getting the Eagle boards working.
* Bug triage and some patch review.
=== Plans ===
* Benchmark the ivopts patch to see what happens.
* Some issues with my last patch on movdi_vfp. I think I've missed a
set of ce_count, and was wondering why there was an Ada failure
with things outside IT blocks. There appears to be an Ubuntu bug for that.
* Disable POST_MODIFY for VFP mode values and see what happens and
change the order of the constraints to have loads to VFP registers
before loads to core registers for movdf_vfp and thumb2_movdf_vfp.
* Look at effects of auto-inc-dec with the VFP mode stuff.
* Look at EPILOGUE_USES and clear more of my patch queue.
Meetings:
* 1-1s
* TCWG calls
Absences.
* 1st Aug - 5th August - Linaro sprint.
* 8th - 9th August - Internal training.
* 29th Aug - Sept. 2 - Holiday booked and approved.
* 31st Oct - 4th Nov - Linaro Summit Orlando - Travel to be booked.
RAG:
Red:
Amber: OMAP3 patch upstreaming is slower progress than hoped
Green: various outstanding patches accepted upstream in time for 0.15
Current Milestones:
|| || Planned || Estimate || Actual ||
||qemu-linaro-2011-07 || 2011-07-21 || 2011-07-21 || 2011-07-21 ||
Historical Milestones:
||qemu-linaro 2011-04 || 2011-04-21 || 2011-04-21 || 2011-04-21 ||
||qemu-linaro 2011-05 || 2011-05-19 || 2011-05-19 || n/a ||
||close out 1105 blueprints || 2011-05-28 || 2011-05-28 || 2011-05-19 ||
||complete 1111 planning || 2011-05-28 || 2011-05-28 || 2011-05-27 ||
||qemu-linaro-2011-06 || 2011-06-16 || 2011-06-16 || 2011-06-16 ||
== linaro-qemu-11.11 ==
* tracking down a problem with very recent beagle snapshots not booting
in qemu; this turns out to be an x-loader bug (LP:813407)
* made the release
== other ==
* upstream are planning to branch for 0.15 release today
* most of the outstanding ARM patches have now been pulled
* reviewed a patch adding ARM1176 support
* wrote a patch fixing the feature flags for ARM1136r1 so it includes
the TLS registers (needed as newer kernels now try to use them)
* submitted some patches fixing a few VFP UNDEF/UNPREDICTABLE cases so
they don't crash qemu
* submitted patch to make v6 cp15 barrier insns work in linux-user mode
* looked at a reported problem where linux kernel versions 2.6.39+
display graphics wrongly. This turns out to be that 2.6.39 (or 38)
changed (inadvertently?) from programming the versatilepb CLCD as
RGB565 to setting it to BGR565; qemu wasn't implementing the latter.
Dusted off some PL111 support patches, added the mux control support
for PL110 and submitted them.
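
The RGB565/BGR565 mix-up in the last item is easy to picture with a small
Python sketch. It assumes the usual 5:6:5 packing, where the first-named
component sits in the top five bits; it does not model the actual PL110
register bits or the qemu code, and the function names are made up.

```python
def pack_565(red, green, blue, bgr=False):
    """Pack 8-bit colour components into one 16-bit 5:6:5 pixel.

    With bgr=False red occupies the top five bits (RGB565); with bgr=True
    blue does (BGR565). Green always sits in the middle six bits.
    """
    high, low = (blue, red) if bgr else (red, blue)
    return ((high >> 3) << 11) | ((green >> 2) << 5) | (low >> 3)


def unpack_rgb565(pixel):
    """Decode a 16-bit pixel as RGB565, returning 5/6/5-bit components."""
    return (pixel >> 11) & 0x1F, (pixel >> 5) & 0x3F, pixel & 0x1F
```

Pure red packs to 0xF800 as RGB565 but to 0x001F as BGR565, so a display
model that only understands RGB565 decodes the latter as pure blue: red and
blue come out swapped.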
Current qemu patch status is tracked here:
https://wiki.linaro.org/PeterMaydell/QemuPatchStatus
Absences:
1-5 August: Linaro sprint 1111
15-19 August: KVM Forum and LinuxCon NA, Vancouver
== This week ==
* Wrote a fix for 809768. Accepted upstream.
* Looked at upstream PR 49742 (the failures seen with predictive commoning).
Accepted upstream.
* More shrink-wrap review.
* Sent auto-inc-dec changes out for comments. Got some good private
feedback (in the sense of being positive, and having good suggestions).
* Sent a related define_bypass patch out for review.
* Started looking at sms-and-memory-dependencies.
== Next week ==
* Deal with auto-inc-dec suggestions.
* More SMS.
The Linaro Toolchain Working Group is pleased to announce the release of
both Linaro GCC 4.6 and Linaro GCC 4.5.
Linaro GCC 4.6 is the fifth release in the 4.6 series. Based off the latest
GCC 4.6.1+svn175677, it adds new optimisations and vectoriser improvements.
Interesting changes include:
* Updates to 4.6.1+r175677
* Improves support for vector shifts by a constant
* Improves handling of memory dependencies in the SMS optimisation
* Improved vectorisation of widening multiplies by keeping the operands
smaller for longer
* Improves the peeling of potentially misaligned vectorised loops
* Improved vectorisation of signed and unsigned widening multiplies by a
constant
* Merges the new upstream Cortex-A5 tuning
Fixes:
* LP: #721531: Don't optimise out testing of the Thumb mode bit on function
pointers
* LP: #723185: ICE in reload_cse_simplify_operands when compiling with -marm
-mfpu=neon
* LP: #744754: ICE in *neon_movoi when using NEON intrinsics
* LP: #791327: ICE due to using the stack pointer in RSB instructions
* LP: #797748: ICE building SPEC2006 403.gcc emit-rtl.c
* LP: #803232: ICE on code that uses vld4q_s16() NEON intrinsic
* LP: #809435: Omit building the target libiberty when building a cross
compiler
* LP: #807573: ICE in *truncsisf2_vfp: Could not find a spill register
* PR 49385: Ensure at least one of the operands is a register in
thumb2_movhi_insn
* Fixes an EABI unwinding bug that improves interoperability with armcc
* Fixes a DWARF 2 problem exposed through shrinkwrap.
* Fixes a bug in __builtin_isgreaterequal
Known issues:
* Building Python 2.7 with -mfpu=neon exposes a bug in vmov.i64 in binutils
2.20.51. Please use 2.21 or later.
Linaro GCC 4.5 2011.07 is the twelfth release in the 4.5 series. Based off
the latest GCC 4.5.3+svn175676, the release is focused on maintenance.
Interesting changes in 4.5 include:
* Updates to 4.5.3+r175676
Fixes:
* LP: #721531: Don't optimise out testing of the Thumb mode bit on function
pointers
* LP: #723185: ICE in reload_cse_simplify_operands when compiling with -marm
-mfpu=neon
* LP: #744754: ICE in *neon_movoi when using NEON intrinsics
* LP: #797748: ICE building SPEC2006 403.gcc emit-rtl.c
* LP: #803232: ICE on code that uses vld4q_s16() NEON intrinsic
* Fixes a DWARF 2 problem exposed through shrinkwrap.
The source tarball is available from:
https://launchpad.net/gcc-linaro/+milestone/4.6-2011.07
https://launchpad.net/gcc-linaro/+milestone/4.5-2011.07
Downloads are available from the Linaro GCC page on Launchpad:
https://launchpad.net/gcc-linaro
Mailing list: http://lists.linaro.org/mailman/listinfo/linaro-toolchain
Bugs: https://bugs.launchpad.net/gcc-linaro/
Questions? https://ask.linaro.org/
Interested in commercial support? inquire at support(a)linaro.org
-- Michael
Hi,
- I finally submitted the over-widening patch, but Richard Guenther
thought that this optimization should be done for scalars as well, and
he is now working on this himself.
- Some auto-vectorizer fixes
Ira
The Linaro Toolchain Working Group is pleased to announce the release
of Linaro QEMU 2011.07.
Linaro QEMU 2011.07-0 is the latest monthly release of qemu-linaro. Based
off upstream (trunk) QEMU, it includes a number of ARM-focused bug fixes
and enhancements.
This month's release is primarily minor improvements:
- Fixes a compile failure on ia64 hosts
- syscall 369 (prlimit64) implemented in linux-user mode
- Fixes an ELF loader bug that caused problems with binaries generated
by the Google Go compiler
Plus of course new upstream fixes and improvements.
Known issues:
- The beagle and beaglexm models still do not support USB networking
- Very recent Linaro omap3 hwpacks (20110716 and later) do not boot on
the beagle model; this is caused by an x-loader bug (LP:813407)
The source tarball is available at:
https://launchpad.net/qemu-linaro/+milestone/2011.07
Binary builds of this qemu-linaro release are being prepared and
will be available shortly for users of Ubuntu. Packages will be in
the linaro-maintainers tools ppa:
https://launchpad.net/~linaro-maintainers/+archive/tools/
More information on Linaro QEMU is available at:
https://launchpad.net/qemu-linaro
Hi All,
Apologies for missing the stand-up call today.
I've been having technical difficulties at my end. :(
I think they're resolved now ... maybe.
Andrew
Hi,
* continued to look into #809768 (ICE when building bionic's libm)
* created some toolchain and android builds for verification purposes
* libunwind
* discussions with Michael and Uli on how to proceed (thanks!)
* started to work on libunwind-ptrace
* also look for .debug_frame info if there is no .eh_frame info
for the given IP
* mimics the behaviour of the reworked local unwinding
* Attended an IBM internal class on Wednesday
Note: I'm off for two days (21-22) and back on Monday.
Regards
Ken
Hi there. The 2011.07 release has been spun and is testing up well.
The 4.5 and 4.6 branches are now open so feel free to commit any
approved patches.
-- Michael
== GCC ==
=== Progress ===
* Identified particular patterns that have issues with scheduler
descriptions in A8 and A9 . Fixes to be benchmarked next.
* Spent some time on the new tree-reassoc work but SPEC2k failed for
some of the neon configurations. Needs investigation.
* T2 perf call.
* Looked at libquantum bits with Revital.
* BRANCH_COST benchmarking now complete for T2 . Same is running for ARM state.
* Some patch review and bugzilla triaging upstream.
=== Plans ===
* ivopts patch for RichardS to try out - related to the excessive
moves between integer and VFP unit.
* BRANCH_COST further results.
* Submit scheduler patches upstream after benchmarking.
Meetings:
* 1-1s
* TCWG calls
Absences.
* 1st Aug - 5th August - Linaro sprint.
* 8th - 9th August - Internal training.
* 29th Aug - Sept. 2 - Holiday booked and approved.
* 31st Oct - 4th Nov - Linaro Summit Orlando - Travel to be booked.
Continued responding to review comments on my widening multiply patches.
Rewrote large parts of most of the patches to fix bugs and tidy them up.
The result is that all but patch 1 are now approved. Pushed the patches
to Launchpad for final testing.
Monitored the test status of my thumb2 constants patch, but it still
hasn't returned any results. It seems Michael has been having some
problems with his systems.
Went back to looking at merging patches to 4.6. It's only really the
hard ones left. Many are blocked on work that needs to be done by
somebody else. Pinged Tom and Bernd to find out the status of their ones
- all are stuck on the back burner.
----
Upstream patches requiring review:
* NEON scheduling patch
http://gcc.gnu.org/ml/gcc-patches/2011-02/msg01431.html
* Widening Multiplies 1/7
http://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg08721.html
- Tracked down the problematic file containing the loop that is causing
the bootstrap failure with SMS flags on an ARM machine. It is not caused
by SMS itself but rather by the doloop optimization, which is applied
when SMS flags are set. Now working on locating the exact loop and
producing a testcase to reproduce the error.
- Looking into the SPEC2006/libquantum benchmark: it has a hot loop with
a conditional store, which suppresses SMS because SMS is only applied to
single-basic-block loops. If-conversion cannot be done (replacing the
store with a conditional move followed by a store) because that requires
proving that there is a store to the same location in each iteration of
the loop, which is not the case in this loop.
Apparently, when running with the cortex-a8 flag a cond_exec statement
is generated for the store, but that happens only after the register
allocation pass, which runs after SMS (IIUC, moving the generation of
cond_exec before RA is not trivial:
http://gcc.gnu.org/ml/gcc/2000-05/msg00079.html).
So I'm looking into teaching SMS to handle conditional statements based
on the technique presented in [1]. This change is not trivial, so as a
first stage I'm going to estimate the potential of applying SMS to the
loop.
[1] M. Lam, "Software pipelining: an effective scheduling technique
for VLIW machines"
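To make the if-conversion point concrete, here is a minimal sketch in C. It is a hypothetical loop of the same general shape (a store guarded by a condition), not the actual libquantum source, and the function and parameter names are made up for illustration. The first version needs a branch around the store, so the loop body is more than one basic block. The second version is what if-conversion would produce: a select followed by an unconditional store. That form is a single basic block, but it writes every element on every iteration, which is why the compiler has to prove the extra stores are safe before doing the transformation.

```c
#include <stddef.h>

/* Branchy form: the store happens only when the condition holds,
   so the loop body is split across several basic blocks. */
void toggle_branchy(unsigned *state, size_t n,
                    unsigned control, unsigned target)
{
    for (size_t i = 0; i < n; i++) {
        if (state[i] & control)
            state[i] ^= target;
    }
}

/* If-converted form: compute both candidate values, select one,
   and store unconditionally. The body is one basic block, but
   state[i] is now written even when the condition is false. */
void toggle_ifconverted(unsigned *state, size_t n,
                        unsigned control, unsigned target)
{
    for (size_t i = 0; i < n; i++) {
        unsigned old = state[i];
        unsigned flipped = old ^ target;
        state[i] = (old & control) ? flipped : old;
    }
}
```

Both functions leave the array in the same final state when nothing else touches it; the difference is only in which locations get written, which matters for read-only memory or concurrent access.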
== GDB ==
* Tested GDB 7.2.91 prerelease on ARM; everything looking good.
* Created a set of patches to prepare for Linaro GDB 7.3 series;
verified release process on top of a current 7.3 snapshot.
* Committed three mainline patches to fix shared library remote
test problems (#804387).
* Reviewed Yao's latest Thumb-2 displaced stepping patch.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
== String routines ==
* Sent a patch to libc-ports with modified configure scripts to add
subdirectories for architecture-specific ARM code, and the memchr.S
from cortex-strings.
== 64 bit atomics ==
* Working through comments on my patches and the set of discussions about the
kernel interface for the helper case - not really sure which way
that's going to go.
== QEmu ==
* Looking at how tracing works; considering adding tracing to the SD
card code to help track down some of the SD card issues.
Dave
== This week ==
* Fixed the unnecessary union initialisers that were causing ICEs
with -g. This turned out to be a lot more work than Richard's
one-liner suggested. :-)
* Backported Chung-Lin's arm_legitimize_reload_address patch to 4.5.
* Backported the smallest_mode_for_size patch to 4.5 and 4.6.
* Patch review.
* Found an off-by-one error in the vectoriser that caused it to think
that contiguous memory regions overlapped. Unfortunately, this meant
that a lot of my microbenchmarks were using the fallback ARM code
instead of the nice-looking NEON code that I could see in the asm.
* A bit more work on auto inc/dec. It tested regression-free for all
default languages. Ran some more benchmarks and posted the results.
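As an illustration of the kind of off-by-one error described in the vectoriser item above, here is a minimal sketch in C. It is not the GCC vectoriser code and the report does not say where exactly the error was; the function names and the half-open range convention are assumptions made for the example. With ranges written as [start, start + len), two regions overlap only if each one starts strictly before the other ends. Using `<=` instead of `<` makes back-to-back regions look as if they overlap.

```c
#include <stdbool.h>
#include <stddef.h>

/* Correct check for half-open ranges [start, start + len).
   Assumes both lengths are non-zero. */
bool ranges_overlap(size_t a_start, size_t a_len,
                    size_t b_start, size_t b_len)
{
    return a_start < b_start + b_len && b_start < a_start + a_len;
}

/* Off-by-one variant: "<=" treats a region that begins exactly
   where the other ends as overlapping. */
bool ranges_overlap_off_by_one(size_t a_start, size_t a_len,
                               size_t b_start, size_t b_len)
{
    return a_start <= b_start + b_len && b_start <= a_start + a_len;
}
```

For two adjacent 16-byte regions starting at offsets 0 and 16, the first function reports no overlap while the second reports one; a conservative caller would then take its fallback path even though the regions are disjoint.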
== Next week ==
* Bugs and auto inc/dec.
Richard
Hi,
* analyzed/tested toolchain issues the Linaro Android folks are facing
  * libquadmath disabled due to a configure test failure of the target
    libiberty (#809435)
    * fix will be in 11.07 release
  * ICE when building bionic's libm (#809768)
    * not reproducible with a "plain" Linaro GCC
    * non-upstreamable workaround in place
      (prevents the ICE but degrades the DWARF quality)
  * binary toolchain at http://people.linaro.org/~kwerner/
* libunwind
  * localunwrework branch now on git.linaro.org
Note: I'll take two days off at the end of next week (21-22).
Regards
Ken
Achieved:
* Set up networking on the Panda board, ssh to the board from my laptop
works fine.
* Downloaded the benchmarks (SPEC2000 and EEMBC) and built them for x86. I
now have a basic understanding of what the benchmarks do and how to run
them.
* For EEMBC I used the -m32 flag for building on my 64-bit installation.
* For SPEC2000 I used the Linaro configuration file and enabled the
portability flags for a 64-bit host. Played around with different "runspec"
actions and options, building and running individual test cases as well as
the full test suites. Finally I did a reportable run for all benchmarks. It
took several hours and in the end I got a result for my laptop.
Next step:
* Cross-compile EEMBC and SPEC2000. I will try linaro-gcc and the cross
compiler that comes with Natty.
Best Regards
Åsa
Hi,
- merged over-widened multiply patch to gcc-linaro-4.6 (now vectorized
rgbyiqv should be about as good as its scalar version)
- continued working on over-widened shifts and bit operations
Ira