- linaro-toolchain - lists.linaro.org

[ACTIVITY] 2011-12-23 - and goodbye

by David Gilbert

== QEMU == * Wrote the context routines for Eglibc, including those that QEMU uses These pass all the context tests I could find, including QEMUs coroutine tests, and with them QEMU seems to boot OK. I've got a full eglibc test run going at the moment, but I don't think anything else uses them. I posted them with comments and a question to libc-ports; I'll try and chase follow ups. == String routines == * I posted the strchr and strlen routines to eglibc (libc-ports) * On strchr the question of whether it was worth using the longer version that's faster for longer strings (but slower for shorter strings) came up. I posted some stats, observations etc - and there is still a discussion on going about it. * For strlen, rth noted the same trick that I'd originally seen in newlib (and for which RichardS and Ramana had suggested) of a quicker end-of-string sequence using clz. I'd avoided this because I'd originally seen it in newlib and didn't want to copy it; but since 3 people have individually suggested it it would seem using. == Goodbye! == Thank you all for a fun & interesting year! I'm sure many of us will meet online again in the future. I'll try and follow my linaro.org address while it's still live to check for any replies to any patches/comments etc. Feel free to mail me at davidgil(a)uk.ibm.com (work) or dave(a)treblig.org (home); for Linaro people I've also added some more contact methods at: https://wiki.linaro.org/Internal/People/DaveGilbert/Contact Thanks again! Dave

13 years, 6 months

2
1
0 0

Fwd: New question: problem in installing Linaro tools on Ask Linaro

by Andy Doan

anyone have a suggestion for this person? -------- Original Message -------- Subject: New question: problem in installing Linaro tools on Ask Linaro Date: Mon, 26 Dec 2011 21:31:57 -0800 (PST) From: Ask Linaro <dnsadmin(a)linaro.org> To: doanac <andy.doan(a)linaro.org> Ask Linaro - Q & A forum for Linaro developers <http://ask.linaro.org/> ------------------------------------------------------------------------ Hello doanac, chandrakala <http://ask.linaro.org/users/172/chandrakala> has just posted a new question on Ask Linaro, entitled problem in installing Linaro tools <http://ask.linaro.org/questions/456/problem-in-installing-linaro-tools> and tagged "/linaro <http://ask.linaro.org/tags/linaro/> installation <http://ask.linaro.org/tags/installation/>/". Here's what it says: Hi all, Iam new to linux and gnu tools. i have downloaded linaro gnu tools to install cortex-a8 on my ubuntu machine. iam able to successfully install gcc tools. But when i try to install arm tools it is throwing error. following are the commands used to install gnu tools and the error it is generated mckala@mckala:~/Desktop/gstreamer/gcc_objdir$ ../gcc-linaro-4.6-2011.12/gcc-linaro-4.6-2011.12/configure --target=arm-*-elf --disable-bootstrap --enable-languages=c,c++ --with-mode=thumb --with-arch=armv7-a --with-tune=cortex-a8 --with-fpu=neon --with-float=softfp --prefix=/home/mckala/Desktop/gstreamer/gcc_objdir --disable-werror --with-newlib mckala@mckala:~/Desktop/gstreamer/gcc_objdir$ make all-host mckala@mckala:~/Desktop/gstreamer/gcc_objdir$ make install-host with this gnu tools installation is completed and i did not get any error. when i used the following command to install cross compiler, it is giving error mckala@mckala:~/Desktop/gstreamer/gcc_objdir$ make The error is checking for suffix of object files... configure: error: in |/home/mckala/Desktop/gstreamer/gcc_objdir/arm-*-elf/libgcc': configure: error: cannot compute suffix of object files: cannot compile See|config.log' for more details. make[1]: *** [configure-target-libgcc] Error 1 config.log file is showing the following error. gcc_objdir/arm-/-elf/bin/ -B/home/mckala/Desktop/gstreamer/gcc_objdir/arm-/-elf/lib/ -isystem /home/mckala/Desktop/gstreamer/gcc_objdir/arm-/-elf/include -isystem /home/mckala/Desktop/gstreamer/gcc_objdir/arm-/-elf/sys-include -V >&5 xgcc: fatal error: no input files compilation terminated. Can any one please help on this. Thanks in advance. Don't forget to come over and cast your vote. Thanks, Ask Linaro P.S. You can always fine-tune which notifications you receive here <http://ask.linaro.org//users/16/doanac/subscriptions/>. ------------------------------------------------------------------------

13 years, 6 months

2
1
0 0

Round-up and goodbye

by Richard Sandiford

I've posted all my WIP patches to this list over the last few days. Please treat them kindly. :-) I've also tried to update the relevant blueprints. I pinged the 4.5 and 4.6 backports for lp736661 on gcc-patches last week, then again this week, but there's obviously not likely to be much response at this time of year. I therefore went ahead with the Linaro merges of the branches rather than relying on them being committed to FSF branches in time for next month's Linaro release. I'll continue to ping though. If I've forgotten anything, or if you need more info, please don't hesitate to ask. I'll continue to monitor my Linaro email address as long as it remains active, although rdsandiford(a)googlemail.com is likely to be a better bet. With all that out of the way, I just wanted to say thank you to everyone for making this a really enjoyable project to work on. It feels like we managed to get through a fair number of new features, performance improvements and bug fixes this year. I hope Linaro will be around for a good few years yet and that it continues to go from strength to strength. Happy New Year, and all the best, Richard

13 years, 6 months

1
0
0 0

Patch drop: sms-and-memory-dependencies

by Richard Sandiford

Here's the patch for sms-and-memory-dependencies. The idea is to bypass the sched-deps.c {output,read,anti,true}_dependence tests altogether -- which is easy to do thanks to the note_mem_dep hook -- and instead handle them in ddg.c. The ddg.c tests then use RTL loop iv analysis to try to get longer distances on the memory dependencies. (Note that other memory-related dependencies, such as those between volatile MEMs and other volatile instructions, are still handled by sched-deps.c.) Dependencies are now always created in pairs, so there's no need for get_sched_window to set an upper bound when processing incoming MEM_DEPs, or a lower bound when processing outgoing MEM_DEPs; we can rely on the partnering edge to do that instead. Richard gcc/ * Makefile.in (ddg.o): Depend on $(TREE_PASS_H). * ddg.h (REG_OR_MEM_DEP, REG_AND_MEM_DEP): Delete. (ddg_mem_ref): New structure. (ddg): Add loads and stores array. (create_ddg): Add a loop argument. (add_edges_to_ddg): Declare. (MAX_DDG_DISTANCE): New macro. * ddg.c: Include tree-pass.h. (mem_ref_p, mark_mem_use, mark_mem_use_1, mem_read_insn_p) (mark_mem_store, mem_write_insn_p, rtx_mem_access_p) (mem_access_insn_p): Delete. (create_mem_ref): New function. (graph_and_node): New structure. (record_loads, record_stores): New functions. (create_ddg_dep_from_intra_loop_link): Treat all dependencies as register dependencies. (walk_mems_2, walk_mems_1, insns_may_alias_p, add_intra_loop_mem_dep) (add_inter_loop_mem_dep): Delete. (build_intra_loop_deps): Ignore memory dependencies created by sched-deps.c. Don't handle memory dependencies here. (measure_mem_distance, add_memory_dep): New functions. (FOR_EACH_LATER_MEM_REF): New macro. (build_memory_deps): New function. (create_ddg): Take the loop as argument. Don't count loads and stores here. Call iv_analysis_loop_init. Pass all loads to record_loads and all stores to record_stores. Move edge creation to... (add_edges_to_ddg): ...this new function. Also call build_memory_deps. * modulo-sched.c (sat_mulpp, sat_addsp, sat_subsp): New functions. (schedule_reg_moves): Only handle register dependencies. (sms_schedule): Update call to create_ddg. Call iv_analysis_done after creating all ddgs. Only set issue_rate if there are ddgs. Only call setup_sched_infos and haifa_sched_init if there are ddgs. Call add_edges_to_ddg before processing each ddg. (get_sched_window): Use saturating arithmetic. Do not add an implicit upper bound for incoming MEM_DEPs, or an implicit lower bound for outgoing MEM_DEPs. Rework calculation of final window. (calculate_must_precede_follow, compute_split_row): Use saturating arithmetic. Index: gcc/Makefile.in =================================================================== --- gcc/Makefile.in 2011-12-30 13:13:45.077544981 +0000 +++ gcc/Makefile.in 2011-12-30 13:24:57.330195801 +0000 @@ -3316,7 +3316,7 @@ ddg.o : ddg.c $(DDG_H) $(CONFIG_H) $(SYS $(DIAGNOSTIC_CORE_H) $(RTL_H) $(TM_P_H) $(REGS_H) $(FUNCTION_H) \ $(FLAGS_H) insn-config.h $(INSN_ATTR_H) $(EXCEPT_H) $(RECOG_H) \ $(SCHED_INT_H) $(CFGLAYOUT_H) $(CFGLOOP_H) $(EXPR_H) $(BITMAP_H) \ - hard-reg-set.h sbitmap.h $(TM_H) + hard-reg-set.h sbitmap.h $(TM_H) $(TREE_PASS_H) modulo-sched.o : modulo-sched.c $(DDG_H) $(CONFIG_H) $(CONFIG_H) $(SYSTEM_H) \ coretypes.h $(TARGET_H) $(DIAGNOSTIC_CORE_H) $(RTL_H) $(TM_P_H) $(REGS_H) $(FUNCTION_H) \ $(FLAGS_H) insn-config.h $(INSN_ATTR_H) $(EXCEPT_H) $(RECOG_H) \ Index: gcc/ddg.h =================================================================== --- gcc/ddg.h 2011-12-30 13:13:45.077544981 +0000 +++ gcc/ddg.h 2011-12-30 13:24:57.324195831 +0000 @@ -35,8 +35,7 @@ typedef struct ddg_scc *ddg_scc_ptr; typedef struct ddg_all_sccs *ddg_all_sccs_ptr; typedef enum {TRUE_DEP, OUTPUT_DEP, ANTI_DEP} dep_type; -typedef enum {REG_OR_MEM_DEP, REG_DEP, MEM_DEP, REG_AND_MEM_DEP} - dep_data_type; +typedef enum {REG_DEP, MEM_DEP} dep_data_type; /* The following two macros enables direct access to the successors and predecessors bitmaps held in each ddg_node. Do not make changes to @@ -44,6 +43,28 @@ typedef enum {REG_OR_MEM_DEP, REG_DEP, M #define NODE_SUCCESSORS(x) ((x)->successors) #define NODE_PREDECESSORS(x) ((x)->predecessors) +/* A structure that represents a memory read or write in the DDG; + context decides which. */ +struct ddg_mem_ref { + /* The previous reference of the same type (read or write) in the DDG. */ + struct ddg_mem_ref *prev; + + /* The DDG node that contains the memory reference. */ + ddg_node_ptr node; + + /* The memory reference itself. */ + rtx mem; + + /* If the address is a known induction variable, its value in iteration + I is given by: + + BASE + OFFSET + I * STEP + + In other cases BASE is null. */ + rtx base; + HOST_WIDE_INT offset, step; +}; + /* A structure that represents a node in the DDG. */ struct ddg_node { @@ -117,6 +138,11 @@ struct ddg /* Number of instructions in the basic block. */ int num_nodes; + /* The loads and stores in the BB, from the end of the block to + the beginning. */ + struct ddg_mem_ref *loads; + struct ddg_mem_ref *stores; + /* Number of load/store instructions in the BB - statistics. */ int num_loads; int num_stores; @@ -167,7 +193,9 @@ struct ddg_all_sccs }; -ddg_ptr create_ddg (basic_block, int closing_branch_deps); +struct loop; +ddg_ptr create_ddg (struct loop *, basic_block, int closing_branch_deps); +void add_edges_to_ddg (ddg_ptr); void free_ddg (ddg_ptr); void print_ddg (FILE *, ddg_ptr); @@ -188,4 +216,7 @@ int longest_simple_path (ddg_ptr, int fr bool autoinc_var_is_used_p (rtx, rtx); +/* The maximum allowable distance on a DDG edge. */ +#define MAX_DDG_DISTANCE INT_MAX + #endif /* GCC_DDG_H */ Index: gcc/ddg.c =================================================================== --- gcc/ddg.c 2011-12-30 13:13:45.077544981 +0000 +++ gcc/ddg.c 2011-12-30 13:36:35.005498271 +0000 @@ -43,6 +43,7 @@ Software Foundation; either version 3, o #include "expr.h" #include "bitmap.h" #include "ddg.h" +#include "tree-pass.h" #ifdef INSN_SCHEDULING @@ -61,88 +62,102 @@ static ddg_edge_ptr create_ddg_edge (ddg dep_data_type, int, int); static void add_edge_to_ddg (ddg_ptr g, ddg_edge_ptr); -/* Auxiliary variable for mem_read_insn_p/mem_write_insn_p. */ -static bool mem_ref_p; +/* Create a memory reference record for MEM, which occurs in NODE. + PREV is the previous reference of the same type. */ +static struct ddg_mem_ref * +create_mem_ref (struct ddg_mem_ref *prev, ddg_node_ptr node, rtx mem) +{ + struct ddg_mem_ref *entry; + enum machine_mode pmode; + struct rtx_iv iv; + rtx x; + + entry = XCNEW (struct ddg_mem_ref); + entry->prev = prev; + entry->node = node; + entry->mem = mem; + + pmode = targetm.addr_space.address_mode (MEM_ADDR_SPACE (mem)); + if (iv_analyze_expr (node->insn, XEXP (mem, 0), pmode, &iv) + && iv.extend == UNKNOWN + && CONST_INT_P (iv.step)) + { + x = iv.base; + if (GET_CODE (x) == PLUS && CONST_INT_P (XEXP (x, 1))) + { + entry->base = XEXP (x, 0); + entry->offset = INTVAL (XEXP (x, 1)); + } + else + { + entry->base = x; + entry->offset = 0; + } + entry->step = INTVAL (iv.step); + } -/* Auxiliary function for mem_read_insn_p. */ -static int -mark_mem_use (rtx *x, void *data ATTRIBUTE_UNUSED) -{ - if (MEM_P (*x)) - mem_ref_p = true; - return 0; + if (dump_file) + { + fprintf (dump_file, "Found memory reference in insn %d:\n", + INSN_UID (node->insn)); + print_rtl (dump_file, mem); + if (entry->base) + { + fprintf (dump_file, "\nwith base:"); + print_rtl (dump_file, entry->base); + fprintf (dump_file, "\noffset " HOST_WIDE_INT_PRINT_DEC + " and step " HOST_WIDE_INT_PRINT_DEC "\n\n", + entry->offset, entry->step); + } + else + fprintf (dump_file, "\nwhich isn't a recognized iv\n\n"); + } + return entry; } -/* Auxiliary function for mem_read_insn_p. */ -static void -mark_mem_use_1 (rtx *x, void *data) -{ - for_each_rtx (x, mark_mem_use, data); -} +/* A structure for pairing a node and the graph to which it belongs. */ +struct graph_and_node { + ddg_ptr g; + ddg_node_ptr node; +}; -/* Returns nonzero if INSN reads from memory. */ -static bool -mem_read_insn_p (rtx insn) +/* A for_each_rtx callback. Record all loads in an instruction. + DATA points to a graph_and_node. */ +static int +record_loads_1 (rtx *loc, void *data) { - mem_ref_p = false; - note_uses (&PATTERN (insn), mark_mem_use_1, NULL); - return mem_ref_p; -} + struct graph_and_node *gn; -static void -mark_mem_store (rtx loc, const_rtx setter ATTRIBUTE_UNUSED, void *data ATTRIBUTE_UNUSED) -{ - if (MEM_P (loc)) - mem_ref_p = true; + if (MEM_P (*loc)) + { + gn = (struct graph_and_node *) data; + gn->g->loads = create_mem_ref (gn->g->loads, gn->node, *loc); + gn->g->num_loads++; + } + return 0; } -/* Returns nonzero if INSN writes to memory. */ -static bool -mem_write_insn_p (rtx insn) +/* A note_uses callback. Record all loads in an instruction. + DATA points to a graph_and_node. */ +static void +record_loads (rtx *loc, void *data) { - mem_ref_p = false; - note_stores (PATTERN (insn), mark_mem_store, NULL); - return mem_ref_p; + for_each_rtx (loc, record_loads_1, data); } -/* Returns nonzero if X has access to memory. */ -static bool -rtx_mem_access_p (rtx x) +/* A note_stores callback. Record all stores in an instruction. + DATA points to a graph_and_node. */ +static void +record_stores (rtx x, const_rtx setter ATTRIBUTE_UNUSED, void *data) { - int i, j; - const char *fmt; - enum rtx_code code; - - if (x == 0) - return false; + struct graph_and_node *gn; if (MEM_P (x)) - return true; - - code = GET_CODE (x); - fmt = GET_RTX_FORMAT (code); - for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) { - if (fmt[i] == 'e') - { - if (rtx_mem_access_p (XEXP (x, i))) - return true; - } - else if (fmt[i] == 'E') - for (j = 0; j < XVECLEN (x, i); j++) - { - if (rtx_mem_access_p (XVECEXP (x, i, j))) - return true; - } + gn = (struct graph_and_node *) data; + gn->g->stores = create_mem_ref (gn->g->stores, gn->node, x); + gn->g->num_stores++; } - return false; -} - -/* Returns nonzero if INSN reads to or writes from memory. */ -static bool -mem_access_insn_p (rtx insn) -{ - return rtx_mem_access_p (PATTERN (insn)); } /* Return true if DEF_INSN contains address being auto-inc or auto-dec @@ -175,9 +190,6 @@ create_ddg_dep_from_intra_loop_link (ddg ddg_edge_ptr e; int latency, distance = 0; dep_type t = TRUE_DEP; - dep_data_type dt = (mem_access_insn_p (src_node->insn) - && mem_access_insn_p (dest_node->insn) ? MEM_DEP - : REG_DEP); gcc_assert (src_node->cuid < dest_node->cuid); gcc_assert (link); @@ -201,7 +213,7 @@ create_ddg_dep_from_intra_loop_link (ddg TODO: support the removal of all anti-deps edges, i.e. including those whose register has multiple defs in the loop. */ if (flag_modulo_sched_allow_regmoves - && (t == ANTI_DEP && dt == REG_DEP) + && t == ANTI_DEP && !autoinc_var_is_used_p (dest_node->insn, src_node->insn)) { rtx set; @@ -224,7 +236,7 @@ create_ddg_dep_from_intra_loop_link (ddg } latency = dep_cost (link); - e = create_ddg_edge (src_node, dest_node, t, dt, latency, distance); + e = create_ddg_edge (src_node, dest_node, t, REG_DEP, latency, distance); add_edge_to_ddg (g, e); } @@ -380,107 +392,6 @@ build_inter_loop_deps (ddg_ptr g) } } - -static int -walk_mems_2 (rtx *x, rtx mem) -{ - if (MEM_P (*x)) - { - if (may_alias_p (*x, mem)) - return 1; - - return -1; - } - return 0; -} - -static int -walk_mems_1 (rtx *x, rtx *pat) -{ - if (MEM_P (*x)) - { - /* Visit all MEMs in *PAT and check indepedence. */ - if (for_each_rtx (pat, (rtx_function) walk_mems_2, *x)) - /* Indicate that dependence was determined and stop traversal. */ - return 1; - - return -1; - } - return 0; -} - -/* Return 1 if two specified instructions have mem expr with conflict alias sets*/ -static int -insns_may_alias_p (rtx insn1, rtx insn2) -{ - /* For each pair of MEMs in INSN1 and INSN2 check their independence. */ - return for_each_rtx (&PATTERN (insn1), (rtx_function) walk_mems_1, - &PATTERN (insn2)); -} - -/* Given two nodes, analyze their RTL insns and add intra-loop mem deps - to ddg G. */ -static void -add_intra_loop_mem_dep (ddg_ptr g, ddg_node_ptr from, ddg_node_ptr to) -{ - - if ((from->cuid == to->cuid) - || !insns_may_alias_p (from->insn, to->insn)) - /* Do not create edge if memory references have disjoint alias sets - or 'to' and 'from' are the same instruction. */ - return; - - if (mem_write_insn_p (from->insn)) - { - if (mem_read_insn_p (to->insn)) - create_ddg_dep_no_link (g, from, to, - DEBUG_INSN_P (to->insn) - ? ANTI_DEP : TRUE_DEP, MEM_DEP, 0); - else - create_ddg_dep_no_link (g, from, to, - DEBUG_INSN_P (to->insn) - ? ANTI_DEP : OUTPUT_DEP, MEM_DEP, 0); - } - else if (!mem_read_insn_p (to->insn)) - create_ddg_dep_no_link (g, from, to, ANTI_DEP, MEM_DEP, 0); -} - -/* Given two nodes, analyze their RTL insns and add inter-loop mem deps - to ddg G. */ -static void -add_inter_loop_mem_dep (ddg_ptr g, ddg_node_ptr from, ddg_node_ptr to) -{ - if (!insns_may_alias_p (from->insn, to->insn)) - /* Do not create edge if memory references have disjoint alias sets. */ - return; - - if (mem_write_insn_p (from->insn)) - { - if (mem_read_insn_p (to->insn)) - create_ddg_dep_no_link (g, from, to, - DEBUG_INSN_P (to->insn) - ? ANTI_DEP : TRUE_DEP, MEM_DEP, 1); - else if (from->cuid != to->cuid) - create_ddg_dep_no_link (g, from, to, - DEBUG_INSN_P (to->insn) - ? ANTI_DEP : OUTPUT_DEP, MEM_DEP, 1); - } - else - { - if (mem_read_insn_p (to->insn)) - return; - else if (from->cuid != to->cuid) - { - create_ddg_dep_no_link (g, from, to, ANTI_DEP, MEM_DEP, 1); - if (DEBUG_INSN_P (from->insn) || DEBUG_INSN_P (to->insn)) - create_ddg_dep_no_link (g, to, from, ANTI_DEP, MEM_DEP, 1); - else - create_ddg_dep_no_link (g, to, from, TRUE_DEP, MEM_DEP, 1); - } - } - -} - /* Perform intra-block Data Dependency analysis and connect the nodes in the DDG. We assume the loop has a single basic block. */ static void @@ -493,6 +404,9 @@ build_intra_loop_deps (ddg_ptr g) /* Build the dependence information, using the sched_analyze function. */ init_deps_global (); + /* Ignore the usual dependencies between two MEM rtxes. We still rely + on sched_analyze to handle memory barriers and the like. */ + sched_deps_info->note_mem_dep = 0; init_deps (&tmp_deps, false); /* Do the intra-block data dependence analysis for the given block. */ @@ -519,37 +433,6 @@ build_intra_loop_deps (ddg_ptr g) create_ddg_dep_from_intra_loop_link (g, src_node, dest_node, dep); } - - /* If this insn modifies memory, add an edge to all insns that access - memory. */ - if (mem_access_insn_p (dest_node->insn)) - { - int j; - - for (j = 0; j <= i; j++) - { - ddg_node_ptr j_node = &g->nodes[j]; - if (DEBUG_INSN_P (j_node->insn)) - continue; - if (mem_access_insn_p (j_node->insn)) - { - /* Don't bother calculating inter-loop dep if an intra-loop dep - already exists. */ - if (! TEST_BIT (dest_node->successors, j)) - add_inter_loop_mem_dep (g, dest_node, j_node); - /* If -fmodulo-sched-allow-regmoves - is set certain anti-dep edges are not created. - It might be that these anti-dep edges are on the - path from one memory instruction to another such that - removing these edges could cause a violation of the - memory dependencies. Thus we add intra edges between - every two memory instructions in this case. */ - if (flag_modulo_sched_allow_regmoves - && !TEST_BIT (dest_node->predecessors, j)) - add_intra_loop_mem_dep (g, j_node, dest_node); - } - } - } } /* Free the INSN_LISTs. */ @@ -560,13 +443,187 @@ build_intra_loop_deps (ddg_ptr g) sched_free_deps (head, tail, false); } +/* Given a "source" memory reference from iteration 0 and a "target" + memory reference from iteration BASE_DISTANCE, return the first + N >= BASE_DISTANCE such that the source reference in iteration 0 + overlaps the target reference in iteration N. + + FROM_OFFSET is the offset of the source reference from an unspecified + base, while TO_OFFSET is the offset of the target reference from that + same base. FROM_SIZE and TO_SIZE are the sizes of the two references + in bytes. + + PMODE is the mode of both addresses, and STEP is the amount that + will be added to each address by one loop iteration. */ +static int +measure_mem_distance (enum machine_mode pmode, + unsigned HOST_WIDE_INT from_offset, + unsigned HOST_WIDE_INT from_size, + unsigned HOST_WIDE_INT to_offset, + unsigned HOST_WIDE_INT to_size, + unsigned HOST_WIDE_INT base_distance, + HOST_WIDE_INT step) +{ + unsigned HOST_WIDE_INT extra, from2to, to2from; + + from2to = (to_offset - from_offset) & GET_MODE_MASK (pmode); + to2from = (from_offset - to_offset) & GET_MODE_MASK (pmode); + if (from2to < from_size || to2from < to_size) + /* The source reference in iteration 0 overlaps the target reference + in iteration BASE_DISTANCE. The check is written this way to cope + with cases where offset + size overflows. */ + return base_distance; + + /* N > BASE_DISTANCE. To cope more easily with cases where the round-up + divisions: + + (to2from - (to_size - 1) + (step - 1)) / step + (from2to - (from_size - 1) + (step - 1)) / step + + would overflow, bump BASE_DISTANCE and subtract STEP from each + dividend to compensate. */ + base_distance++; + if (step > 0) + extra = (to2from - to_size) / (unsigned HOST_WIDE_INT) step; + else + extra = (from2to - from_size) / (unsigned HOST_WIDE_INT) -step; + if (extra > MAX_DDG_DISTANCE || base_distance + extra > MAX_DDG_DISTANCE) + return MAX_DDG_DISTANCE; + return base_distance + extra; +} + +/* If FROM and TO might alias, record memory dependencies: + + FROM--(FORWARD_TYPE)-->TO + and TO--(BACKWARD_TYPE)-->FROM + + FROM comes before TO in the original loop, and both belong to G. + FORWARD_DISTANCE is the minimum distance of the FROM--->TO dependence. */ +static void +add_memory_dep (ddg_ptr g, struct ddg_mem_ref *from, + struct ddg_mem_ref *to, dep_type forward_type, + dep_type backward_type, int forward_distance) +{ + HOST_WIDE_INT step; + unsigned HOST_WIDE_INT from_size, to_size, to_disp, abs_step, future_offset; + enum machine_mode pmode; + int backward_distance; + + gcc_checking_assert (from->node->cuid < to->node->cuid); + + if (!may_alias_p (from->mem, to->mem)) + return; + + /* In the worst case, the TO---->FROM edge has a distance of 1. */ + backward_distance = 1; + + /* See if we can get more accurate distances. Look for cases where + the addresses of FROM and TO are ivs with the same base and step. */ + if (from->base + && to->base + && from->step + && from->step == to->step + && !MEM_VOLATILE_P (from->mem) + && !MEM_VOLATILE_P (to->mem) + && MEM_SIZE_KNOWN_P (from->mem) + && MEM_SIZE_KNOWN_P (to->mem) + && MEM_ADDR_SPACE (from->mem) == MEM_ADDR_SPACE (to->mem) + && rtx_equal_p (from->base, to->base)) + { + step = to->step; + abs_step = (step < 0 ? -step : step); + from_size = MEM_SIZE (from->mem); + to_size = MEM_SIZE (to->mem); + + /* If the step is a power of two, or the negative of a power of two, + see whether we can prove that the references never overlap. */ + if (abs_step == (abs_step & -abs_step)) + { + to_disp = (to->offset - from->offset) % abs_step; + if (from_size <= to_disp && to_disp + to_size <= abs_step) + return; + } + + pmode = targetm.addr_space.address_mode (MEM_ADDR_SPACE (to->mem)); + future_offset = to->offset + forward_distance * step; + forward_distance = measure_mem_distance (pmode, + from->offset, from_size, + future_offset, to_size, + forward_distance, step); + future_offset = from->offset + backward_distance * step; + backward_distance = measure_mem_distance (pmode, + to->offset, to_size, + future_offset, from_size, + backward_distance, step); + } + + if (DEBUG_INSN_P (from->node->insn) || DEBUG_INSN_P (to->node->insn)) + { + forward_type = ANTI_DEP; + backward_type = ANTI_DEP; + } + create_ddg_dep_no_link (g, from->node, to->node, forward_type, MEM_DEP, + forward_distance); + create_ddg_dep_no_link (g, to->node, from->node, backward_type, MEM_DEP, + backward_distance); +} + +/* Make REF2 iterate over all entries in ddg_mem_ref list LIST + that come later than ddg_mem_ref REF1. */ +#define FOR_EACH_LATER_MEM_REF(REF2, REF1, LIST) \ + for (REF2 = (LIST); \ + REF2 && REF2->node->cuid > REF1->node->cuid; \ + REF2 = REF2->prev) + +/* Check for dependencies between pairs of memory rtxes. */ +static void +build_memory_deps (ddg_ptr g) +{ + struct ddg_mem_ref *ref1, *ref2; + int distance; + + for (ref1 = g->loads; ref1; ref1 = ref1->prev) + { + /* LOAD--->LOAD. */ + if (MEM_VOLATILE_P (ref1->mem)) + FOR_EACH_LATER_MEM_REF (ref2, ref1, g->loads) + if (MEM_VOLATILE_P (ref2->mem)) + add_memory_dep (g, ref1, ref2, ANTI_DEP, ANTI_DEP, 0); + + /* LOAD--->STORE. */ + FOR_EACH_LATER_MEM_REF (ref2, ref1, g->stores) + { + distance = anti_dependence (ref1->mem, ref2->mem) ? 0 : 1; + add_memory_dep (g, ref1, ref2, ANTI_DEP, TRUE_DEP, distance); + } + } + + for (ref1 = g->stores; ref1; ref1 = ref1->prev) + { + /* STORE--->LOAD. */ + FOR_EACH_LATER_MEM_REF (ref2, ref1, g->loads) + { + distance = true_dependence (ref1->mem, VOIDmode, + ref2->mem, rtx_varies_p) ? 0 : 1; + add_memory_dep (g, ref1, ref2, TRUE_DEP, ANTI_DEP, distance); + } + + /* STORE--->STORE. */ + FOR_EACH_LATER_MEM_REF (ref2, ref1, g->stores) + { + distance = output_dependence (ref1->mem, ref2->mem) ? 0 : 1; + add_memory_dep (g, ref1, ref2, OUTPUT_DEP, OUTPUT_DEP, distance); + } + } +} -/* Given a basic block, create its DDG and return a pointer to a variable - of ddg type that represents it. +/* Given a basic block, create the nodes of its DDG and return a pointer + to a variable of ddg type that represents it. Initialize the ddg structure fields to the appropriate values. */ ddg_ptr -create_ddg (basic_block bb, int closing_branch_deps) +create_ddg (struct loop *loop, basic_block bb, int closing_branch_deps) { + struct graph_and_node gn; ddg_ptr g; rtx insn, first_note; int i; @@ -586,13 +643,6 @@ create_ddg (basic_block bb, int closing_ if (DEBUG_INSN_P (insn)) g->num_debug++; - else - { - if (mem_read_insn_p (insn)) - g->num_loads++; - if (mem_write_insn_p (insn)) - g->num_stores++; - } num_nodes++; } @@ -603,6 +653,8 @@ create_ddg (basic_block bb, int closing_ return NULL; } + iv_analysis_loop_init (loop); + /* Allocate the nodes array, and initialize the nodes. */ g->num_nodes = num_nodes; g->nodes = (ddg_node_ptr) xcalloc (num_nodes, sizeof (struct ddg_node)); @@ -637,18 +689,31 @@ create_ddg (basic_block bb, int closing_ g->nodes[i].predecessors = sbitmap_alloc (num_nodes); sbitmap_zero (g->nodes[i].predecessors); g->nodes[i].first_note = (first_note ? first_note : insn); - g->nodes[i++].insn = insn; + g->nodes[i].insn = insn; first_note = NULL_RTX; + + gn.g = g; + gn.node = &g->nodes[i]; + note_uses (&PATTERN (insn), record_loads, &gn); + note_stores (PATTERN (insn), record_stores, &gn); + + i++; } /* We must have found a branch in DDG. */ gcc_assert (g->closing_branch); + return g; +} +/* Add the edges to a DDG that was previously created by create_ddg. + This function relies on scheduler dependencies. */ - /* Build the data dependency graph. */ +void +add_edges_to_ddg (ddg_ptr g) +{ build_intra_loop_deps (g); build_inter_loop_deps (g); - return g; + build_memory_deps (g); } /* Free all the memory allocated for the DDG. */ Index: gcc/modulo-sched.c =================================================================== --- gcc/modulo-sched.c 2011-12-30 13:13:45.077544981 +0000 +++ gcc/modulo-sched.c 2011-12-30 13:24:57.327195816 +0000 @@ -345,6 +345,38 @@ ps_num_consecutive_stages (partial_sched return ps_reg_move (ps, id)->num_consecutive_stages; } +/* Perform a saturating multiplication of nonnegative values A and B. */ + +static inline int +sat_mulpp (unsigned int a, unsigned int b) +{ + if ((unsigned int) INT_MAX / b <= a) + return INT_MAX; + else + return a * b; +} + +/* Perform a saturating addition of signed value A and nonnegative value B. */ + +static inline int +sat_addsp (int a, int b) +{ + if (INT_MAX - b <= a) + return INT_MAX; + return a + b; +} + +/* Perform a saturating subtraction of signed value A and nonnegative + value B. */ + +static inline int +sat_subsp (int a, int b) +{ + if (INT_MIN + b >= a) + return INT_MIN; + return a - b; +} + /* Given HEAD and TAIL which are the first and last insns in a loop; return the register which controls the loop. Return zero if it has more than one occurrence in the loop besides the control part or the @@ -709,7 +741,9 @@ schedule_reg_moves (partial_schedule_ptr ranges started at u (excluding self-loops). */ distances[0] = distances[1] = false; for (e = u->out; e; e = e->next_out) - if (e->type == TRUE_DEP && e->dest != e->src) + if (e->data_type == REG_DEP + && e->type == TRUE_DEP + && e->dest != e->src) { int nreg_moves4e = (SCHED_TIME (e->dest->cuid) - SCHED_TIME (e->src->cuid)) / ii; @@ -781,7 +815,9 @@ schedule_reg_moves (partial_schedule_ptr copy of this register, depending on the time the use is scheduled. Record which uses require which move results. */ for (e = u->out; e; e = e->next_out) - if (e->type == TRUE_DEP && e->dest != e->src) + if (e->data_type == REG_DEP + && e->type == TRUE_DEP + && e->dest != e->src) { int dest_copy = (SCHED_TIME (e->dest->cuid) - SCHED_TIME (e->src->cuid)) / ii; @@ -1355,6 +1391,7 @@ sms_schedule (void) basic_block condition_bb = NULL; edge latch_edge; gcov_type trip_count = 0; + int num_ddgs; loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_RECORDED_EXITS); @@ -1364,34 +1401,19 @@ sms_schedule (void) return; /* There are no loops to schedule. */ } - /* Initialize issue_rate. */ - if (targetm.sched.issue_rate) - { - int temp = reload_completed; - - reload_completed = 1; - issue_rate = targetm.sched.issue_rate (); - reload_completed = temp; - } - else - issue_rate = 1; - - /* Initialize the scheduler. */ - setup_sched_infos (); - haifa_sched_init (); - /* Allocate memory to hold the DDG array one entry for each loop. We use loop->num as index into this array. */ g_arr = XCNEWVEC (ddg_ptr, number_of_loops ()); if (dump_file) - { - fprintf (dump_file, "\n\nSMS analysis phase\n"); - fprintf (dump_file, "===================\n\n"); - } + { + fprintf (dump_file, "\n\nSMS loop discovery phase\n"); + fprintf (dump_file, "========================\n\n"); + } /* Build DDGs for all the relevant loops and hold them in G_ARR indexed by the loop index. */ + num_ddgs = 0; FOR_EACH_LOOP (li, loop, 0) { rtx head, tail; @@ -1512,7 +1534,7 @@ sms_schedule (void) instructions. The branch is rotated to be in row ii-1 at the end of the scheduling procedure to make sure it's the last instruction in the iteration. */ - if (! (g = create_ddg (bb, 1))) + if (! (g = create_ddg (loop, bb, 1))) { if (dump_file) fprintf (dump_file, "SMS create_ddg failed\n"); @@ -1523,12 +1545,38 @@ sms_schedule (void) if (dump_file) fprintf (dump_file, "...OK\n"); + num_ddgs++; + } + iv_analysis_done (); + + if (num_ddgs == 0) + { + if (dump_file) + fprintf (dump_file, "No suitable loops\n"); + goto done; } + + /* Initialize issue_rate. */ + if (targetm.sched.issue_rate) + { + int temp = reload_completed; + + reload_completed = 1; + issue_rate = targetm.sched.issue_rate (); + reload_completed = temp; + } + else + issue_rate = 1; + + /* Initialize the scheduler. */ + setup_sched_infos (); + haifa_sched_init (); + if (dump_file) - { - fprintf (dump_file, "\nSMS transformation phase\n"); - fprintf (dump_file, "=========================\n\n"); - } + { + fprintf (dump_file, "\nSMS transformation phase\n"); + fprintf (dump_file, "=========================\n\n"); + } /* We don't want to perform SMS on new loops - created by versioning. */ FOR_EACH_LOOP (li, loop, 0) @@ -1542,6 +1590,8 @@ sms_schedule (void) if (! (g = g_arr[loop->num])) continue; + add_edges_to_ddg (g); + if (dump_file) { rtx insn = BB_END (loop->header); @@ -1754,10 +1804,12 @@ sms_schedule (void) free_ddg (g); } - free (g_arr); - /* Release scheduler data, needed until now because of DFA. */ haifa_sched_finish (); + + done: + free (g_arr); + loop_optimizer_finalize (); } @@ -1844,6 +1896,7 @@ #define DFA_HISTORY SMS_DFA_HISTORY /* A threshold for the number of repeated unsuccessful attempts to insert an empty row, before we flush the partial schedule and start over. */ #define MAX_SPLIT_NUM 10 + /* Given the partial schedule PS, this function calculates and returns the cycles in which we can schedule the node with the given index I. NOTE: Here we do the backtracking in SMS, in some special cases. We have @@ -1896,7 +1949,7 @@ get_sched_window (partial_schedule_ptr p fprintf (dump_file, "=========== =========== =========== ===========" " =====\n"); } - /* Calculate early_start and limit end. Both bounds are inclusive. */ + /* Calculate early_start and limit start. Both bounds are inclusive. */ if (psp_not_empty) for (e = u_node->in; e != 0; e = e->next_in) { @@ -1905,26 +1958,36 @@ get_sched_window (partial_schedule_ptr p if (TEST_BIT (sched_nodes, v)) { int p_st = SCHED_TIME (v); - int earliest = p_st + e->latency - (e->distance * ii); - int latest = (e->data_type == MEM_DEP ? p_st + ii - 1 : INT_MAX); + int earliest = sat_subsp (sat_addsp (p_st, e->latency), + sat_mulpp (e->distance, ii)); + if (e->data_type == MEM_DEP) + { + start = MAX (start, earliest); + if (dump_file) + fprintf (dump_file, "%11d %11s %11s %11s", + earliest, "", "", ""); + } + else + { + early_start = MAX (early_start, earliest); + if (dump_file) + fprintf (dump_file, "%11s %11d %11s %11s", + "", earliest, "", ""); + } if (dump_file) { - fprintf (dump_file, "%11s %11d %11s %11d %5d", - "", earliest, "", latest, p_st); + fprintf (dump_file, " %5d", p_st); print_ddg_edge (dump_file, e); fprintf (dump_file, "\n"); } - early_start = MAX (early_start, earliest); - end = MIN (end, latest); - if (e->type == TRUE_DEP && e->data_type == REG_DEP) count_preds++; } } - /* Calculate late_start and limit start. Both bounds are inclusive. */ + /* Calculate late_start and limit end. Both bounds are inclusive. */ if (pss_not_empty) for (e = u_node->out; e != 0; e = e->next_out) { @@ -1933,20 +1996,30 @@ get_sched_window (partial_schedule_ptr p if (TEST_BIT (sched_nodes, v)) { int s_st = SCHED_TIME (v); - int earliest = (e->data_type == MEM_DEP ? s_st - ii + 1 : INT_MIN); - int latest = s_st - e->latency + (e->distance * ii); + int latest = sat_addsp (sat_subsp (s_st, e->latency), + sat_mulpp (e->distance, ii)); + if (e->data_type == MEM_DEP) + { + end = MIN (end, latest); + if (dump_file) + fprintf (dump_file, "%11s %11s %11s %11d", + "", "", "", latest); + } + else + { + late_start = MIN (late_start, latest); + if (dump_file) + fprintf (dump_file, "%11s %11s %11d %11s", + "", "", latest, ""); + } if (dump_file) { - fprintf (dump_file, "%11d %11s %11d %11s %5d", - earliest, "", latest, "", s_st); + fprintf (dump_file, " %5d", s_st); print_ddg_edge (dump_file, e); fprintf (dump_file, "\n"); } - start = MAX (start, earliest); - late_start = MIN (late_start, latest); - if (e->type == TRUE_DEP && e->data_type == REG_DEP) count_succs++; } @@ -1963,14 +2036,22 @@ get_sched_window (partial_schedule_ptr p /* Get a target scheduling window no bigger than ii. */ if (early_start == INT_MIN && late_start == INT_MAX) - early_start = NODE_ASAP (u_node); - else if (early_start == INT_MIN) - early_start = late_start - (ii - 1); - late_start = MIN (late_start, early_start + (ii - 1)); - - /* Apply memory dependence limits. */ - start = MAX (start, early_start); - end = MIN (end, late_start); + { + /* The default window (as given in the paper) is based on + the node's ASAP value, but shift or shrink it as necessary + in order to honor memory dependencies. */ + early_start = MIN (NODE_ASAP (u_node), end - (ii - 1)); + start = MAX (early_start, start); + } + else + { + end = MIN (end, late_start); + if (early_start == INT_MIN) + start = MAX (start, end - (ii - 1)); + else + start = MAX (start, early_start); + } + end = MIN (end, start + (ii - 1)); if (dump_file && (psp_not_empty || pss_not_empty)) fprintf (dump_file, "%11s %11d %11d %11s %5s final window\n", @@ -2060,8 +2141,8 @@ calculate_must_precede_follow (ddg_node_ SCHED_TIME (e->src) - (e->distance * ii) == first_cycle_in_window */ for (e = u_node->in; e != 0; e = e->next_in) if (TEST_BIT (sched_nodes, e->src->cuid) - && ((SCHED_TIME (e->src->cuid) - (e->distance * ii)) == - first_cycle_in_window)) + && (sat_subsp (SCHED_TIME (e->src->cuid), sat_mulpp (e->distance, ii)) + == first_cycle_in_window)) { if (dump_file) fprintf (dump_file, "%d ", e->src->cuid); @@ -2371,7 +2452,8 @@ compute_split_row (sbitmap sched_nodes, int v = e->src->cuid; if (TEST_BIT (sched_nodes, v) - && (low == SCHED_TIME (v) + e->latency - (e->distance * ii))) + && low == sat_subsp (sat_addsp (SCHED_TIME (v), e->latency), + sat_mulpp (e->distance, ii))) if (SCHED_TIME (v) > lower) { crit_pred = v;

13 years, 6 months

1
0
0 0

Update to lp:~rsandifo/+junk/loop-microbenchmarks

by Richard Sandiford

About three months ago, 4.7 stopped being able to optimise things like: int *__restrict x = ...; The (libav) loop microbenchmarks that I'd written used this construct a lot, as an easy way of automatically generating a whole function from a loop kernel. I spent a while testing 4.7 with the restrict patch reverted, while I caught up with my post-holiday email backlog and saw whether the effect on this code was deliberate. I eventually realised it was, so implemented a change that Ira had suggested: splitting out a peak_loop_1 that takes all the restrict pointers as arguments. I just realised that I never pushed that change back up to bzr, so I've done it now. Probably a write-only change, since I doubt anyone's going to be using the benchmark again, but just in case :-) Richard

13 years, 6 months

1
0
0 0

Patch drop: auto-inc-dec.c rewrite

by Richard Sandiford

This is my current 4.7 auto-inc-dec.c patch. I submitted an RFC in July: http://article.gmane.org/gmane.comp.gcc.patches/241779/ and updated the patch in line with the feedback I got. Steven Bosscher sent some very useful comments in private email, so the update deals with those as well as Bernd's public ones. If we do go ahead with this rewrite, it depends on the A9 pipeline description changes. I submitted some A8 and A9 changes here: http://article.gmane.org/gmane.comp.gcc.patches/244238/ http://article.gmane.org/gmane.comp.gcc.patches/244242/ but because I later noticed that the A9 didn't behave quite as I thought, I decided not to apply them. Ramana asked around internally about what the A9 actually does (thanks) and had some ideas. The patch also relies on the MEM rtx_costs patch that I just posted: http://lists.linaro.org/pipermail/linaro-toolchain/2011-December/001944.html Richard gcc/ * Makefile.in (auto-inc-dec.o): Depends on $(OPTABS_H) and addresses.h. * auto-inc-dec.c: Rewrite. Index: gcc/Makefile.in =================================================================== --- gcc/Makefile.in 2011-12-07 11:43:29.549238252 +0000 +++ gcc/Makefile.in 2011-12-29 09:24:51.066303201 +0000 @@ -3145,7 +3145,8 @@ alloc-pool.o : alloc-pool.c $(CONFIG_H) auto-inc-dec.o : auto-inc-dec.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \ $(TREE_H) $(RTL_H) $(TM_P_H) hard-reg-set.h $(BASIC_BLOCK_H) insn-config.h \ $(REGS_H) $(FLAGS_H) output.h $(FUNCTION_H) $(EXCEPT_H) $(DIAGNOSTIC_CORE_H) $(RECOG_H) \ - $(EXPR_H) $(TIMEVAR_H) $(TREE_PASS_H) $(DF_H) $(DBGCNT_H) $(TARGET_H) + $(EXPR_H) $(TIMEVAR_H) $(TREE_PASS_H) $(DF_H) $(DBGCNT_H) $(TARGET_H) \ + $(OPTABS_H) addresses.h cfg.o : cfg.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) $(FLAGS_H) \ $(REGS_H) hard-reg-set.h output.h $(DIAGNOSTIC_CORE_H) $(FUNCTION_H) $(EXCEPT_H) $(GGC_H) \ $(TM_P_H) $(TIMEVAR_H) $(OBSTACK_H) $(TREE_H) alloc-pool.h \

13 years, 6 months

1
0
0 0

Patch drop: Rework MEM rtx_costs

by Richard Sandiford

I originally wrote this patch as part of the auto-inc-dec work. I didn't submit it because I wasn't sure what value of extra_writeback_latency was appropriate for A9. (I was hoping to crib it from Ramana's pipeline description.) The patch introduces three new fields to the costs structure: one to control the latency of core loads, one to control the latency of NEON loads, and one to control the penalty of address writeback. The patch includes a tweak for cases where we use two VLDRs. That part should obviously be dropped if we change the move patterns to use something else. Richard gcc/ * config/arm/arm-protos.h (tune_params): Add core_mem_latency, neon_mem_latency and extra_writeback_latency. * config/arm/arm.c (arm_slowmul_tune, arm_fastmul_tune) (arm_strongarm_tune, arm_xscale_tune, arm_9e_tune, arm_v6t2_tune) (arm_cortex_tune, arm_cortex_a5_tune, arm_cortex_a9_tune) (arm_fa726te_tune): Populate the new tune_params fields. (arm_mem_cost): New function. (arm_rtx_costs_1): Use it. Index: gcc/config/arm/arm-protos.h =================================================================== --- gcc/config/arm/arm-protos.h 2011-08-09 15:01:14.000000000 +0100 +++ gcc/config/arm/arm-protos.h 2011-08-09 15:04:58.121984034 +0100 @@ -236,6 +236,9 @@ struct tune_params int l1_cache_size; int l1_cache_line_size; bool prefer_constant_pool; + int core_mem_latency; + int neon_mem_latency; + int extra_writeback_latency; int (*branch_cost) (bool, bool); }; Index: gcc/config/arm/arm.c =================================================================== --- gcc/config/arm/arm.c 2011-08-09 15:01:14.000000000 +0100 +++ gcc/config/arm/arm.c 2011-08-09 15:07:07.215103271 +0100 @@ -840,6 +840,9 @@ const struct tune_params arm_slowmul_tun 5, /* Max cond insns. */ ARM_PREFETCH_NOT_BENEFICIAL, true, /* Prefer constant pool. */ + 2, + 2, + 0, arm_default_branch_cost }; @@ -851,6 +854,9 @@ const struct tune_params arm_fastmul_tun 5, /* Max cond insns. */ ARM_PREFETCH_NOT_BENEFICIAL, true, /* Prefer constant pool. */ + 2, + 2, + 0, arm_default_branch_cost }; @@ -865,6 +871,9 @@ const struct tune_params arm_strongarm_t 3, /* Max cond insns. */ ARM_PREFETCH_NOT_BENEFICIAL, true, /* Prefer constant pool. */ + 2, + 2, + 0, arm_default_branch_cost }; @@ -876,6 +885,9 @@ const struct tune_params arm_xscale_tune 3, /* Max cond insns. */ ARM_PREFETCH_NOT_BENEFICIAL, true, /* Prefer constant pool. */ + 2, + 2, + 0, arm_default_branch_cost }; @@ -887,6 +899,9 @@ const struct tune_params arm_9e_tune = 5, /* Max cond insns. */ ARM_PREFETCH_NOT_BENEFICIAL, true, /* Prefer constant pool. */ + 2, + 2, + 0, arm_default_branch_cost }; @@ -898,6 +913,9 @@ const struct tune_params arm_v6t2_tune = 5, /* Max cond insns. */ ARM_PREFETCH_NOT_BENEFICIAL, false, /* Prefer constant pool. */ + 2, + 2, + 0, arm_default_branch_cost }; @@ -910,6 +928,9 @@ const struct tune_params arm_cortex_tune 5, /* Max cond insns. */ ARM_PREFETCH_NOT_BENEFICIAL, false, /* Prefer constant pool. */ + 2, + 2, + 1, arm_default_branch_cost }; @@ -924,6 +945,9 @@ const struct tune_params arm_cortex_a5_t 1, /* Max cond insns. */ ARM_PREFETCH_NOT_BENEFICIAL, false, /* Prefer constant pool. */ + 2, + 2, + 0, arm_cortex_a5_branch_cost }; @@ -935,6 +959,9 @@ const struct tune_params arm_cortex_a9_t 5, /* Max cond insns. */ ARM_PREFETCH_BENEFICIAL(4,32,32), false, /* Prefer constant pool. */ + 2, + 2, + 0, arm_default_branch_cost }; @@ -946,6 +973,9 @@ const struct tune_params arm_fa726te_tun 5, /* Max cond insns. */ ARM_PREFETCH_NOT_BENEFICIAL, true, /* Prefer constant pool. */ + 2, + 2, + 0, arm_default_branch_cost }; @@ -6848,6 +6878,41 @@ thumb1_rtx_costs (rtx x, enum rtx_code c } } +/* Return the cost in insns of a memory reference of mode MODE to + address ADDR. */ + +static int +arm_mem_cost (enum machine_mode mode, rtx addr) +{ + int count, base; + + count = ARM_NUM_REGS (mode); + if (TARGET_NEON + && (VALID_NEON_DREG_MODE (mode) + || VALID_NEON_QREG_MODE (mode) + || VALID_NEON_STRUCT_MODE (mode))) + { + base = current_tune->neon_mem_latency; + + if (count == 4 && (GET_CODE (addr) == PLUS || CONSTANT_P (addr))) + /* In this case we use two VLDRs. */ + return COSTS_N_INSNS (base + 2); + + /* Assume that one quad can be accessed each cycle. */ + return COSTS_N_INSNS (base + (count + 3) / 4); + } + + base = current_tune->core_mem_latency; + + if (count == 1 && GET_RTX_CLASS (GET_CODE (addr)) == RTX_AUTOINC) + /* On some targets (like A8), core accesses chained by address + register writeback cannot issue in consecutive cycles. + Pessimize writeback to account for this. */ + base += current_tune->extra_writeback_latency; + + return COSTS_N_INSNS (base + count); +} + static inline bool arm_rtx_costs_1 (rtx x, enum rtx_code outer, int* total, bool speed) { @@ -6860,9 +6925,7 @@ arm_rtx_costs_1 (rtx x, enum rtx_code ou switch (code) { case MEM: - /* Memory costs quite a lot for the first word, but subsequent words - load at the equivalent of a single insn each. */ - *total = COSTS_N_INSNS (2 + ARM_NUM_REGS (mode)); + *total = arm_mem_cost (mode, XEXP (x, 0)); return true; case DIV:

13 years, 6 months

1
0
0 0

Goodbye

by Ira Rosen

Hi, Thank you all for an interesting and pleasant experience. I am very grateful to Linaro for the opportunity to meet and work with such an amazing group of people. I wish you all the best, and hope to meet you again (at least online). You can find me at irar(a)il.ibm.com or ira.rsn(a)gmail.com. Ira

13 years, 6 months

1
0
0 0

[ACTIVITY] WW51

by Zhenqiang Chen

Summary: * Read armV7-A/R reference manual; crosstool-ng patches and wrapper scripts. Details: 1. Patches for crosstool-ng: * Fix symlink issue when CT_USE_SYSROOT is not enabled. * Update sample/linaro-arm-none-eabi (baremetal) to disable SYSROOT. So that both include and lib files are in the same dir. 2. Study armV7-A/R reference manual. 3. Validate embedded toolchain Dec. release. 4. Enhance the wrapper to use crosstool-ng for embedded toolchain. Plan: * Ramp-up on gcc. Best regards! -Zhenqiang

13 years, 6 months

1
0
0 0

[ACTIVITY] weekly status

by Revital Eres

Submitting patch for Bug #879725: http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01459.html Looking at the performance results running SMS with automatic testing. This is my last week in Linaro so I would also like to thank you all for the interesting year -- it was a great experience for me to work in this project. I wish you all good luck and happy holidays! Revital

13 years, 6 months

1
0
0 0

[ACTIVITY] weekly status

by Ken Werner

Hi, OpenEmbedded-Core: * No response on the CSL patches I posted to the ml yet * khem says someone (other than me) needs to try them * Linaro binary toolchain * Runs on Oneiric-X86_64 after installing lsb-core (interpreter: /lib/ld-lsb.so.3) * The do_rootfs tasks fails with runtine dependecy issues when using the external-linaro-toolchain_arm-2011.11.bb recipe. When re-using my CSL 2011.03 recipe with the linaro toolchain the error doesn't show up - strange. * OE-Core build gets confused by the (arm-linux-gnueabi-)pkg-config of the external linaro toolchain. As a workaround I just renamed this script. * The qemuarm MACHINE configuration uses "-march=armv5te -mno-thumb" Since the linaro toolchain defaults to thumb and -mno-thumb has no effect some inline assemblies are failing (i.e. on the umull insn). GNU #47930 suggests using -marm instead -> OE-Core patch posted. * Got the core-image-minimal to build, but it doesn't run yet (I suspect some basic runtime dependencies like libc again) * The build of the sato image fails (seems libtool and/or C++ related - need to investigate) Regards Ken

13 years, 6 months

1
0
0 0

Fwd: [ACTIVITY] w51

by Asa Sandahl

Hi, * Continued with comparison of eembc results for gcc4.4 and gcc 4.6 (FSF and Linaro). Collecting results for 4.6 with loop-unrolling turned off. * Working on a plotbench.py script that will use matplotlib for plotting the results. Right now the script plots the geomean value, for instance for eembc. I now try to make it plot all subtest as well. Then it should also show relative improvements instead of just the numbers, and then also sorted from best to worse. This script depends on Michaels script libtabulate.py for transforming the tabulated file back to python records. * Will be back January 9 /Regards Åsa

13 years, 6 months

1
0
0 0

[ACTIVITY] Nov 19 - Dec 22

by Ulrich Weigand

== GDB == * Ongoing work on remote support for "info proc" and core file generation. Completed implementation of latest solution via accessing arbitrary files on the remote site, only to run into a fundamental design problem ... so it's probably back to the previous approach. Discussion on the list is ongoing. * Fixed a GDB 7.4 test suite failure on ARM: PR tdep/12797 * Fixed another GBD 7.4 test suite failure on ARM, by enabling pthread_t thread debugging on core files. == GCC == * Patch review week. Mit freundlichen Gruessen / Best Regards Ulrich Weigand -- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

13 years, 6 months

1
0
0 0

cdce3.C execution fault

by Michael Hope

Hi there. I've looked further into the intermittent gcc/testsuite/g++.dg/cdce3.C test failures. Taking Ira's vectoriser-only fix-pr51301-4.6 branch and comparing it with it's predecessor r106845: * cdce3.o itself is identical across compilers * Fault occurs in a parallel test run as part of the normal auto build * Fault occurs every time * Fault occurs with a manual 'make check-gcc RUNTESTFLAGS="dg.exp=cdce*' * Fault doesn't occur when building from the command line * Fault doesn't occur after updating binutils I'm suspicious of the linker. The auto builders are Natty based and come with ld 2.21.0.20110327. Updating them to Oneiric's 2.21.53.20110810 clears the problem. I've saved the build trees. I see no reason not to commit ~ramana/gcc-linaro/fix-lp-900426 and ~irar/gcc-linaro/fix-pr51301-4.6. -- Michael

13 years, 6 months

3
4
0 0

[ACTIVITY] Nov 12 - Dec 16

by Ulrich Weigand

== GDB == * Ongoing work on remote support for "info proc" and core file generation. Implemented initial version of latest solution via accessing arbitrary files on the remote site. == GCC == * Started familiarizing myself with current status of various performance patches in programm, in preparation of my taking on GCC performance work next year. Mit freundlichen Gruessen / Best Regards Ulrich Weigand -- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

13 years, 6 months

1
0
0 0

[ACTIVITY] WW 50

by Zhenqiang Chen

Summary * make check-gcc on windows. * crosstool-ng patches Details: 1. Two patches for crosstool-ng: * Fix the compile error when CT_USE_SYSROOT is not "y". With this fix, we can config crosstool-ng to remove the symbol link for windows build. * Add scripts to build manual for newlib. 2. make check-gcc on Windows: * Wrap gcc/g++ for windows test. testglue.c should be compiled with gcc not g++. * Enhance scripts to convert path using "cygpath -w" 3. Analyze and root cause the pseudo new failed cases on windows. * gcc fail cases (gcc.dg/cpp/assert3.c, gcc.dg/cpp/include7.c and gcc.dg/cpp/trad/assert3.c) Root cause: " in options are removed in the test scripts. e.g. When reading gcc.log, you can find “-Aabc = jkl” in "Executing on host" as: Executing on host: …/cpp/assert3.c -A abc=def -A abc(ghi) "-Aabc = jkl" … But in spawn: the “” are removed. spawn …/cpp/assert3.c -A abc=def -A abc(ghi) -Aabc = jkl * g++ fail cases (dwarf2/lineno-simple1.C, dwarf2/pr44641.C and dwarf2/pr46527.C) The assembler codes generated from windows g++ and linux g++ are same except the PATH string. And all PASS on linux test. It seams the scripts can not grep the expected string on windows. * Tests on windows are not stable. For each test, there will have random fail cases (pass when retesting separately). Plans: * Create Makefile for embedded toolchain in linaro crosstool-ng. Best regards! -Zhenqiang

13 years, 6 months

1
0
0 0

Dependencies between Linaro toolchain and GDB/Linux Kernel

by Gibson, David

Hi, I was just wondering if anyone knows of any current or future dependencies with the Linaro toolchain (2011.11) and the Linaro release of GDB and the Linux Kernel? Is it considered safe to use the toolchain with the upstream releases of GDB and the Kernel, assuming that the versions of each are suitably compatible? Or are there potential dependencies on work that has been done in the toolchain? For example, new instructions supported in the compiler/assembler, or enhancements to the kernel/runtime library on the Linaro branch that would depend on them being in sync. Thanks, Dave.

13 years, 6 months

2
1
0 0

[ACTIVITY] weekly status

by Ken Werner

Hi, OpenEmbedded-Core: * the CSL 2011.03 recipe works with localization support disabled * got the OE-Core sato image to built (~250 source packages) * also built the Qt4 demo image (~100 source packages) to stress the C++ part of the toolchain * both are booting using qemu and seem to work just fine. * all of the Linaro members approved the request to contribute to OpenEmbedded * started to post patches onto the mailing list * briefly looked at the Linaro binary toolchain * the most recent one is dynamically linked * while the old one has old binutils (.21) that causes issue with --gc-sections * Currently the build is using the OE qemuarm machine configuration that uses a Yocto kernel and targets armv5te. This is something I'd like to look at too. Regards Ken

13 years, 6 months

2
1
0 0

[ACTIVITY] weekly status

by Revital Eres

Re-submitted the patch to estimate register pressure in SMS to the gcc-patches ml after discussing the patch with Richard.

13 years, 6 months

1
0
0 0

[ACTIVITY] 12th - 16th December

by Andrew Stubbs

Continued work on 64-bit shifts. - Improved 64-bit shifts without NEON (should benefit all cases). - Fixed bugs in constant shift code. - Rewrote 64-bit neon patch to take advantage of the new non-neon code, in the fall-back case. - Titied up the code, in general. - Rewrote SImode shift amount patch for new neon patches. The code produced now seems pretty good, but it still seems to choose which mode to use slightly haphazardly. The next step is to figure that out and benchmark the results. Had a few more attempts at getting the LAVA system to do something useful for me. I'm getting closer, but keep hitting new problems. Some of them my fault, and some are bugs in the system. Paul Larson has been very kindly helping me out and swatting the bugs. Didn't get much further with benchmarking for the generic tuning. This has been put on the back-burner while I work on the Neon shifts, and my test runs on my IGEPv2 A8 board have all been interrupted by power cuts or rendered useless by my forgetting to kill background tasks (such as Xorg). ---- Next weeks On vacation December 19th - January 3rd (returning January 4th).

13 years, 6 months

1
0
0 0

[ACTIVITY] 2011-12-16

by David Gilbert

== General == * Tidying things up and updating my list of statuses == String routines == * Adding strchr and strlen to eglibc; tests running at the moment. Dave

13 years, 6 months

1
0
0 0

[ACTIVITY] report week 50

by Peter Maydell

[short week, three days] RAG: Red: Amber: Green: Current Milestones: || || Planned || Estimate || Actual || ||upstream-omap3-cleanup || 2011-11-10 || 2011-12-15 || 2011-12-12 || ||cp15-rework || 2012-01-06 || 2012-01-17 || || ||initial-a15-system-model || 2012-01-27 || 2012-01-27 || || ||qemu-kvm-getting-started || 2012-03-04?|| 2012-03-04?|| || (for blueprint definitions: https://wiki.linaro.org/PeterMaydell/QemuKVM) Historical Milestones: ||add-omap3-networking || 2011-10-13 || 2011-10-13 || 2011-10-13 || ||a15-systemmode-planning || 2011-10-13 || 2011-10-13 || 2011-09-22 || ||a15-usermode-support || 2011-11-10 || 2011-11-10 || 2011-10-27 || == cp15-rework == * estimate pushed back a bit because I've ended up doing this in parallel with the other blueprints. Also exynos4210 review has taken some time. == upstream-omap3-cleanup == * split up the last handful of patches in the stack which were doing several things at once * this blueprint is now complete, meaning that the next stage of omap3 upstreaming will be cleaning up individual subsets of functionality to send upstream. This is all backburner level priority, though. == other == * reviewed most of Samsung's exynos4210 model * completed a conflict-heavy rebase of qemu-linaro (the result of MemoryRegion conversions for omap devices landing upstream) * LP:903239 : added linux-user support for some missing xattr syscalls that were causing build problems for apparmor -- PMM

13 years, 6 months

1
0
0 0

[ACTIVITY] December 11-15

by Ira Rosen

Hi, After learning how to control MEM_ALIGN and, therefore, alignment hints from the vectorizer, I was able to generate 64-bit hints (with the help of Ramana's patches). I saw a 16% improvement on a benchmark with stack variables, for which we now force alignment to 64 bits and create alignment hints, instead of using peeling. Ira

13 years, 6 months

1
0
0 0

[ACTIVITY] w50

by Asa Sandahl

Hi, * Finished an across-compilers report for benchmarks over the latest in FSF and Linaro series. Will start storing results in the linaro-toolchain-benchmarks bzr repository. * Looking closer at eembc results, especially regressions between gcc-4.4 and gcc-4.6. Did runs with gcc-linaro-4.4 with -fno-unroll-loop. Will continue analyze and try to present the result in a good way. * Reviewed Michael's geomean implementation. * I will be on Christmas holiday w52 and w01, will be back 9/1. /Regards Åsa

13 years, 6 months

1
0
0 0

Tidying up blueprints

by Michael Hope

Hi there. Could the toolchain team please have a look through the current GCC blueprints and update them? You can see a list and states at: http://apus.seabright.co.nz/helpers/backlog and for gcc-linaro only at: http://apus.seabright.co.nz/helpers/backlog/project/gcc-linaro Please check for any that: * are on your short-term todo list but aren't against your name * have been started but are stuck in the backlog or todo * are finished but not marked as such * are blocked * are duplicates or too undefined * or are obsolete I'm especially interested in: * "slp-supported-ops" * "sms-register-scheduling" * "better-block-operations" * "libraries-for-backlog" * "backport-conditional-execution" * "improve-peeling" * "64-bit-sync-pimitives" * "neon-strided-load-extract" If you've finished a significant amount of work on one blueprint then let me know. We can split that work out and push the rest back into the backlog. Also, let me know if you're blocked on final benchmarking. We can now easily benchmark a merge request and see the difference. -- Michael

13 years, 6 months

1
0
0 0

[ACTIVITY] 5th - 9th December

by Andrew Stubbs

Continued work on 64-bit neon operations. The negdi2 seems to be more difficult than previously thought - vneg won't do it, and there's no way to encode either "0-reg" or "not(reg)+1", so I'm shelving that idea for the moment, and moving on to one_compldi2_neon, which ought to be straight forward. Did the entire Linaro GCC release process, in the absence of Michael Hope, from source to announcement. The process didn't go as smoothly as I'd have liked, but I got through it, mostly. Hopefully Michael won't be travelling next time ... Tried to figure out how to do 64-bit shifts using a QImode shift amount. This promised to eliminate the unnecessary zero-extends, but it doesn't work because neither iwmmxt or neon registers are permitted to hold QImode values (presumably changing this would have consequences elsewhere?). Annoyingly, it's also not possible to put SImode values in (most) neon registers, so I'm not sure quite how to optimize the values. More investigation required.

13 years, 6 months

1
0
0 0

Agenda for performance call tomorrow

by Ramana Radhakrishnan

Hi, If folks want to put anything on the agenda can you please add it to the wiki page here ? https://wiki.linaro.org/WorkingGroups/ToolChain/Meetings/2011-12-13 Ramana

13 years, 6 months

1
0
0 0

[ACTIVITY] w49

by Asa Sandahl

Hi! This week was spent doing internal ST-E work, but related to the Linaro tcwg so I will give a short summary anyway. I have taken the Linaro toolchain (prebuilt by the Android working group) and used it in our internal Android build. There were several build errors, as expected when going from gcc-4.4.3, which is the default compiler in Android (Gingerbread) to gcc-4.6.2. Many errors were solved with patches from the Linaro Android distribution. Did some benchmarking related to web browsing: ARMBBench (load and rendering of web pages) - gave me 4-6% improvement with the Linaro toolchain. Sunspider and BroserMark (JavaScript) - gave me ~6% overall regression with the Lianaro toolchain. However, when zooming in to individual test cases - SunSpider consist of ~25 tests in 9 categories - the results are really scattered. A few tests are mainly contributing to the regression. I try to narrow things down to understand which code parts in v8 (the JavaScript engine) that causes the slowdown. Best regards Åsa

13 years, 6 months

1
0
0 0

[ACTIVITY] weekly status

by Revital Eres

Continue working on the patch to estimate register pressure on SMS: Addressing the comments received from Richard and Ayal. Testing the patch on libav micro benchmarks.

13 years, 6 months

1
0
0 0

[ACTIVITY] WW49

by Zhenqiang Chen

Summary * "make check-gcc" for linux gcc, cygwin gcc and native windows gcc. Details: 1. "make check-gcc" on linux. * One more failed case (gcc.dg/visibility-d) for the toolchain generated from crosstool-ng based on embedded toolchain code base. But logs show the .s files are the same. 2. "make check-gcc" on windows. * Dir format issue: Native windows programs require the disk symbol format as c:, d:, etc. But in cygwin, it is changed to /cygdrive/c, /cygdrive/d. Need wrapper to convert it. * qemu output in cygwin (Qemu-0.15.1-windows-Medium.zip from http://lassauge.free.fr/qemu/) qemu can not output the result like "*** EXIT 0" on screen. Need wrapper to handle it. * "make check-gcc" for cygwin toolchain (build from scratch in cygwin). You can run make check like it on linux. * "make check-gcc" for pre-installed binary toolchain (installed as native windows programs) a. configure gcc from the source package. (Only need the config*, Makefile to make sure "make check" work) b. reset the TEST_GCC_EXEC_PREFIX (site.exp) to the correct dir (INSTALL DIR) with the right format. c. wrap gcc/xgcc to use the pre-installed gcc and change the dir format. d. handle /usr/share/dejagnu/testglue.c (cp it to current test dir or convert it to windows path) Plan: * Handle g++ test on windows. * Work out a formal document or wiki page on how to "make check-gcc" on windows. * Test and analyze the failed cases. Best regards! -Zhenqiang PS: 1) qemu-system-arm.exe sample #!/bin/sh dir=`dirname $0` run () { # Change /cygdrive/e to e: para=`echo $* | sed -e 's/\/cygdrive\/e/e\:/'` # arm.exe is the real qemu-system-arm.exe # output to stdout.txt or stderror.txt. $dir/arm.exe $para | tee # output to screen cat $dir/stdout.txt } run $* 2) xgcc.exe sample #!/bin/sh run () { # Change /cygdrive/e to e: para=`echo $* | sed -e 's/\/cygdrive\/e/e\:/'` # Use a local copy of testglue.c rather than /usr/share/dejagnu/testglue.c para=`echo $para | sed -e 's/\/usr\/share\/dejagnu\/testglue.c/testglue.c/g'` # run the test with preinstalled binary toolchain #TBD: handle g++ arm-none-eabi-gcc.exe $para } run $* 3) TEST_GCC_EXEC_PREFIX in site.exp sample # Toolchain is installed at e:/Dec/RC3. TEST_GCC_EXEC_PREFIX "e:/Dec/RC3/lib/gcc/"

13 years, 6 months

2
2
0 0

[ACTIVITY] weekly status

by Ken Werner

Hi, * I've been debugging various errors and warnings that I encountered with the binary CSL 2011.03 toolchain * Fleshed out my recipe for the external toolchain; now get a working core-image-minimal that boots fine within qemu * Debugged why cmake based recipes (like libproxy) are having trouble when compiling with an external toolchain * Currently the libc is provided by the sysroot of the external toolchain. This might not be ideal and as time permits I'd like to find a way to get eglibc build instead. Regards Ken

13 years, 6 months

1
0
0 0

[Activity] Week ending 9th Dec.

by Ramana Radhakrishnan

Task Planned Estimated Actual Historical ~~~~~~~ Connect 2011.q4 preparation 28/10/2011 28/10/2011 28/10/2011 Linaro Tasks ~~~~~~~~~~~~ Fully Investigate the O3 performance regressions 31/01/2012 Neon backend experiments 09/12/2011 14/12/2011 with alignment hints and addressing mode work. Investigate partial-partial PRE and regression with bitmnp01 18/12/2011 Writeup on the optimizations 31/12/2011 enabled with PGO RAG : RED : None AMBER: ==Progress=== * The Android guys found a bug with the vcvt.f64.s32 instruction coming out after my patch and I found a few assembler issues as well during this process which are now fixed upstream. * Backported the A15 patches into Linaro 4.6 * Assisted as needed with the release which really wasn't too much work for me other than the revert . * Backported one part of the partial-partial PRE patch . Still looking into it. * Did some analysis of the failure with di-layout.c test failure and RichardS has now fixed it in the middle-end. * Wrote a patch to replace all vector mode aligned vldm / vstm with equivalent vld1.64 and vst1.64 to allow more alignment hints to come out of the compiler. Still not fully happy with it but it's looking much better than the original hack. === Plans === * Continue looking at partial-partial PRE and try and understand it further. * Flush out these neon patches that I'm accruing with the addressing modes and see where we get to with alignment hints and vld1.64's . * Look at movw's / movt's vs constant pools. * Submit my PGO patch . Absences. * Dec 19 - 31st Dec - Tentatively booked * Feb 6-10 : Linaro Connect Q1.12. * Feb 11- 15 : Holiday.

13 years, 6 months

1
0
0 0

[ACTIVITY] 2011-12-09

by David Gilbert

== QEMU == * Wrote a fix for bug 883133 (code buffer/libc conflict); spent some time testing it because I wasn't sure whether the crash I was seeing after that was my fix not being complete or actually bug 893208. * Got it to boot with -cpu 486; without that it's triple faulting in a divide just after a load of time stamp reads which makes me suspicious that 893208 is a timer problem. * (It also fails when used with vnc graphics, but works in SDL and curses, but I'll leave that bug for another time). == String routines == * With one more tweak to my memchr, it finally made it into eglibc. Dave

13 years, 6 months

1
0
0 0

ARM A15 support

by Matt Waddel

Hi, I received this question from an ARM FAE: Does the 4.5.2 version support A15 optimization? Or would you recommend using the latest 4.6 versions? Thanks for any response I could forward back to him. Best regards, Matt

13 years, 6 months

2
2
0 0

[ACTIVITY] Weekly status

by Richard Sandiford

== This week == * Got the -fsched-pressure code into a state where it's almost presentable. Found a few more things to tweak on the way. Fixed some FIXMEs, notably to honour MAX_SCHED_READY_INSNS. * More testing on ARM. Tried to get some SPEC2000 results as well as the usual EEMBC & DENbench, but I'm not sure how noisy the SPEC ones are. * More testing on powerpc. Decided that this really isn't a good target to test on for 4.7 because of the poor choice of pressure classes. SPEC CPU2006 INT results are reasonable-to-good, but the FP ones suffer from the fact that we think there are twice as many registers available for normal FP than there actually are. I'd like to fix this, but all pressure-estimation bits of GCC suffer from the same problem, and it's hard to justify as part of Linaro, because it doesn't apply to ARM. * Fixed upstream PR 50873 (ICE for NEON misaligned moves). Thanks to Ramana for the heads-up and analysis. * Retested and posted the patch for PR 48941 upstream (poor code generated by the vzip*() and vunzp*() arm_neon.h functions). Richard

13 years, 6 months

1
0
0 0

[ACTIVITY] Nov 05 - Dec 09

by Ulrich Weigand

== GDB == * Created and published Linaro GDB 7.3-2011.12 release. * Updated Linaro GDB 7.3 to GDB 7.3.1 code base. * Implemented support for single-stepping atomic operation code sequences for ARM (and Thumb) (LP #892008). Checked in to mainline and Linaro GDB. * Ongoing work on remote support for "info proc" and core file generation. Currently yet another solution for the remote interface has been brought up in mailing list discussions (support accessing arbitrary files on the remote side, not just /proc). I'm working on prototyping this suggestion. Mit freundlichen Gruessen / Best Regards Ulrich Weigand -- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

13 years, 6 months

1
0
0 0

[ACTIVITY] report week 49

by Peter Maydell

Current Milestones: || || Planned || Estimate || Actual || ||upstream-omap3-cleanup || 2011-11-10 || 2011-12-15 || || ||cp15-rework || 2012-01-06 || 2012-01-06 || || ||initial-a15-system-model || 2012-01-27 || 2012-01-27 || || ||qemu-kvm-getting-started || 2012-03-04?|| 2012-03-04?|| || (for blueprint definitions: https://wiki.linaro.org/PeterMaydell/QemuKVM) Historical Milestones: ||add-omap3-networking || 2011-10-13 || 2011-10-13 || 2011-10-13 || ||a15-systemmode-planning || 2011-10-13 || 2011-10-13 || 2011-09-22 || ||a15-usermode-support || 2011-11-10 || 2011-11-10 || 2011-10-27 || == initial-a15-system-model == * finished cleanup of mpcore private peripheral models, sent patchset == cp15-rework == * sketched out initial design of cp15 register cleanup * started implementation, making good progress == other == * upstream patch review: * inference rules for turning on ARM feature bits (v7 implies v6, etc) * Samsung are posting patches for Exynos4 support (~8500 lines); did initial quick-pass review, more to be done next week * made qemu-linaro 2011.12 release

13 years, 6 months

1
0
0 0

Release status

by Andrew Stubbs

Hi Michael, I have finally managed to complete the release process. It wasn't quite as smooth as I would have liked, but we seem the have got there! Notes: - Ramana's VCVT patch caused an Android problem. This was reverted right before the release. - The initial release spin and test went without a hitch. - There was an additional test failure in the GCC testsuite, but this turns out to be because the snapshot date "20121201" happens to contain the string "120". Interestingly, this will also be true for most of 2012. - The ubutest runs seem to have a some problems: all of the glibc and python builds have failed with a message about libgcc. Since this has hit both 4.5 and 4.6 simultaneously I'm assuming it's environmental and not caused by a new toolchain bug. The rest of the compilation appears fine. - The benchmarking seems fine on A9, but I couldn't find results for the others, although the scheduler lists the jobs. - The upload to Launchpad was somewhat problematic. Uploading 4.5 took two attempts. Uploading 4.6 failed about 6 times (at 20 minutes or so each) before I tried from another machine with a faster uplink - that went first time. Andrew

13 years, 6 months

1
0
0 0

Linaro GCC 4.6 & 4.5 2011.12 released

by Andrew Stubbs

The Linaro Toolchain Working Group is pleased to announce the 2011.12 release of both Linaro GCC 4.6 and Linaro GCC 4.5. Linaro GCC 4.6 2011.12 is the tenth release in the 4.6 series. Based off the latest GCC 4.6.2+svn181866, it contains a range of vectoriser performance improvements and general bug fixes. Interesting changes include: * Updates to 4.6.2+svn181866 * Generic tuing support for Big-endian platforms. * SLP support for operations with arbirary numbers of operands. * SLP support for conditions. * Pattern recognition support in basic-block SLP. * Enhancements to mixed-size condition pattern recognition. * Support for 64bit __sync* primitives on ARM. * Unaligned block-move support for ARMv7. * Added Cortex-A15 integer pipeline tuning. Linaro GCC 4.5 2011.12 is the sixteenth release in the 4.5 series. Based off the latest GCC 4.5.3+svn181877, this is a maintenance focused release. Interesting changes in 4.5 include: * Updates to 4.5.3+svn181877 The source tarballs are available from: https://launchpad.net/gcc-linaro/+milestone/4.6-2011.12 https://launchpad.net/gcc-linaro/+milestone/4.5-2011.12 Downloads are available from the Linaro GCC page on Launchpad: https://launchpad.net/gcc-linaro More information on the features and issues are available from the release page: https://launchpad.net/gcc-linaro/4.6/4.6-2011.12 https://launchpad.net/gcc-linaro/4.5/4.5-2011.12 Mailing list: http://lists.linaro.org/mailman/listinfo/linaro-toolchain Bugs: https://bugs.launchpad.net/gcc-linaro/ Questions? https://ask.linaro.org/ Interested in commercial support? inquire at support(a)linaro.org

13 years, 6 months

1
0
0 0

[ACTIVITY] December 4-8

by Ira Rosen

Hi, - fixed PR 51285 - continued looking at the alignment issue, ran Michael's script with different options, tested Ramana's preliminary patch for vld1/vst1, and my "don't peel for low loop bounds" patch Ira

13 years, 6 months

1
0
0 0

Linaro QEMU 2011.12 released

by Peter Maydell

The Linaro Toolchain Working Group is pleased to announce the release of Linaro QEMU 2011.12. Linaro QEMU 2011.12 is the latest monthly release of qemu-linaro. Based off upstream (trunk) QEMU, it includes a number of ARM-focused bug fixes and enhancements. New in this month's release: - There are no Linaro-specific changes of note in this release - This release is based on the upstream QEMU 1.0 release. (Note that future qemu-linaro releases will continue to track upstream trunk; the release dates for upstream and our release just happened to be conveniently aligned in this case.) Known issues: - Graphics do not work for OMAP3 based models (beagle, overo) with 11.10 Linaro images. - This release of qemu-linaro is known not to work on ARM hosts. (See bugs #883133, #883136) The source tarball is available at: https://launchpad.net/qemu-linaro/+milestone/2011.12 More information on Linaro QEMU is available at: https://launchpad.net/qemu-linaro

13 years, 6 months

1
0
0 0

Linaro GDB 7.3 2011.12 released

by Ulrich Weigand

The Linaro Toolchain Working Group is pleased to announce the release of Linaro GDB 7.3. Linaro GDB 7.3 2011.12 is the fourth release in the 7.3 series. Based off the latest GDB 7.3.1, it includes a number of ARM-focused bug fixes and enhancements. This release contains: * Update to GDB 7.3.1 code base * Support single-stepping atomic operations (LDREX/STREX sequences) The source tarball is available at: https://launchpad.net/gdb-linaro/+milestone/7.3-2011.12 More information on Linaro GDB is available at: https://launchpad.net/gdb-linaro

13 years, 6 months

1
0
0 0

Release notes for GCC 4.6

by Andrew Stubbs

Hi all, I've copied all those who made commits to GCC 4.6 this month. Could you please give me a sentence or two for the release notes? Thanks Andrew

13 years, 6 months

3
2
0 0

Effect of alignment and peeling on vectorised loops

by Michael Hope

I had a play with the vecotiser to see how peeling, unrolling, and alignment affected the performance of simple memory bound loops. The short story is: * For fixed length loops, don't peel * Performance is the same for 8 byte aligned arrays and up * Performance is very similar for unaliged arrays * vld1 is as fast as vldmia * vld1 with specified alignment is much faster than vld1 The loop is the rather ugly and artifical:: void op(struct ains * __restrict out, const struct aints * __restrict in) { for (int i = 0; i < COUNT; i++) { out->v[i] = (in->v[i] * 173) | in->v[i]; } } where `struct aints` is a aligned structure. I couldn't figure out how to use an aligned typedef of ints without still introducing a runtime check. I assume I was running into some type of runtime alias checking. This compiled into:: vmov.i32 q10, #173 add r3, r0, #5 0: vldmia r1!, {d16-d17} vmul.i32 q9, q8, q10 vorr q8, q9, q8 vstmia r0!, {d16-d17} cmp r0, r3 bne 0b I then lied to the compiler by changing the actual alignment at runtime. See: http://people.linaro.org/~michaelh/incoming/runtime-offset.png The performance didn't change for actual alignments of 8, 16, or 32 bytes. I then converted the loop into one using vld1 and fed it smaller alignments. See: http://people.linaro.org/~michaelh/incoming/small-offsets.png The throughput falls into two camps: one of alignments 1, 2, or 4 and one of 8, 16, 32. The throughput is very similar for both camps but has some stange dropoffs at 24 words, around 48 words, and around 96 words. The terminal throughput at 300 words and above is within 0.5 % I then converted the vld1 and vst1 to specifiy an alignment of 64 bits. See: http://people.linaro.org/~michaelh/incoming/set-alignment.png This improved the throughput in all cases and in cases for more than 50 words by 14 %. This graph also shows the overhead of the runtime peeling check. The blue line is the vectoriser version which is slower to pick up due the greater per call overhead. I then went back to the vectoriser and changed the alignment of the struct to cause peeling to turn on and off. See: http://people.linaro.org/~michaelh/incoming/unroll.png At 200 words, the version without peeling is 2.9 % faster. This is partly due to a fixed count loop turning into a runtime count due to unknown alignment. This run also showed the affect of loop unrolling. The loop seems to be unrolled for loops of <= 64 words and drops off in performance past around 8 words. When the unrolling finally drops out, performance increases by 101 %. Raw results and the test cases are available in lp:~linaro-toolchain-dev/linaro-toolchain-benchmarks/private-runs A graph of all results is at: http://people.linaro.org/~michaelh/incoming/everything.png The usual caveats apply: this test was all in L1, only on the A9, and very artificial. -- Michael

13 years, 6 months

3
5
0 0

Re: Static Library startup

by Dave Martin

> On Mon, Dec 5, 2011 at 1:40 AM, Tom Gall <tom.gall(a)linaro.org> wrote: > > I probably know the answer to this already but ... > > > > For shared libs one can define and use something like: > > > > void __attribute__ ((constructor)) my_init(void); > > void __attribute__ ((destructor)) my_fini(void); > > > > Which of course allows your lib to run code just after the library is > > loaded and just before the library is going to be unloaded. This helps > > keep out cruft such as the following out of your design: > > > > PleaseCallThisLibraryFunctionFirstOrThereWillBeAnErrorWhichYouWillHitCausingYouToPostToTheMailingListAskingTheSameQuestionThatHasBeenAsked1000sOfTimes(); > > > > Yeah .. you know the function. I don't like it either. > > > > Unfortunately this doesn't work when people link in the .a from your > > lib. Libs like libjpeg-turbo in theory should never ever need to be > > linked in that fashion but consider the browsers who link to the > > universe instead of using system shared libs. On Mon, Dec 05, 2011 at 04:19:11PM +0800, Kito Cheng wrote: > Here is some triky way for this problem, you can put the constructor > and destructor to the source file which contain necessary function > call in your libraries to enforce the linker to archive your > constructor and destructor. > > However if this solution is not work for your situation, you can apply > the patch in attach for build script to enable the > LOCAL_WHOLE_STATIC_LIBRARIES for executable, > > After patch you can just add a line in your Android.mk : > > LOCAL_WHOLE_STATIC_LIBRARIES += libfoo > > The most disadvantage of this way is you should always link libfoo by > LOCAL_WHOLE_STATIC_LIBRARIES...and this patch don't send to linaro and > aosp yet. [...] Part of the problem here is that .a libraries lack the dependency and linkage metadata that shared libraries have. -2) Put up with the need to call an explicit initialisation function for the library. A lot of commonly-used libraries require an initialisation call, and I'm not sure it causes that much of a problem in practice... -1) Put a C++ wrapper around just enough of your library such that your constructor/destructor code is recognised as a needed static constructor/descructor by the toolchain. I can't think of a very nice way of doing this, so I won't elaborate on it... It's also not really a solution, since you still need to pull in a dummy static object from somewhere in order to cause the construcor and descructor to get called. 0) libtool or similar may help solve this problem, but I don't know much about this -- also, for solving the problem, that approach only works if uses of your library link via libtool. 1) One hacky approach is to rename your library to libmylib-real.a, and then make replace libmylib.a with a linker script which pulls in the needed constructor as well as the real library: libmylib.a: EXTERN(__mylib_constructor) INPUT(/path/to/libmylib-real.a) This works, providing that __mylib_constructor is external (normally, you would be able have the constructor function static, but it needs to be externally visible in order to be pulled in in this way. 2) Another way of doing a similar thing is to mark __mylib_constructor as undefined in all the objects that make up the library. Unfortunately, there seems to be no obvious way of doing that: the assembler generates undefined symbol references automatically for unresolved references at assembly time. There's no way for force the existence of an undefined symbol without an actual reference to it. objcopy/elfedit don't seem to support adding such a symbol either. It would be simple to write a tool to add the undefined symbol reference (such tools may exist already), but binutils doesn't seem to provide this for you. The plausible-looking -u option to gcc doesn't do anything unless doing a link. One other way of doing it without a special tool is to insert a bogus relocation into the text section of each object with an assembler .reloc directive specifying relocation type R_<arch>_NONE. There isn't really a portable way to do that, though. The name of the relocation changes per-arch, and some arches have other quirks (on ARM for example, .reloc cannot refer to the current location, but seems instead to need to refer to a defined symbol which is non-zero distance away from the location counter). One advantage to this approach is that your .a file looks just like any other .a file. Also, you can include that dependency in only those objects which really require the library to be initialised (normally, this is not a huge benefit though, since probably most of your objects _do_ require the library to be initialised). A disadvantage (other than portability problems) is that, like (1), the constructor symbol must be external (not static)... so it pollutes the symbol table and isn't protected against people calling it directly. You can create a dummy symbol instead of referring to the constructor symbol directly though -- this solves the second problem. 3) Finally, you can split your contructor/destructor code out into a separate .o file (say mylib-ctors.o), and use the linker script trick for (1) to forcibly include this object when linking: libmylib.a: INPUT(/path/to/mylib-ctors.o /path/to/mylib-real.a) This avoids some of the disadvantages of the other approaches, but you still end up with a strange-looking library which is really a linker script. This is closer to how the C library traditionally solves the problem (i.e., the crt*.o stuff). libc.so also tends to be a linker script, which deals with the fact that some parts of libc must be statically linked from a separate library when linking to -lc. Obviously, approaches (1)..(3) all suffer from arch or toolchain portability problems (or both). (The GNU/GCC __constructor__ thing is obviously a portability problem in itself, it you're minded to care about it.) Cheers ---Dave

13 years, 6 months

2
3
0 0

[ACTIVITY] 28th November - 2nd December

by Andrew Stubbs

* Linaro GCC Continued work on 64-bit shift / extend / etc. in NEON. I have posted an RFC to the gcc-patches list in the hope of getting some feedback on how best to fix this. No response yet. Hopefully some of the Linaro guys are at least looking at it ... Merged FSF GCC 4.5 and 4.6 into the Linaro GCC release branches prior to the release next week. Set more benchmarking work running in my ongoing investigation into generic tuning. Did a dry run of the extra release testing Michael normally does. It failed. Michael says he's fixed it now, but I know how to do my bit, so fingers crossed. * Other Experienced some IT/connectivity outages within Mentor. Resolved now.

13 years, 6 months

1
0
0 0

Activity - Week ending 2nd Dec 2011.

by Ramana Radhakrishnan

==Progress=== * Off sick on Monday * Systematic testing duty - few Aarch64 issues. * Linaro patch review duty. * Tested my vcvt fixed point patch and close to committing. * Worked on sometime on movw / movt for symbol references rather than constant pools . While this gives nice benefits it's a code size hog and needs further investigation. * PGO patch being tested finally and should go back up for review. === Plans === * Release week next week. * Start looking at partial_partial PRE. * Finish committing by backlog of patches. Absences. * Dec 19 - 31st Dec - Tentatively booked * Feb 6-10 : Linaro Connect Q1.12/

13 years, 6 months

1
0
0 0

[ACTIVITY] WW48

by Zhenqiang Chen

Summary: * Patch linaro crosstool-ng. * Windows install package Details: * Patch linaro crosstool-ng: * Back port upstream patches. * Check-in the zlib/libiconv/expat/ncurses related patches to linaro branch. * Create reference windows install package for linaro toolchain from installjammer. The install process works well on Win7. Plans: * Investigate test on Windows. Best regards! -Zhenqiang

13 years, 6 months

1
0
0 0

[ACTIVITY] weekly status

by Ken Werner

Hi, OpenEmbedded: * started on creating a receipts to compile the "core-image-minimal" using an external prebuilt toolchain (csl arm-2011.03) * there are still a lot of warnings at the do_package/do_package_qa task * the good news is that the build process finishes and kernel plus root file system image gets created * the bad news is that the rootfs lacks some important libs like libc and therefore won't run under qemu-system-arm (since init, busybox, etc. are dynamically linked) * currently a 3-lines hack on oe-core is required to be able to overwrite a task of the generic glibc receipt; all other files could go into a separate layer Linaro Android: * had a quick look into the EABI attribute tag issue Regards Ken

13 years, 7 months

1
0
0 0

[ACTIVITY] 2011-12-02

by David Gilbert

== String routines == * Sent updated memchr to the eglibc list == 64 bit atomics == * Ran a set of timing consistency tests that a colleague had sent me while I was off; Panda passed those, so time doesn't appear to be going backwards or anything, so that's not the problem with membase. * Pushed the code into linaro-gcc. == QEmu == * Tested Peter's prerelease - all good. * Started looking at the issues for running in TCG mode on ARM == Other == * Read through the ARMv8 instructions docs that landed on arm.com; quite interesting. Note that multiple instruction IT blocks are listed as being deprecated for 32bit mode on v8 (although this will work but it can be put in a mode to fault you to make it easy to find the uses). * Some debugging of Panda odd timing issue with Paul Mckenney. Dave

13 years, 7 months

1
0
0 0

[ACTIVITY] report week 48

by Peter Maydell

RAG: Red: Amber: Green: Current Milestones: || || Planned || Estimate || Actual || ||upstream-omap3-cleanup || 2011-11-10 || 2011-12-15 || || ||cp15-rework || 2012-01-06 || 2012-01-06 || || ||initial-a15-system-model || 2012-01-27 || 2012-01-27 || || ||qemu-kvm-getting-started || 2012-03-04?|| 2012-03-04?|| || (for blueprint definitions: https://wiki.linaro.org/PeterMaydell/QemuKVM) Historical Milestones: ||add-omap3-networking || 2011-10-13 || 2011-10-13 || 2011-10-13 || ||a15-systemmode-planning || 2011-10-13 || 2011-10-13 || 2011-09-22 || ||a15-usermode-support || 2011-11-10 || 2011-11-10 || 2011-10-27 || == qemu-kvm-getting-started == * now reasonably set up to run KVM under Fast Model; howto is here: https://wiki.linaro.org/PeterMaydell/A15OnFastModels * rebased kvm patches into qemu-linaro * fixed bug where we weren't passing cpu number to kvm properly when delivering an interrupt * sent some minor patches to upstream qemu that will be needed for kvm (eg configure script tweaks) == initial-a15-system-model == * started on cleaning up a9/11mpcore private peripheral implementation; now mostly done and looking much better as a base for a15 == other == * preparation for qemu-linaro release (rolled tarball, tested) * submitted patch to fix buffer overrun in GIC model * discussion: linux-user mode race conditions, and in particular how we should handle signals that arrive during syscall emulation * upstream patch review: imx31 round 3 -- PMM

13 years, 7 months

1
0
0 0

[ACTIVITY] Nov 28 - Dec 02

by Ulrich Weigand

== GDB == * Completed new set of patches to support both "info proc" and core file generation across the remote protocol, and posted them to the mailing list for review. * Tested GDB trunk in preparation for 7.4 release branch point on multiple platforms; analyzed and fixed a couple of problems, some also present on ARM in remote testing. Patches checked in to mainline. Mit freundlichen Gruessen / Best Regards Ulrich Weigand -- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

13 years, 7 months

1
0
0 0

[ACTIVITY] Weekly status

by Richard Sandiford

== This week == * More on -fsched-pressure. Testing on POWER7 showed a degenerate case that I'd failed to handle well. Fixed that. Saw that part of the problem on POWER7 was that IRA was using a combination of GENERAL_REGS and CR_REGS as a single pressure class, so there appeared to be 39 registers available for storing integers. Fixed (or worked around) that. Tweaked a few other things too. The only denbench result that I wasn't happy with was RSA, where both forms of -fsched-pressure are significantly worse than -fno-sched-pressure. Tracked down the cause of that. We had a block BB1: A: (set (reg:DI X) Y) B: (clobber (reg:DI Z)) C: (set (subreg:SI (reg:DI Z) 0) (... X ...)) D: (set (subreg:SI (reg:DI Z) 4) (...)) where B makes sure that Z is treated as dead before C. Interblock motion causes B to be scheduled in an earlier block, but none of the other instructions can be. This means that, when we schedule BB1, it still contains A, C and D, and Z now appears to be live on entry to the block. C therefore appears to reduce register pressure, because it contains the last use of X, and appears to leave Z's liveness unaffected. In reality it should be treated as increasing register pressure by 1 (-1 for the death of X, +2 for the birth of Z). I "fixed" this by moving C's dependencies to B, a bit like we do for scheduling groups (although none of the other handling of scheduling groups should apply). This made a big difference, so that the new code is a win on RSA. There's still one SPEC2006 degradation on POWER7 that I want to look at. * Caught up on a lot of mail. gcc-patches backlog has gone down from ~4900 when I got back to ~500. * Briefly looked at x86's drap support, to see what would be needed for ARM. Didn't look for long though: the overhead seems excessive for optional alignment, and the agreement seemed to be that 128-bit alignment wouldn't really make much of a difference anyway. Richard

13 years, 7 months

1
0
0 0

[ACTIVITY] w48

by Asa Sandahl

Hi! * Continued with running eembc, coremark, denbench and spec2k on the ursas with the latest of the Linaro and FSF series. The variants used were o3-neon and o3-neon-novect. Something went wrong with the variants the first time, so I had to rerun the tests once. Discussed draft report with Michael, next week I will share with the rest of the team. * Did a rerun SPEC2K runs with "train" and "ref" data sets. I did -o2 and -o3 runs on a panda with the two data sets. Asked for a sanity check of the numbers. * Prepared and held a presentation about the tcwg internally. * Will be tied up with internal work for the most of w49. Best regards Åsa

13 years, 7 months

1
0
0 0

[ACTIVITY] Nov 27 - Dec 1

by Ira Rosen

Hi, - Ran eon with gcc 4.7: there are much more loops similar to the one in lp#831094 that get vectorized (due to some data ref analysis improvement), so the impact of disabling peeling for such loops (i.e. loops with low loop bound) is even bigger than for 4.6, and vectorization improves the performance by 2.5%. I prefer to understand the peeling/alignment situation better and not just commit this patch (and I spent some time trying to do that). - Fixed PR 51301 - a bug in over-promotion pattern. Proposed for merge to gcc-linaro-4.6. - Merged the last SLP patch to gcc-linaro-4.6. Ira

13 years, 7 months

1
0
0 0

gcc4.6,how to remove werror

by tknv

Hello,When I compile armel kernel by gcc4.6 on Linux du 3.0.0-12-generic-pae #20-Ubuntu SMP Fri Oct 7 16:37:17 UTC 2011 i686 i686 i386 GNU/Linux. tknv@du:~$ arm-linux-gnueabi-gcc -v Using built-in specs. COLLECT_GCC=arm-linux-gnueabi-gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabi/4.6.1/lto-wrapper Target: arm-linux-gnueabi Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.6.1-9ubuntu3' --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.6 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/arm-linux-gnueabi/include/c++/4.6.1 --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-objc-gc --enable-multilib --disable-sjlj-exceptions --with-arch=armv7-a --with-float=softfp --with-fpu=vfpv3-d16 --with-mode=thumb --disable-werror --enable-checking=release --build=i686-linux-gnu --host=i686-linux-gnu --target=arm-linux-gnueabi --program-prefix=arm-linux-gnueabi- --includedir=/usr/arm-linux-gnueabi/include --with-headers=/usr/arm-linux-gnueabi/include --with-libs=/usr/arm-linux-gnueabi/lib Thread model: posix gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3) and Makefile KBUILD_CFLAGS := -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \ -fno-strict-aliasing -fno-common \ -Wno-unused-but-set-variable \ -Wno-unused-parameter \ -Wno-array-bounds \ -Wno-format-security \ -fno-delete-null-pointer-checks and make WERROR=0 but error: array subscript is above array bounds [-Werror=array-bounds] cc1: all warnings being treated as errors How to remove werror all ? thanks. -- w.tknv/

13 years, 7 months

3
11
0 0

plans for QEMU support for KVM on ARM

by Peter Maydell

This email is just a quick summary of what we (Linaro) are planning in the way of QEMU work to support KVM on ARM Cortex-A15. The idea is to let people know what's coming up, find out if we've forgotten anything, and avoid people duplicating work unnecessarily. Most of this is based on a useful session at the recent 'ARM server mini-summit' in Orlando (UDS/Linaro Connect) at the beginning of this month. The work we're currently proposing to do falls into three parts: * refactor QEMU's cp15 register handling At the moment QEMU handles cp15 accesses by calling out to a single helper function which is an enormous set of nested switch statements to handle the different coprocessor registers. Access permissions are checked separately at translate time. This design makes specifying board-dependent or cpu-dependent registers somewhat painful; it's also easy for the access permission checks to be out of sync. There is no support for banked cp15 registers either (needed for trustzone and virtualisation). We need a better design which lets a board or core register handler routines for cp15 registers. This will make the code cleaner and more maintainable as a base for new features. This isn't strictly a requirement for KVM, but we're going to want KVM to be able to hand off cp15 accesses to QEMU, and I don't think that's going to be maintainable or reliable without this refactoring. (https://blueprints.launchpad.net/qemu-linaro/+spec/cp15-rework) * A15 system model Basically a QEMU model of a Versatile-Express with a Cortex-A15 minus the virtualization and LPAE extensions. This needs the A15 private peripherals (just the GIC in the right place in the memory map, really; generic timer not required) and the new memory map version of the vexpress board model, plus some new cp15 registers. (Bill Carson has already done some patches in this area but they need a little rework and may have minor missing pieces.) https://blueprints.launchpad.net/qemu-linaro/+spec/initial-a15-system-model * miscellaneous integration work We're aiming for a reasonable working prototype of A15 guest on an A15 Fast Model host here; we need to fix at least some of the bugs which currently mean upstream QEMU doesn't work on ARM hosts, sort out which kernel and qemu trees we are developing from, and get things running in our validation lab's continuous integration setup. https://blueprints.launchpad.net/qemu-linaro/+spec/qemu-kvm-getting-started Also on the radar is a fourth piece of work: * QEMU virtio-mmio support This is adding support for the 'mmio' virtio transport, which will allow virtio support in a versatile-express model. We're going to need this at some point but the current thought is that we want to do the above listed more important bits of work first... (The exception would probably be if it turned out that this was sufficiently useful for making early KVM development easier) https://blueprints.launchpad.net/qemu-linaro/+spec/add-amba-virtio-support So, questions: (1) did we forget something important? (2) is anybody else already planning to do any of this (or would like to start)? if so we should coordinate... (3) is there anything that the kernel folk need/want earlier rather than later? thanks -- PMM

13 years, 7 months

6
18
0 0

launchpad / merge requests and upstreaming patches.

by Ramana Radhakrishnan

Hi, Now that upstream trunk is in stage3 and we have a few patches that won't really make it upstream until stage1 is reopened is it worthwhile having a new status in the merge requests that moves it into a to_upstream status . The other option is to have a common spreadsheet that we keep updating with links to merge requests that need to be upstreamed . Thoughts ? Ramana PS - Any clue on what's happening with the branch diff bug that's been open in launchpad forever now ?

13 years, 7 months

2
3
0 0

[ACTIVITY] November 20-24

by Ira Rosen

Hi, * Worked on peeling problem in eon (#831094). Wrote a patch that checks if the number of vector iterations is going to be more than 2, and disables peeling otherwise. With this patch I see about 1.5% regression with vectorization (and about 7% without it). * I am thinking to extend the patch for unknown number of iterations by creating a run-time check. The threshold could be set by param. Another option, could be doing it through the cost model, but it's hard to evaluate costs when misalignments are unknown (and, I think, the cost model handles known misalignment properly). * Disabling peeling for low loop bounds also helps with one of EEMBC benchmarks, for which vectorization with double-words is more beneficial than with quad-words. It turns out that we are able to force the alignment for double-words (and, therefore, avoid peeling), because we check that the required alignment (64 in this case) is less or equal to BIGGEST_ALIGNMENT, where arm.h:#define BIGGEST_ALIGNMENT (ARM_DOUBLEWORD_ALIGN ? DOUBLEWORD_ALIGNMENT : 32) and arm.h:#define DOUBLEWORD_ALIGNMENT 64 So, we can never force alignment for 128 bits on ARM. I wonder if that's a real limitation. * Proposed three SLP patches to gcc-linaro, and merged two of them. Ira

13 years, 7 months

1
1
0 0

[ACTIVITY] weekly status

by Revital Eres

Addressing the comments received from Richard and Ayal regarding the patch to estimate register pressure. Testing the patch on eembc and libav micro benchmarks. Looking at the regressions seen with SMS.

13 years, 7 months

1
0
0 0

[ACTIVITY] report week 47

by Peter Maydell

RAG: Red: Amber: Green: KVM/QEMU work blueprints set up Current Milestones: || || Planned || Estimate || Actual || ||upstream-omap3-cleanup || 2011-11-10 || 2011-12-15 || || ||cp15-rework || 2012-01-06 || || || ||initial-a15-system-model || 2012-01-27 || || || ||qemu-kvm-getting-started || 2012-03-04?|| || || (for blueprint definitions: https://wiki.linaro.org/PeterMaydell/QemuKVM) Historical Milestones: ||add-omap3-networking || 2011-10-13 || 2011-10-13 || 2011-10-13 || ||a15-systemmode-planning || 2011-10-13 || 2011-10-13 || 2011-09-22 || ||a15-usermode-support || 2011-11-10 || 2011-11-10 || 2011-10-27 || == qemu-kvm-getting-started == * sorted out how to cross compile QEMU (involved an upgrade to Oneiric) * documented this and how to put together other required pieces at https://wiki.linaro.org/PeterMaydell/A15OnFastModels * started porting Christoffer's KVM patch forward to current QEMU (compiles, not yet tested) == other == * A15 blueprints etc now sorted -- summary at https://wiki.linaro.org/PeterMaydell/QemuKVM (includes definition of what the above blueprint/milestones are) * upstream patch review (imx.31 board patches, sp804 timer cleanup)

13 years, 7 months

1
0
0 0

[ACTIVITY] Nov 22 - Nov 25

by Ulrich Weigand

== GDB == * Ongoing work on support for cross-platform core file generation. Posted a new design proposal to the mailing list to include not only "info proc mappings", but *all* "info proc" commands. This would involve a remote protocol command to read arbitrary proc files, instead of a specific command to retrieve the memory map. * Investigated Launchpad bug: #891970 msp430-gdb segmentation fault with target remote == GCC == * Patch review week. Mit freundlichen Gruessen / Best Regards Ulrich Weigand -- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

13 years, 7 months

1
0
0 0

[ACTIVITY] 21st - 25th November

by Andrew Stubbs

Worked on adding support for 64-bit NEON integer shifts. I have this working now, although I'm still not very happy about how the register allocator chooses which mode to use - it prefers core-registers if the values start or end in core-regs, even though moving to values to NEON registers might be more efficient (general 64-bit shifts in core registers require several instructions). I've also had to mark the CC register clobbered in all cases, even though it only gets clobbered in some of them, which might be necessary, but isn't very satisfactory. The NEON shifts work showed that 32->64 bit extends could be done better also. This hasn't been a great problem up to now, but the shift amount (in particular) is typically a 32-bit value and yet needs to be zero-extended to 64-bit for NEON's purposes. Right now, GCC prefers to extend the value in core-registers, and then copy it to NEON. This works, but burns another core-register - a scarce commodity - so I think it would be better to copy it first, and then extend it after. NEON has instructions for this, so I'm investigating how to get the compiler to do it (this is all strictly post-combine, so the usual options are out, and the register allocator has to be allowed to do it the old way in the case where core-regs really are the best option, so it's tricky).

13 years, 7 months

1
0
0 0

[ACTIVITY] WW47

by Zhenqiang Chen

Summary: * Upstream crosstool-ng patches. * Create windows install package from installjammer. * Investigate link issues. Details: * crosstool-ng patches. * Patches for newlib extra config, gdb extra config, pch, nls option are committed to crosstool-NG upstream. * The dependant library patches are in discussion. * Learn installjammer and integrate it to scripts to create windows install package. * Investigate warning message from link when linking the prebuilt zlib for migw32 host. It might be OK with static link, but migh fail with dynamic link on windows. For i586-mingw32[msvc] host, lots of messages like libtool: link: Could not determine host path corresponding to ... For i386-mingw32 host: In addition to the message in i586-mingw32 build, output the following message *** Warning: linker path does not have real file for library -lz. ... Plans: * Build and test. Absences: * Nov 29, 30: Trainings. Thanks! -Zhenqiang

13 years, 7 months

1
0
0 0

ICS + linaro-gcc 4.6 = working :)

by Bernhard Rosenkränzer

Hi, Good news -- I just built a version of ICS with the current version of linaro-gcc. Panda build here: http://people.linaro.org/~bernhardrosenkranzer/boot.tar.bz2 http://people.linaro.org/~bernhardrosenkranzer/system.tar.bz2 http://people.linaro.org/~bernhardrosenkranzer/userdata.tar.bz2 Use linaro-android-media-create as usual to install. This is not yet a build that we can reproduce inside android-build because I've had to cheat by swapping out linkers in a couple of places (just using current binutils the way we normally do produces a build that doesn't boot, using binutils built from the AOSP source release works, but the prehistoric linker doesn't know about "dmb st", can't link u-boot, can't link the kernel, and strangely enough can't link some components of ICS - apparently the binaries they ship have some extra patches in). But the good news is that every part is built with our compiler - there's nothing in the way of using that (aside from the code insanities I've already fixed). I'll work on sorting out binutils now... ttyl bero

13 years, 7 months

1
0
0 0

[ACTIVITY] weekly status

by Ken Werner

Hi, I've spent most of my time to dig into OE. First I started with OE (classic); then realized that OE-core is where the future happens and switched to it. I've set up a build system and got a ARM minimal image to build that boots in QEMU *yay*. In parallel I've been reading the manual and looked into the receipts to find out what toolchain they are using (gcc-4_6-branch plus patches). Next step is to get the OE-core built using the Linaro-GCC. Regards Ken

13 years, 7 months

1
0
0 0

[ACTIVITY] Weekly status

by Richard Sandiford

== This week == * Looked at the MIPS _unpack_d bug. libgcc.a did have a definition, and Michael couldn't reproduce with his build, so the bug report is now marked as Incomplete. * Backported patch for PR 48190 to upstream 4.6 and 4.5. * Reviewed Revital's SMS register-pressure patch. * More on -fsched-pressure. I now have a version that I'm happy with as far as ARM goes, in that it usually seems to produce code that is no worse than the better of currect -fsched-pressure and current -fno-sched-pressure. (I'm sure there's a better way of saying that.) In some cases it is better than both. * Continued trying to catch up on mail. == Next week == * Clean up the -fsched-pressure code (it's still in its "experimental mess" state). Try it on Power. * Resurrect vzip and vunzp patch after Richard E said he wouldn't object. Richard

13 years, 7 months

1
0
0 0

[ACTIVITY] w47

by Asa Sandahl

Hi! * Ran eembc, coremark, denbench and spec2k on the ursas with the latest of the Linaro and FSF series. The variants used were o3-neon and o3-neon-novect. I first got a c++ related build error when using 4.4.x compilers, the was error caused by symbol versioning. Michael's explanation: "We want to use the gcc-4.4.5 libstdc++ when building and running. However, when running c++ itself, it links in /usr/lib/libppl_c.so, which was built with the host 4.5 compiler, which needs the 4.5 libstdc++!" The work around is to remove the LD_LIBRARY_PATH from build.mk (the gcc-%/benchmarks.stamp target) and run the C only tests. * Continued documentation of running benchmarks: https://wiki.linaro.org/AsaSandahl/Sandbox/RunningBenchmark. Tips of more efficient ways of doing things are always welcome. * Collected the results for SPEC2K runs with "train" and "ref" data sets. I did -o2 and -o3 runs on a panda with the two data sets. The results for -o2 and -o3 looks almost the same though. I will double check the "*build.txt" files from the benchmark runs, and if needed do a complementary run. Best regards Åsa

13 years, 7 months

1
0
0 0

Query regarding booting kernel image on Linaro Qemu

by Jubi Taneja

Dear All, I am using arch/arm/configs/vexpress_defconfig to configure and build Linux Kernel 3.1.1 http://launchpad.net/linux-linaro/3.1/3.1-2011.11/+download/linux-linaro-3.… and then if I booth the zImage crated on Linaro QEMU http://launchpad.net/qemu-linaro/trunk/2011.10/+download/qemu-linaro-0.15.5… ,it works properly. But if i enable the LPAE support in the config file, the kernel builds and when I boot the kernel image on QEMU, it just prints the output as : Uncompressing Linux... done, booting the kernel. And, then it hangs ... Can anyone please tell how to fix this issue? Looking forward to your reply. Thanks and Regards, Jubi

13 years, 7 months

3
3
0 0

[PATCH] gas: Allow for a more sensible number of macro arguments

by Dave Martin

I discovered some excessive memory usage in gas recently when defining macros. It turns out that this is a weird implementation feature rather than a bug. This patch has a possible fix for the issue, but I'd be interested in people's views before I go so far as cleaning it up and discussing it upstream. Cheers ---Dave Dave Martin (1): gas: Allow for a more sensible number of macro arguments gas/as.c | 17 +++++++++++++++++ gas/doc/as.texinfo | 9 ++++++++- gas/hash.c | 5 +++-- gas/hash.h | 1 + gas/macro.c | 22 +++++++++++++++++++++- gas/macro.h | 1 + 6 files changed, 51 insertions(+), 4 deletions(-) -- 1.7.4.1

13 years, 7 months

2
3
0 0

Re: Query regarding booting kernel image on Linaro Qemu

by Peter Maydell

[Jubi, I'm afraid this is the second copy of this you'll see, because you accidentally sent your reply to linaro-toolchain-request rather than to the actual mailing list, and so my first reply was misdirected. This reply is to the correct list address...] On 22 November 2011 13:28, Jubi Taneja <jubitaneja(a)gmail.com> wrote: > Thanks for your reply. Please find the response inline .. > On Tue, Nov 22, 2011 at 6:44 PM, Peter Maydell <peter.maydell(a)linaro.org> > wrote: >> On 22 November 2011 13:06, Jubi Taneja <jubitaneja(a)gmail.com> wrote: >> > But if i enable the LPAE support in the config file, the kernel builds >> > and >> > when I boot the kernel image on QEMU, it just prints the output as : >> > >> > Uncompressing Linux... done, booting the kernel. >> >> Does your kernel boot OK on real hardware? >> >> (ie, is a kernel with LPAE support expected to boot on a CPU like the >> A9 which doesn't have LPAE?) > > Yes, it is expected to boot ARM Cortex A15 CPU. The A9 and the A15 are different CPUs. QEMU currently supports only the A9. This is why I asked if this kernel boots OK on real Versatile Express A9 hardware. >> Also if your config/kernel command line don't turn on earlyprintk it's >> worth enabling this as it usually gets you better diagnostic messages >> for early kernel boot failures. > > Ok, I will try to check this. But, unfortunately now I again tried enabling > LPAE in config file and the current status is that when I boot the kernel > image on Qemu. it simply hangs. It now don't show that message of > Uncompressing kernel.. I am trying to debug it using gdb, but could not find > much. Please guide me how shall I proceed ahead. If you've turned on kernel support for the Versatile Express A15 rather than the Versatile Express A9 then this is expected behaviour: the VE-A15 has a different memory layout and in particular the serial ports are in a different place. So if you try to boot the kernel on a VE-A9 system (which is what QEMU is modelling) then it will display nothing because the kernel is trying to write to UARTs which aren't there. What are you actually trying to achieve here? -- PMM

13 years, 7 months

2
2
0 0

[ACTIVITY] 14th - 18th November

by Andrew Stubbs

Continued looking at constant reuse optimizations, as a background task. I've fiddled with the costs a bit more to remove false positives. Continued benchmarking different generic tuning ideas. With each test run taking most of a day this is slow going. Took Michael's rootfs that is used for all the toolchain testing and benchmarking, unpacked it, and repacked it so that it is compatible with "linaro-media-create", then tested that I could use it to run tests on LAVA successfully. I was hoping to use this for extra benchmarking bandwidth, but there's a permissions problem in the LAVA website software that means it's not yet possible to post private results to the system, so no proprietary benchmarks yet. I can still continue pipe-cleaning my process, and maybe run some benchmarks without actually reporting the results (or perhaps posting them somewhere write-only). Begun work on adding GCC support for 64-bit shifts with NEON. This is not quite as simple as it ought to be because a) it's inefficient to move a value to NEON registers just to do a shift, so it needs to detect where the value is, and b) right shifts are encoded as left shift by a negative amount, and negative shift amounts are normally considered undefined behaviour.

13 years, 7 months

3
4
0 0

[Activity] Week ending 18th November 2011

by Ramana Radhakrishnan

RAG : RED : None AMBER: Worried about trunk failures with test runs. Number of testsuite failures after the atomics merge has increased - more below. . Task Planned Estimated Actual Historical ~~~~~~~ Connect 2011.q4 preparation 28/10/2011 28/10/2011 28/10/2011 Linaro Tasks ~~~~~~~~~~~~ Fully Investigate the O3 performance regressions 31/01/2012 Writeup on the optimizations 31/12/2011 enabled with PGO ==Progress=== * Debugged the LTO failures for some time this week - not much progress. * The bootstrap failure with trunk turned out to be the same problem as with the CFG not being updated properly with some of the shrink-wrap patches from Alan M . This was fixed on trunk later on Monday. * Tested atomics fixes but the test results are overall looking ugly . Need to do some more debugging. Discovered GDB was broken for single stepping in ARM atomic sequences - Filed bug report https://bugs.launchpad.net/gdb-linaro/+bug/892008 here. * Looked into the vmul / vmla issue for a bit - did some experiments. Need to write these up and follow up . === Plans === * Commit the vcvt fixed point patch. * Finish experiments with the vmla stuff and find out more about this. * Finish debugging the LTO failures with PGO bootstrap. * Some research into the O3 perf issues. Absences. * Dec 19 - 31st Dec - Tentatively booked

13 years, 7 months

1
0
0 0

[ACTIVITY] Nov 14 - Nov 18

by Ulrich Weigand

== GDB == * Ongoing work on support for cross-platform core file generation. == GCC == * Investigated Launchpad bugs: #889984 binaries: should step across helper functions #889985 binaries: can't step out of helper functions #890764 4.6-11.11 seems to misdetect some files as system header and implicit extern "C" == Misc == * Gave talk on Linaro at the IBM Germany Technical Expert Council fall meeting. Mit freundlichen Gruessen / Best Regards Ulrich Weigand -- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

13 years, 7 months

1
0
0 0

[ACTIVITY] weekly status

by Revital Eres

Sent the patch which implements register pressure estimation in SMS to the gcc mailing list as RFC. I looked at some of the regressions in libav and intend to continue with that this week.

13 years, 7 months

1
0
0 0

[ACTIVITY] Weekly status

by Richard Sandiford

== Last week == * Caught up on lots of email. * Looked into the SMS regression that Revital found. Turned out to be caused by the ARM backend not modelling the VMLA fast accumulator path. Need to know how the path actually works before modelling it (the docs aren't clear). * Continued to look at -fsched-pressure. Richard

13 years, 7 months

1
0
0 0

[ACTIVITY] WW46

by Zhenqiang Chen

Summary: * Create crosstool-ng patches and build embedded toolchain on ubuntu-8.04.4. Details: * Crosstool-ng patches * Update old patches according to Michael's comments and revise them according the guideline. * Create patches for nls option, newlib and gdb. * Try to build embedded toolchain on ubuntu-8.04.4. It works after installing all the dependence libraries. Plans: * Upstream the patches. * Build and test. Thanks! -Zhenqiang

13 years, 7 months

1
0
0 0

Output miscompare in 175.vpr

by Michael Hope

Hi there. The 4.6-2011.10 release causes an output miscompare failure in 175.vpr from SPEC 2000. The fault didn't exist in 2011.09 and has cleared in 2011.11. Does this ring a bell with anyone? Andrew, could it be related to your widening multiplies fix? -- Michael

13 years, 7 months

2
2
0 0

[ACTIVITY] weekly status

by Ken Werner

Hi, * rewrote the Android.mk of libunwind to make use of autoreconf and libtool * finished my work on libunwind * upgraded my Linaro Android build environment * debugged Linaro Android build failures (#891753) * tested backtracing on the Linaro Android 2.3.5 and 2.3.7 branches * documented debuggerd usage on Android: https://wiki.linaro.org/Platform/Android/DebugAndroidSystemComponents * Debugged a linking failure of the android perflab benchmark that Andy is seeing. Turns out that the GCC is optimizing two consecutive calls to sinf and cosf (same angle) are optimized by the GCC to one sincosf call. The libm provided by the benchmark is lacking sincosf. Workaround is to use -fno-builtin-sinf -fno-builtin-cosf. Regards Ken

13 years, 7 months

2
1
0 0

[ACTIVITY] 2011-11-18

by David Gilbert

== 64 bit atomics == * Still fighting membase * Cleaned up a bunch of other issues, but I'm back at an 'expiry' issue, where the test stores some data with a fixed expiry time and then waits until after it should have expired, and checks it has. Except on ARM it sometimes doesn't expire quickly enough. I've got enough debug now to see that the server processes view of time (which it updates via an event about every second) is sometimes very behind gettimeofday()'s view of time - and have a small test for it. This doesn't seem to happen on x86. The good part is that it's now a much smaller test, the bad part is that it fails rarely - somewhere between 1/1000 and 1/100 depending on its mood. * Looked at a few other things to see if they might use 64 bit atomics: - spice's (as in the VNC like protocol) FAQ said it needed 64bit atomics and didn't work on 32bit machines due to that; but the source appears to have been fixed for 32bit. - Looked at boost lock-free; it does have an implementation using gcc's __sync primitives, however for ARM it uses a hand coded set of primitives, those are missing the 64 bit implementation, but the contributor of the ARM code said that the boost lock-free author preferred not to use the gcc primtives. == Other == * Testing latest libffi rc - Had most of my varargs for hf fix in (had missed one part of a test) * 1 day of non-linaro work I'm on holiday next week. Dave

13 years, 7 months

2
1
0 0

[ACTIVITY] report week 45

by Peter Maydell

(short week, 4 days) RAG: Red: Amber: we still haven't settled on engineering blueprints and schedule for the KVM work. Proceeding with some obviously necessary bits anyway Amber: not clear whether we can do virtio-mmio this quarter Green: Current Milestones: || || Planned || Estimate || Actual || ||upstream-omap3-cleanup || 2011-11-10 || 2011-11-10 || || (still no milestones, see above) Historical Milestones: ||add-omap3-networking || 2011-10-13 || 2011-10-13 || 2011-10-13 || ||a15-systemmode-planning || 2011-10-13 || 2011-10-13 || 2011-09-22 || ||a15-usermode-support || 2011-11-10 || 2011-11-10 || 2011-10-27 || == upstream-omap3-cleanup == * spent a half day on this to try to get this blueprint tidied away even if we aren't going to have much time for the later upstreaming == other == * A15/KVM planning * experimenting with getting an A15 kernel booting on Fast Model (see https://wiki.linaro.org/LoicMinier/Sandbox/FastModels and https://wiki.linaro.org/PeterMaydell/A15OnFastModels) * estimated required work for doing a qemu model of a board for one of the Landing Teams (ans: 6 man months +)

13 years, 7 months

1
0
0 0

[ACTIVITY] November 14-17

by Ira Rosen

Hi, - spent most of the week trying to reproduce regressions with vectorization - started bringing the latest SLP feature, condition with different types, to gcc-linaro. There are 5 patches. Merged one, started to prepare another one. - fixed PR 51112 Ira

13 years, 7 months

1
0
0 0

Agenda for tomorrow's call .

by Ramana Radhakrishnan

Hi, I've now put this at : https://wiki.linaro.org/WorkingGroups/ToolChain/Meetings/2011-11-15 Are there any other topics that folks want to bring up ? The one thing worth thinking about ahead of time is if we want to bring ahead the call by an hour to allow Michael to join at a not so crazy hour for him. What do folks think of 9 a.m. Tuesdays / Wednesdays UTC ? cheers Ramana

13 years, 7 months

3
6
0 0

How to fail an ARMv5T u-boot build when libgcc is ARMv7T2?

by Loïc Minier

Hey When building u-boot for an ARMv5T platform (versatileqemu_config), the Ubuntu-packaged Linaro cross-toolchain isn't suitable because it only offers an ARMv7T2 libgcc. But I'd like the build to fail when that happens rather than silently generating an u-boot.bin which will trigger a SIGILL when it hits the first non-ARMv5T instruction. I heard that gcc/ld are supposed to check this, but I'm not sure how it's supposed to work; perhaps the way u-boot does its final link prevents this from working properly? I tried building u-boot as follows: make O=obj-broken \ CROSS_COMPILE=arm-linux-gnueabi- ARCH=arm \ OPTFLAGS="-marm -march=armv5te" \ versatileqemu_config make O=obj-broken \ CROSS_COMPILE=arm-linux-gnueabi- ARCH=arm \ OPTFLAGS="-marm -march=armv5te" \ -j2 The final link looks like this: UNDEF_SYM=`arm-linux-gnueabi-objdump -x /home/lool/git/denx/u-boot/obj-v-broken/board/armltd/versatile/libversatile.o /home/lool/git/denx/u-boot/obj-v-broken/api/libapi.o /home/lool/git/denx/u-boot/obj-v-broken/arch/arm/cpu/arm926ejs/libarm926ejs.o /home/lool/git/denx/u-boot/obj-v-broken/arch/arm/cpu/arm926ejs/versatile/libversatile.o /home/lool/git/denx/u-boot/obj-v-broken/arch/arm/lib/libarm.o /home/lool/git/denx/u-boot/obj-v-broken/common/libcommon.o /home/lool/git/denx/u-boot/obj-v-broken/disk/libdisk.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/bios_emulator/libatibiosemu.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/block/libblock.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/dma/libdma.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/fpga/libfpga.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/gpio/libgpio.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/hwmon/libhwmon.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/i2c/libi2c.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/input/libinput.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/misc/libmisc.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/mmc/libmmc.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/mtd/libmtd.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/mtd/nand/libnand.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/mtd/onenand/libonenand.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/mtd/spi/libspi_flash.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/mtd/ubi/libubi.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/net/libnet.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/net/phy/libphy.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/pci/libpci.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/pcmcia/libpcmcia.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/power/libpower.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/rtc/librtc.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/serial/libserial.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/spi/libspi.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/twserial/libtws.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/usb/eth/libusb_eth.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/usb/gadget/libusb_gadget.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/usb/host/libusb_host.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/usb/musb/libusb_musb.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/usb/phy/libusb_phy.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/video/libvideo.o /home/lool/git/denx/u-boot/obj-v-broken/drivers/watchdog/libwatchdog.o /home/lool/git/denx/u-boot/obj-v-broken/fs/cramfs/libcramfs.o /home/lool/git/denx/u-boot/obj-v-broken/fs/ext2/libext2fs.o /home/lool/git/denx/u-boot/obj-v-broken/fs/fat/libfat.o /home/lool/git/denx/u-boot/obj-v-broken/fs/fdos/libfdos.o /home/lool/git/denx/u-boot/obj-v-broken/fs/jffs2/libjffs2.o /home/lool/git/denx/u-boot/obj-v-broken/fs/reiserfs/libreiserfs.o /home/lool/git/denx/u-boot/obj-v-broken/fs/ubifs/libubifs.o /home/lool/git/denx/u-boot/obj-v-broken/fs/yaffs2/libyaffs2.o /home/lool/git/denx/u-boot/obj-v-broken/lib/libfdt/libfdt.o /home/lool/git/denx/u-boot/obj-v-broken/lib/libgeneric.o /home/lool/git/denx/u-boot/obj-v-broken/lib/lzma/liblzma.o /home/lool/git/denx/u-boot/obj-v-broken/lib/lzo/liblzo.o /home/lool/git/denx/u-boot/obj-v-broken/lib/zlib/libz.o /home/lool/git/denx/u-boot/obj-v-broken/net/libnet.o /home/lool/git/denx/u-boot/obj-v-broken/post/libpost.o | sed -n -e 's/.*$__u_boot_cmd_.*$/-u\1/p'|sort|uniq`; cd /home/lool/git/denx/u-boot/obj-v-broken && arm-linux-gnueabi-ld -pie -T /home/lool/git/denx/u-boot/obj-v-broken/u-boot.lds -Bstatic -Ttext 0x10000 $UNDEF_SYM arch/arm/cpu/arm926ejs/start.o --start-group api/libapi.o arch/arm/cpu/arm926ejs/libarm926ejs.o arch/arm/cpu/arm926ejs/versatile/libversatile.o arch/arm/lib/libarm.o common/libcommon.o disk/libdisk.o drivers/bios_emulator/libatibiosemu.o drivers/block/libblock.o drivers/dma/libdma.o drivers/fpga/libfpga.o drivers/gpio/libgpio.o drivers/hwmon/libhwmon.o drivers/i2c/libi2c.o drivers/input/libinput.o drivers/misc/libmisc.o drivers/mmc/libmmc.o drivers/mtd/libmtd.o drivers/mtd/nand/libnand.o drivers/mtd/onenand/libonenand.o drivers/mtd/spi/libspi_flash.o drivers/mtd/ubi/libubi.o drivers/net/libnet.o drivers/net/phy/libphy.o drivers/pci/libpci.o drivers/pcmcia/libpcmcia.o drivers/power/libpower.o drivers/rtc/librtc.o drivers/serial/libserial.o drivers/spi/libspi.o drivers/twserial/libtws.o drivers/usb/eth/libusb_eth.o drivers/usb/gadget/libusb_gadget.o drivers/usb/host/libusb_host.o drivers/usb/musb/libusb_musb.o drivers/usb/phy/libusb_phy.o drivers/video/libvideo.o drivers/watchdog/libwatchdog.o fs/cramfs/libcramfs.o fs/ext2/libext2fs.o fs/fat/libfat.o fs/fdos/libfdos.o fs/jffs2/libjffs2.o fs/reiserfs/libreiserfs.o fs/ubifs/libubifs.o fs/yaffs2/libyaffs2.o lib/libfdt/libfdt.o lib/libgeneric.o lib/lzma/liblzma.o lib/lzo/liblzo.o lib/zlib/libz.o net/libnet.o post/libpost.o board/armltd/versatile/libversatile.o --end-group /home/lool/git/denx/u-boot/obj-v-broken/arch/arm/lib/eabi_compat.o -L /usr/lib/gcc/arm-linux-gnueabi/4.6.1 -lgcc -Map u-boot.map -o u-boot I verified that all the .o files passed above have Tag_CPU_name: "5TE" in their arm-linux-gnueabi-readelf -A output; the only problematic file is -lgcc. Note that the final link is done with arm-linux-gnueabi-ld and doesn't set any architecture; I changed it manually to use gcc and pass the -marm -march=armv5te, and had to set -nostdlib too when using gcc: arm-linux-gnueabi-gcc -marm -march=armv5te -nostdlib -pie -T /home/lool/git/denx/u-boot/obj-v-broken/u-boot.lds -Bstatic -Ttext 0x10000 $UNDEF_SYM arch/arm/cpu/arm926ejs/start.o -Wl,--start-group api/libapi.o arch/arm/cpu/arm926ejs/libarm926ejs.o arch/arm/cpu/arm926ejs/versatile/libversatile.o arch/arm/lib/libarm.o common/libcommon.o disk/libdisk.o drivers/bios_emulator/libatibiosemu.o drivers/block/libblock.o drivers/dma/libdma.o drivers/fpga/libfpga.o drivers/gpio/libgpio.o drivers/hwmon/libhwmon.o drivers/i2c/libi2c.o drivers/input/libinput.o drivers/misc/libmisc.o drivers/mmc/libmmc.o drivers/mtd/libmtd.o drivers/mtd/nand/libnand.o drivers/mtd/onenand/libonenand.o drivers/mtd/spi/libspi_flash.o drivers/mtd/ubi/libubi.o drivers/net/libnet.o drivers/net/phy/libphy.o drivers/pci/libpci.o drivers/pcmcia/libpcmcia.o drivers/power/libpower.o drivers/rtc/librtc.o drivers/serial/libserial.o drivers/spi/libspi.o drivers/twserial/libtws.o drivers/usb/eth/libusb_eth.o drivers/usb/gadget/libusb_gadget.o drivers/usb/host/libusb_host.o drivers/usb/musb/libusb_musb.o drivers/usb/phy/libusb_phy.o drivers/video/libvideo.o drivers/watchdog/libwatchdog.o fs/cramfs/libcramfs.o fs/ext2/libext2fs.o fs/fat/libfat.o fs/fdos/libfdos.o fs/jffs2/libjffs2.o fs/reiserfs/libreiserfs.o fs/ubifs/libubifs.o fs/yaffs2/libyaffs2.o lib/libfdt/libfdt.o lib/libgeneric.o lib/lzma/liblzma.o lib/lzo/liblzo.o lib/zlib/libz.o net/libnet.o post/libpost.o board/armltd/versatile/libversatile.o -Wl,--end-group /home/lool/git/denx/u-boot/obj-v-broken/arch/arm/lib/eabi_compat.o -L /usr/lib/gcc/arm-linux-gnueabi/4.6.1 -lgcc -Wl,-Map u-boot.map -o u-boot But this command works and produces an u-boot ELF which has Tag_CPU_name: "7-A". How would I break the build when libgcc isn't ARMv5T? Thanks, -- Loïc Minier

13 years, 7 months

5
9
0 0

trunk bootstrap failure on arm-linux-gnueabi

by Ramana Radhakrishnan

PR51068 appears to be a dup of http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51051 . I've kicked off a build on habitat.canonical.com with this patch applied to be sure that the build and test continues http://gcc.gnu.org/ml/gcc-patches/2011-11/txt00199.txt Ramana

13 years, 7 months

1
0
0 0

[ACTIVITY] 4th -11th November

by Andrew Stubbs

Spun Linaro GCC 4.6 release tarball, uploaded it to Michael's server, and launched the testing. Continued work on constant reuse optimization. I've now eliminated some more false positives caused by inconsistent rtx_cost results. It turns out the pass also fixes up inefficient constants generated by arm_split_constants, which is nice. Set yet more spec benchmark runs going as part of the generic tuning investigation. Other: Half day Monday to recover from the weekend's travel. Half day on internal Mentor activities.

13 years, 7 months

1
0
0 0

[ACTIVITY] Nov 8 - Nov 11

by Ulrich Weigand

== GDB == * Worked on support for cross-platform core file generation. After some discussion on the mailing list it seems we've come to an agreement that the remote protocol ought to have two separate packets related to memory layout, one that describes the permanent, system-wide layout (for embedded systems) and one that describes the dynamic, per-process layout (for processes with memory-mapped files). The latter also ought to be integrated with the "info proc mappings" command, which should work with gdbserver too. I've been working on updating the patches accordingly. == GCC == * Patch review week. Mit freundlichen Gruessen / Best Regards Ulrich Weigand -- Dr. Ulrich Weigand | Phone: +49-7031/16-3727 STSM, GNU compiler and toolchain for Linux on System z and Cell/B.E. IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht Stuttgart, HRB 243294

13 years, 7 months

1
0
0 0

[ACTIVITY] weekly status

by Revital Eres

Testing the SMS register pressure estimation on libav micro benchmarks and eembc. Discussed with Ayal the implementation. He had some ideas to consider regarding the it. Looking into the regressions of SMSed kernels in libav which are not related to register pressure: Consulting with Ayal regarding the case in dsputil-ssd_int8_vs_int16_c where we have severe regression with SMS; it seemed that the regression was due to dependence between accumulations that can be avoided, more specifically we had the following case in vector code: vec1 = vec1 + ... ... vec1 = vec1+ ... ... vec1 = vec1+ ... ... vec1 = vec1+... to resolve this, I implemented a hack similar to MVE optimiation in the loop-unroller as follows: vec1 = vec1 + ... ... vec2 = vec2+ ... ... vec3 = vec3+ ... ... vec4 = vec4+... This gives ~4.5% improvements to the non-SMSed version. The SMS version now shows no regression as the problematic loop which caused the regression now failed to be SMSed and I'm looking into the reason. Another regression showed in idct-internal-8 is apparently related to the do-loop optimziation (SMS actually failed to be applied in this loop). when applying the patch to expand SMS to recognise doloop then the regression is resolved. (http://gcc.gnu.org/ml/gcc-patches/2011-09/msg02051.html; patch is not in mainline yet)

13 years, 7 months

1
0
0 0

WW45

by Zhenqiang Chen

Summary: * Add expat and ncurses support for gdb-cross. * Rebase and create patches for crosstool-ng upstream. * Compare configurations difference between crosstool-ng and embedded toolchain. Details: * Add expat and ncurses support for gdb cross. At this time, all the packages can be built for both Linux and Mingw32 host with baremetal target. * Rebase and create patches for crosstool-ng upstream. Patches are sent to Michael for review. * Compare configurations difference between crosstool-ng and embedded toolchain. To align the configuration, crosstool-ng * need (not necessary) --disable-nls for companion libs. * need make document. * need add --enable-newlib-register-fini config and enhance CFLAGS_FOR_TARGET for newlib. * need --disable-sim for gdb. * need multilib support (Can workaround for current implementation. Need improvement when it is fully supported). Thanks! -Zhenqiang

13 years, 7 months

1
0
0 0

[ACTIVITY] 2011-11-11

by David Gilbert

== 64 bit atomics == * Nailed one more of the membase tests; again this was a test harness race condition (which I've reported here: http://code.google.com/p/moxi/issues/detail?id=2&thanks=2&ts=1321037460 ) In this case there were two calls to write) performed on the server, yet the test client performed a single read and compared the result to what it was expecting; and got lucky on x86 and about half the time on ARM in that the server data managed to all get read by the 1st read. I think this leaves one more case - that I've seen rarely. == Qemu == * Tested Peter's 11.11 pre release; ran into a couple of issues (vexpress without sound causing hangs, and the Linaro 11.10 Beagle and Overo images not running X). Also filed a couple of bugs in l-i-f-ui that I tripped over while testing it. == String routines == * The new newlib A15 optimised memcpy is slower on an A9 than my routines; posted to newlib list asking what the normal way of dealing with a bunch of different routines is. Would it make sense to get gcc to define a GCC_ARM_TUNE_CORTEX_A-whatever ? == Other == * Watched the Youtube video of the Kernel/Toolchain discussion - for those who didn't attend, I'd encourage a check of the Youtube videos, they're pretty nicely done. * Got pulled away on non-Linaro work for about half the week. Dave

13 years, 7 months

1
0
0 0

[ACTIVITY] weekly status

by Ken Werner

Hi, Android: * managed to remotely debug a system process (like debuggerd) using gdbserver libunwind: * found an error when unwinding via DWARF debug frames when configured for REMOTE_ONLY * discussions on the me revealed that libunwind-ptrace should not be compiled for REMOTE_ONLY case at all (it was intended for host!=target) * this means that our current build approach on Android needs to be changed in the future Misc: * internal meetings Regards Ken

13 years, 7 months

1
0
0 0

[ACTIVITY] report week 45

by Peter Maydell

RAG: Red: Amber: upstream-omap3-cleanup stalled, not clear whether we're going to have any time for it this quarter Green: Current Milestones: || || Planned || Estimate || Actual || ||upstream-omap3-cleanup || 2011-11-10 || 2011-11-10 || || (Future milestones to be added once post-Connect planning is completed.) Historical Milestones: ||add-omap3-networking || 2011-10-13 || 2011-10-13 || 2011-10-13 || ||a15-systemmode-planning || 2011-10-13 || 2011-10-13 || 2011-09-22 || ||a15-usermode-support || 2011-11-10 || 2011-11-10 || 2011-10-27 || == linaro-qemu-11.11 == * 2011.11 release: tarball built, tested and released == other == * nailing down A15/KVM work we're going to do this quarter * usual upstream code review/etc * sent patches upstream to fix some easy bugs somebody found running Coverity on QEMU's source code

13 years, 7 months

1
0
0 0

[ACTIVITY] w45

by Asa Sandahl

Hi! * Ran EEMBC and SPEC on the ursa4. Sorted out a bunch of basic questions related to permissions ans such with Michael. Familiarized myself with the scripts for parsing benchmark results. * Created wiki for running benchmarks in cbuild. It is in my sandbox right now: https://wiki.linaro.org/AsaSandahl/Sandbox/RunningBenchmark * Started off SPEC runs for comparing the "train" and "ref" data sets. We want to know if the changes between variants are the same for the two sets. Best regards Åsa

13 years, 7 months

1
0
0 0

Linaro QEMU 2011.11 released

by Peter Maydell

The Linaro Toolchain Working Group is pleased to announce the release of Linaro QEMU 2011.11. Linaro QEMU 2011.11 is the latest monthly release of qemu-linaro. Based off upstream (trunk) QEMU, it includes a number of ARM-focused bug fixes and enhancements. New in this month's release: - The ARM vexpress-a9, versatilepb, versatileab and realview-* boards now have audio support (thanks to Mathieu Sonet who contributed a PL041 implementation upstream) - Support for multiple instances of the "-sd" option on the command line has been dropped; this was never present in upstream QEMU and has been removed for consistency. Use "-drive,if=sd,index=N,file=file.img" for N=0,1,2... instead - Fixes #886980: 8 and 16 bit reads from the OMAP GPIO module would crash due to an infinite recursion - Fixes #823902: problems running multithreaded programs in linux-user mode Known issues: - Graphics do not work for OMAP3 based models (beagle, overo) with 11.10 Linaro images. NB: if you run QEMU on a host system without properly configured audio you might find that QEMU now hangs at some point; you can fix this by fixing your host system, or work around it by setting the environment variable QEMU_AUDIO_DRV=none. If you build from source you may now want to pass configure a suitable --audio-drv-list=LIST option. The source tarball is available at: https://launchpad.net/qemu-linaro/+milestone/2011.11 Binary builds of this qemu-linaro release are being prepared and will be available shortly for users of Ubuntu. Packages will be in the linaro-maintainers tools ppa: https://launchpad.net/~linaro-maintainers/+archive/tools/ More information on Linaro QEMU is available at: https://launchpad.net/qemu-linaro

13 years, 7 months

1
1
0 0

[ACTIVITY] November 6-10

by Ira Rosen

Hi, - SLP improvements for weight-h264-pixels16x16-8 (libav): - conditions in SLP - committed upstream - support pattern detection in SLP - implemented - enhance mixed condition pattern to handle non-constant then/else clauses - implemented weight-h264-pixels16x16-8 now gets vectorized with 2.6x speedup. - Vectorizer maintenance (bug fixes, patch reviews). I'll be on vacation on Sunday. Ira

13 years, 7 months

1
0
0 0

FW: contribution Linux Kernel Debugger

by Marc TITINGER

Hi there, As discussed with Loïc, please find attached my slides to the ELCE presentation related to our implementation of linux-awareness for JTAG debugging. I still need a formal clearance of my organization on my contribution patch, but I (and my managers) will be happy to see some or all of this work benefit to-and-from the community. If you are interested, I will try to upload a self-contained qemu-based demo to a public ftp. Cheers, Marc Titinger. > -----Original Message----- > From: Loïc Minier [mailto:lool@dooz.org] > Sent: Wednesday, November 02, 2011 3:19 PM > To: Marc TITINGER > Cc: Michael Hope; Ulrich Weigand > Subject: Re: contribution Linux Kernel Debugger > > On Wed, Nov 02, 2011, Marc TITINGER wrote: > > J'ai été content de vous rencontrer toi et Nicolas à la suite de ma > > présentation pour discuter de l'opportunité de la contribution de > > notre debugger linux. J'ai une question: dans la mesure ou STMicro ne > > fait pas partie des membres de linaro, aurais-je un accès restreint > > aux outils et discussions si je souhaite contribuer? Quels serons les > > blueprints correspondant à ce projet ? > > Most things Linaro does are public; concerning the toolchain, > benchmarks are kept private due to licensing constraints. Even if > STMicro is not a member, you're welcome to present your ideas and code > to us (we don't mind if STMicro joins as a member though ;-). > > We covered topics similar to your ELC-Europe talk this week: > https://blueprints.launchpad.net/linaro-toolchain-misc/+spec/linaro- > toolchain-kernel-debugging > > Ulrich Weigand and Michael Hope will continue discussions around where > we will go in terms of helping kernel debugging next cycle, it might > be > that we end up working on similar areas than the ones you and I > discussed (special handling for linux in GDB -- tasks, backtracing > across kernelspace/userspace; OpenOCD fixes...). > > Your slides don't seem to be at > https://events.linuxfoundation.org/events/embedded-linux-conference- > europe/titinger > yet, so perhaps you could share a link with Michael and Ulrich? or > post on the linaro-toolchain@ mailing-list > > We'll be a bit busy this week, but if you want to discuss your > patches, > upstreaming, further developments, I would think Michael can arrange > for you to join a Toolchain WG call in the next weeks -- Michael, I'll > let you comment once you get to see the slides :-) > > -- > Loïc Minier

13 years, 7 months

3
2
0 0

[ACTIVITY] WW43

by Zhenqiang Chen

Summary: * Add zlib and libiconv support in crosstool-ng and repack embedded toolchain source package. Details: * Read crosstool-ng scripts, configs and document to learn on how it works. * Try mkedwards's extensions for crosstool-ng at https://github.com/mkedwards/crosstool-ng. It does have lots of extensions, the GDB-cross can build. But zlib and libiconv do not meet our requirement. * Add config, patch and build scripts for zlib and update the binutils build scripts to use the prebuilt zlib. * Add config and build scripts for libiconv and update the build scripts of gcc and gdb. * Write scripts to patch and repack embedded toolchain source packages to the standard format. Plans: * Linaro connect: Oct. 31 - Nov. 4. * Integrate the repack scripts with crosstool-ng. Thanks! -Zhenqiang

13 years, 7 months

2
1
0 0

[ACTIVITY] report week 44

by Peter Maydell

RAG: Red: Amber: Green: Current Milestones: || || Planned || Estimate || Actual || ||upstream-omap3-cleanup || 2011-11-10 || 2011-11-10 || || Historical Milestones: ||add-omap3-networking || 2011-10-13 || 2011-10-13 || 2011-10-13 || ||a15-systemmode-planning || 2011-10-13 || 2011-10-13 || 2011-09-22 || ||a15-usermode-support || 2011-11-10 || 2011-11-10 || 2011-10-27 || == other == * Linaro Connect week. Included an extremely useful double-length session about KVM on A15, which should turn into blueprints/plans in due course * Found out a bit more about UEFI -- I'm leaning towards having QEMU for vexpress run UEFI by default as a way of letting you just pass it a disk image rather than having to feed it a separatekernel/initrd. (Will look into this more when the ARM landing team have it all building and working on hardware.) * I have a working prototype of the QEMU virtio-mmio transport (written to Pawel's spec). However to get this upstream we will first need to properly refactor the qemu virtio code so the link between the transport and the blk/net/etc backends is a qdev bus. -- PMM

13 years, 7 months

1
0
0 0

[ACTIVITY] weekly status

by Revital Eres

Continue working on the regsiter pressure estimation implementation - testing the implementation on libav micro benchmarks. With the patch some SMSed kernels in put-h264-qpel8-hv-lowpass-8, swscale-rgb24ToY_c mjpegenc benchmarks are identified as having register pressure. I'm looking at the kernels which still have regressions with SMS and it seems the reason is not related to register pressure.

13 years, 7 months

1
0
0 0

Video - Toolchain support for kernel debugging

by Michael Opdenacker

Hi Ulrich, Here's a video of your session about kernel debugging: http://www.youtube.com/watch?v=_lIz0W3l2E8 Thanks, Michael. -- Michael Opdenacker - Community Manager Linaro, http://linaro.org Phone: +33 484 253 396 IRC: 'opm' in #linaro on irc.freenode.net

13 years, 7 months

1
0
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

linaro-toolchain