== This week ==
* More on -fsched-pressure. Testing on POWER7 showed a degenerate case that I'd failed to handle well. Fixed that. Saw that part of the problem on POWER7 was that IRA was using a combination of GENERAL_REGS and CR_REGS as a single pressure class, so there appeared to be 39 registers available for storing integers. Fixed (or worked around) that.
Tweaked a few other things too. The only denbench result that I wasn't happy with was RSA, where both forms of -fsched-pressure were significantly worse than -fno-sched-pressure. Tracked down the cause of that. We had a block BB1:
    A: (set (reg:DI X) Y)
    B: (clobber (reg:DI Z))
    C: (set (subreg:SI (reg:DI Z) 0) (... X ...))
    D: (set (subreg:SI (reg:DI Z) 4) (...))
where B makes sure that Z is treated as dead before C. Interblock motion causes B to be scheduled in an earlier block, but none of the other instructions can be. This means that, when we schedule BB1, it still contains A, C and D, and Z now appears to be live on entry to the block. C therefore appears to reduce register pressure, because it contains the last use of X, and appears to leave Z's liveness unaffected. In reality it should be treated as increasing register pressure by 1 (-1 for the death of X, +2 for the birth of Z).
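The accounting for C can be illustrated with a small toy model. This is only a sketch, not GCC's pressure code: the function name `pressure_delta`, the `REG_COST` table and the set-based liveness are all made up for illustration, and the register costs (1 for X, 2 for Z) are simply taken from the arithmetic above.

```python
# Toy model only: not GCC's data structures or its real pressure code.
# Register costs follow the arithmetic in the text: X frees 1, Z takes 2.
REG_COST = {"X": 1, "Z": 2}


def pressure_delta(live_before, uses_dying, defs):
    """Net change in register pressure across one instruction.

    live_before: set of pseudo names live just before the instruction
    uses_dying:  pseudos whose last use is in this instruction
    defs:        pseudos written by this instruction
    """
    delta = 0
    for reg in uses_dying:
        delta -= REG_COST[reg]
    for reg in defs:
        # A definition only adds pressure if the register was not already live.
        if reg not in live_before:
            delta += REG_COST[reg]
    return delta


# B has been moved to an earlier block, so Z looks live on entry to BB1.
apparent = pressure_delta({"X", "Z"}, uses_dying=["X"], defs=["Z"])

# With B's clobber still in front of C, Z is dead just before C.
actual = pressure_delta({"X"}, uses_dying=["X"], defs=["Z"])

print(apparent, actual)
```

In this model the scheduler sees C as a pressure-reducing instruction once B has left the block, even though C really starts Z's lifetime.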
I "fixed" this by moving C's dependencies to B, a bit like we do for scheduling groups (although none of the other handling of scheduling groups should apply). This made a big difference, so that the new code is a win on RSA.
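One possible reading of "moving C's dependencies to B" is sketched below on a toy dependency graph. The dictionary representation and the helper `move_deps` are hypothetical and are not GCC's sched-deps interface; the initial edges (C after A and B, D after B) are an assumption based on the block above.

```python
# Toy dependency graph, not GCC's sched-deps representation.
# deps[i] is the set of instructions that must be scheduled before i.
def move_deps(deps, src, dst):
    """Return a copy of deps in which dst inherits src's predecessors
    and src depends only on dst."""
    new = {insn: set(preds) for insn, preds in deps.items()}
    inherited = new[src] - {dst}
    new[dst] |= inherited
    new[src] = {dst}
    return new


# Assumed edges for BB1: C uses X (set by A) and writes Z (clobbered by B);
# D also writes part of Z.
before = {"A": set(), "B": set(), "C": {"A", "B"}, "D": {"B"}}
after = move_deps(before, src="C", dst="B")

for insn in sorted(after):
    print(insn, sorted(after[insn]))
```

Under this reading B now has to follow A, so B cannot be moved into an earlier block unless A can be moved too, and the clobber stays next to C.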
There's still one SPEC2006 degradation on POWER7 that I want to look at.
* Caught up on a lot of mail. gcc-patches backlog has gone down from ~4900 when I got back to ~500.
* Briefly looked at x86's DRAP (dynamic realign argument pointer) support, to see what would be needed for ARM. Didn't look for long though: the overhead seems excessive for optional alignment, and the consensus seemed to be that 128-bit alignment wouldn't really make much of a difference anyway.
Richard
linaro-toolchain@lists.linaro.org