On Wed, May 25, 2011 at 10:21:43AM -0400, Nicolas Pitre wrote:
On Wed, 25 May 2011, Dave Martin wrote:
On Tue, May 24, 2011 at 11:45:04PM -0400, Nicolas Pitre wrote:
    typedef int (__kernel_cmpxchg64_t)(const int64_t *oldval,
                                       const int64_t *newval,
                                       volatile int64_t *ptr);
    #define __kernel_cmpxchg64 (*(__kernel_cmpxchg64_t *)0xffff0f60)

    Atomically store newval in *ptr if *ptr is equal to oldval for user space.
    Return zero if *ptr was changed or non-zero if no exchange happened.
    The C flag is also set if *ptr was changed to allow for assembly
    optimization in the calling code.

    Do not attempt to call this function unless __kernel_helper_version >= 5.
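For reference, the semantics described above can be sketched in portable C. This is only an illustrative model with an invented name: the real helper lives in the kernel vectors page at 0xffff0f60, is atomic, and additionally sets the C flag on success, none of which this sketch reproduces.

```c
#include <stdint.h>

/* Illustrative model (not the real helper): store *newval into *ptr
 * iff *ptr == *oldval.  Return 0 if *ptr was changed, non-zero if
 * no exchange happened.  The real __kernel_cmpxchg64 does this
 * atomically and also sets the C flag on success. */
static int model_cmpxchg64(const int64_t *oldval,
                           const int64_t *newval,
                           volatile int64_t *ptr)
{
    if (*ptr != *oldval)
        return 1;          /* no exchange happened */
    *ptr = *newval;
    return 0;              /* *ptr was changed */
}
```

A real caller on ARM would instead go through the function-pointer cast shown above, after checking that __kernel_helper_version >= 5.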
Yep. I will queue your other patch prior to this one and make this blurb consistent with the rest.
Oh, OK. I hadn't had any feedback on that yet, so I wasn't sure whether anyone was picking it up. I will repost to lakml, but I was waiting for the merge window to close, since this was just a tidy-up rather than something urgent.
This warning is more important for the new helper than for the existing ones (where really we could omit it without much risk).
+kuser_cmpxchg64_fixup:
+	@ Called from kuser_cmpxchg_fixup.
+	@ r2 = address of interrupted insn (must be preserved).
+	@ sp = saved regs. r7 and r8 are clobbered.
+	@ 1b = first critical insn, 2b = last critical insn.
+	@ If r2 >= 1b and r2 <= 2b then saved pc_usr is set to 1b.
+	mov	r7, #0xffff0fff
+	sub	r7, r7, #(0xffff0fff - (0xffff0f60 + (1b - __kuser_cmpxchg64)))
+	subs	r8, r2, r7
+	rsbcss	r8, r8, #(2b - 1b)
+	strcs	r7, [sp, #S_PC]
+#if __LINUX_ARM_ARCH__ < 6
+	b	kuser_cmpxchg32_fixup
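The range check done by the subs/rsbcss/strcs sequence can be modelled in C, which may make the carry-flag trick easier to follow. The names here are invented for illustration, and plain integers stand in for the 1b/2b label arithmetic:

```c
#include <stdint.h>

/* Illustrative model of the fixup's range check (invented names):
 * if the interrupted PC (r2) lies within [start, end], i.e. between
 * the first (1b) and last (2b) critical instructions, rewind the
 * saved user PC to start so the critical section restarts cleanly.
 *
 *   subs   r8, r2, r7        ; r8 = pc - start, C set if pc >= start
 *   rsbcss r8, r8, #(2b-1b)  ; C stays set only if pc - start <= end - start
 *   strcs  r7, [sp, #S_PC]   ; store start into saved PC if both held
 */
static uint32_t fixup_pc(uint32_t pc, uint32_t start, uint32_t end)
{
    /* One unsigned compare covers both bounds: if pc < start the
     * subtraction wraps to a huge value, failing the test. */
    if (pc - start <= end - start)
        return start;   /* saved pc_usr rewound to 1b */
    return pc;          /* outside the critical region: left untouched */
}
```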
Can we just have movcs pc,lr here, and put kuser_cmpxchg32_fixup immediately after?
This would allow us to skip the branch, and the initial "mov r7" in the kuser_cmpxchg32_fixup code.
The 'mov r7' is still needed unless the second instruction uses another target register.
What I meant is that at the end of the above sequence r7 = 1b, so we can just fall through into the 32-bit fixup code and add the appropriate small offset to r7 so that it points at the corresponding 1b label for __kuser_cmpxchg32, without needing to re-derive the address using mov+sub.
It's a pretty minor tweak though.
I thought about the possibility of "movcs pc, lr", especially since the .text segments are simply concatenated, so the branch is effectively branching to the very next instruction. There could be a common preamble, then a list of concatenated fixup segments (only two of them for now), and finally a postamble which would simply be "mov pc, lr", all made contiguous at link time. However, I'm not sure yet whether this is worth optimizing that far, since this code is far from being hot, and clarity would also be affected.
Also, probably the number of fixups is never going to grow very large.
There's a fair amount of duplicated logic from the 32-bit case. Is it worth trying to merge the two?
The core logic spans 5 instructions. Only 3 of them are actually the same in both cases, i.e. those with no references to 1b or 2b, and they're perfectly interleaved between the others. Trying to turn this into common runtime code would result in even bigger code, I'm afraid. It could be made into common source code via a macro, though.
Fair enough -- a macro might be worth a try _if_ it simplifies things in the source, but I think you're right that merging the actual code probably isn't worth it just to save a few words in the vectors page (which eats up 4K regardless of what we put in it) and a few cycles per fault (which already costs many, many cycles).
Cheers
---Dave