Hi,
I've done some more testing of big endian images from rmk/for-next, which has Ben's big endian series merged. I ran into three issues while testing on Pandaboard and Arndale. All three issues have a fairly simple fix once detected.
Here is a short explanation/note for each fix:
1) ARM: __fixup_smp read of SCU config should do byteswap in BE case

[1] introduced an SCU read in the __fixup_smp function in the A9 CPU case. Such a read needs a byteswap in the BE case.

2) ARM: mm: fix __phys_to_virt to work with 64 bit phys_addr_t in BE case

[2] changed the type of the __phys_to_virt argument from 'unsigned long' to 'phys_addr_t', but that causes a problem with the inline assembler in the __phys_to_virt function, which expects an 'r' operand but gets a 64 bit value. It is very similar to the ASID issue [3]. Small test cases that illustrate the inline asm and 64 bit operand issue can be found in [4].

3) ARM: fix mov to mvn conversion in case of 64 bit phys_addr_t and BE

The conflict resolution between [5] and [6] was not entirely correct for the case when a 'mov' instruction has to be converted into an 'mvn' instruction. I missed it in my previous testing because the issue manifests itself only if CONFIG_ARCH_PHYS_ADDR_T_64BIT is enabled, which it was not, and I saw this issue only when I got to Arndale testing. In the proposed patch I've fixed this issue trying to keep the spirit of [6] and do it in the most optimized way. Personally, however, I think the code could be much more readable if we just byteswapped the instruction under ARM_BE8 after the read and before the write, with the common patching logic in between. After all, that is exactly what the THUMB2 code a few lines above does. If folks like that idea better, please let me know and I can respin this fix.
Tests: boots and runs
TC2: LE/BE Thumb2/Non-Thumb2 (no LPAE, no ARCH_PHYS_ADDR_T_64BIT)
Pandaboard: LE/BE multiarch (non-thumb2)
Arndale: LE/BE Thumb2/Non-Thumb2 (with LPAE, with ARCH_PHYS_ADDR_T_64BIT, no KVM, no MMC_DW_IDMAC)
For testing Arndale and Pandaboard, BE BSP changes were used on top of these patches.
References:
[1] bc41b8724f24b9a27d1dcc6c974b8f686b38d554 ARM: 7846/1: Update SMP_ON_UP code to detect A9MPCore with 1 CPU devices
[2] ca5a45c06cd4764fb8510740f7fc550d9a0208d4 ARM: mm: use phys_addr_t appropriately in p2v and v2p conversions
[3] a1af3474487cc3b8731b990dceac6b6aad7f3ed8 ARM: tlb: ASID macro should give 32bit result for BE correct operation
[4] http://lists.infradead.org/pipermail/linux-arm-kernel/2013-October/202584.ht...
[5] f52bb722547f43caeaecbcc62db9f3c3b80ead9b ARM: mm: Correct virt_to_phys patching for 64 bit physical addresses
[6] 2f9bf9beddb1649485b47302a5aba9761cbc9084 ARM: fixup_pv_table bug when CPU_ENDIAN_BE8
Victor Kamensky (3):
  ARM: __fixup_smp read of SCU config should do byteswap in BE case
  ARM: mm: fix __phys_to_virt to work with 64 bit phys_addr_t in BE case
  ARM: fix mov to mvn conversion in case of 64 bit phys_addr_t and BE

 arch/arm/include/asm/memory.h | 8 +++++++-
 arch/arm/kernel/head.S        | 7 ++++++-
 2 files changed, 13 insertions(+), 2 deletions(-)
Commit "bc41b8724f24b9a27d1dcc6c974b8f686b38d554 ARM: 7846/1: Update SMP_ON_UP code to detect A9MPCore with 1 CPU devices" added a read of the SCU config register to the __fixup_smp function. Such a read should be followed by a byteswap if the kernel runs in BE mode.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
---
 arch/arm/kernel/head.S | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 7801866..cd788d5 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -508,6 +508,7 @@ __fixup_smp:
 	teq	r0, #0x0		@ '0' on actual UP A9 hardware
 	beq	__fixup_smp_on_up	@ So its an A9 UP
 	ldr	r0, [r0, #4]		@ read SCU Config
+ARM_BE8(rev	r0, r0)			@ byteswap if big endian
 	and	r0, r0, #0x3		@ number of CPUs
 	teq	r0, #0x0		@ is 1?
 	movne	pc, lr
On Monday 04 November 2013 09:16 PM, Victor Kamensky wrote:
Commit "bc41b8724f24b9a27d1dcc6c974b8f686b38d554 ARM: 7846/1: Update SMP_ON_UP code to detect A9MPCore with 1 CPU devices" added a read of the SCU config register to the __fixup_smp function. Such a read should be followed by a byteswap if the kernel runs in BE mode.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>

Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Make sure that the inline assembler that expects an 'r' operand receives a 32 bit value.

Before this fix, in the case of CONFIG_ARCH_PHYS_ADDR_T_64BIT and CONFIG_ARM_PATCH_PHYS_VIRT, the __phys_to_virt function passed a 64 bit value to the __pv_stub inline assembler, where an 'r' operand is expected. Compiler behavior in such a case is not well specified. It worked in the little endian case, but in the big endian case incorrect code was generated, where the compiler got confused about which part of the 64 bit value it needed to modify. For example, the BE snippet looked like this:

N:0x80904E08 : MOV r2,#0
N:0x80904E0C : SUB r2,r2,#0x81000000

while the similar LE code looked like this:

N:0x808FCE2C : MOV r2,r0
N:0x808FCE30 : SUB r2,r2,#0xc0, 8 ; #0xc0000000

Note that the 'r0' register holds the va that has to be translated into phys.

To avoid this situation, use an explicit cast to 'unsigned long', which discards the upper part of the phys address and converts the value to 32 bit. Also add a comment so that the cast will not be removed in the future.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
---
 arch/arm/include/asm/memory.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 4dd2145..7a8599c 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -226,7 +226,13 @@ static inline phys_addr_t __virt_to_phys(unsigned long x)
 static inline unsigned long __phys_to_virt(phys_addr_t x)
 {
 	unsigned long t;
-	__pv_stub(x, t, "sub", __PV_BITS_31_24);
+        /*
+         * 'unsigned long' cast discards the upper word when
+         * phys_addr_t is 64 bit, and makes sure that the inline
+         * assembler expression receives a 32 bit argument
+         * in the place where a 32 bit 'r' operand is expected.
+         */
+	__pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24);
 	return t;
 }
On Monday 04 November 2013 09:16 PM, Victor Kamensky wrote:
Make sure that the inline assembler that expects an 'r' operand receives a 32 bit value.

Before this fix, in the case of CONFIG_ARCH_PHYS_ADDR_T_64BIT and CONFIG_ARM_PATCH_PHYS_VIRT, the __phys_to_virt function passed a 64 bit value to the __pv_stub inline assembler, where an 'r' operand is expected. Compiler behavior in such a case is not well specified. It worked in the little endian case, but in the big endian case incorrect code was generated, where the compiler got confused about which part of the 64 bit value it needed to modify. For example, the BE snippet looked like this:

N:0x80904E08 : MOV r2,#0
N:0x80904E0C : SUB r2,r2,#0x81000000

while the similar LE code looked like this:

N:0x808FCE2C : MOV r2,r0
N:0x808FCE30 : SUB r2,r2,#0xc0, 8 ; #0xc0000000

Note that the 'r0' register holds the va that has to be translated into phys.

To avoid this situation, use an explicit cast to 'unsigned long', which discards the upper part of the phys address and converts the value to 32 bit. Also add a comment so that the cast will not be removed in the future.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>

 arch/arm/include/asm/memory.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 4dd2145..7a8599c 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -226,7 +226,13 @@ static inline phys_addr_t __virt_to_phys(unsigned long x)
 static inline unsigned long __phys_to_virt(phys_addr_t x)
 {
 	unsigned long t;
- __pv_stub(x, t, "sub", __PV_BITS_31_24);
Minor nit. An extra line would be good here for a comment to follow.
        /*
         * 'unsigned long' cast discards the upper word when
         * phys_addr_t is 64 bit, and makes sure that the inline
         * assembler expression receives a 32 bit argument
         * in the place where a 32 bit 'r' operand is expected.
         */
+	__pv_stub((unsigned long) x, t, "sub", __PV_BITS_31_24);
 	return t;
}
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
On Mon, Nov 04, 2013 at 06:16:04PM -0800, Victor Kamensky wrote:
 static inline unsigned long __phys_to_virt(phys_addr_t x)
 {
 	unsigned long t;
- __pv_stub(x, t, "sub", __PV_BITS_31_24);
        /*
         * 'unsigned long' cast discards the upper word when
         * phys_addr_t is 64 bit, and makes sure that the inline
         * assembler expression receives a 32 bit argument
         * in the place where a 32 bit 'r' operand is expected.
         */
We use tabs for indentation in the kernel source, not 8 spaces. Please fix before final submission, thanks. :)
Fix the patching code to convert a mov instruction into an mvn instruction in the case of CONFIG_ARCH_PHYS_ADDR_T_64BIT and CONFIG_ARM_PATCH_PHYS_VIRT.

In the BE case, store the proper bits into r0 so that the byte swapped instruction can be modified correctly.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
---
 arch/arm/kernel/head.S | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index cd788d5..11d59b3 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -645,7 +645,11 @@ ARM_BE8(rev16	ip, ip)
 	bcc	1b
 	bx	lr
#else
+#ifdef CONFIG_CPU_ENDIAN_BE8
+	moveq	r0, #0x00004000	@ set bit 22, mov to mvn instruction
+#else
 	moveq	r0, #0x400000	@ set bit 22, mov to mvn instruction
+#endif
 	b	2f
 1:	ldr	ip, [r7, r3]
 #ifdef CONFIG_CPU_ENDIAN_BE8
@@ -654,7 +658,7 @@ ARM_BE8(rev16	ip, ip)
 	tst	ip, #0x000f0000	@ check the rotation field
 	orrne	ip, ip, r6, lsl #24	@ mask in offset bits 31-24
 	biceq	ip, ip, #0x00004000	@ clear bit 22
-	orreq	ip, ip, r0, lsl #24	@ mask in offset bits 7-0
+	orreq	ip, ip, r0	@ mask in offset bits 7-0
 #else
 	bic	ip, ip, #0x000000ff
 	tst	ip, #0xf00	@ check the rotation field
On Tuesday 05 November 2013 07:46 AM, Victor Kamensky wrote:
Fix the patching code to convert a mov instruction into an mvn instruction in the case of CONFIG_ARCH_PHYS_ADDR_T_64BIT and CONFIG_ARM_PATCH_PHYS_VIRT.

In the BE case, store the proper bits into r0 so that the byte swapped instruction can be modified correctly.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>

 arch/arm/kernel/head.S | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index cd788d5..11d59b3 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -645,7 +645,11 @@ ARM_BE8(rev16	ip, ip)
 	bcc	1b
 	bx	lr
 #else
+#ifdef CONFIG_CPU_ENDIAN_BE8
+	moveq	r0, #0x00004000	@ set bit 22, mov to mvn instruction
+#else
 	moveq	r0, #0x400000	@ set bit 22, mov to mvn instruction
+#endif
 	b	2f
 1:	ldr	ip, [r7, r3]
 #ifdef CONFIG_CPU_ENDIAN_BE8
@@ -654,7 +658,7 @@ ARM_BE8(rev16	ip, ip)
 	tst	ip, #0x000f0000	@ check the rotation field
 	orrne	ip, ip, r6, lsl #24	@ mask in offset bits 31-24
 	biceq	ip, ip, #0x00004000	@ clear bit 22
-	orreq	ip, ip, r0, lsl #24	@ mask in offset bits 7-0
+	orreq	ip, ip, r0	@ mask in offset bits 7-0
 #else
 	bic	ip, ip, #0x000000ff
 	tst	ip, #0xf00	@ check the rotation field
Ok, I think for the thumb case this is already taken care of because of the swap.

Reviewed-by: R Sricharan <r.sricharan@ti.com>

Regards,
Sricharan
On Monday 04 November 2013 09:16 PM, Victor Kamensky wrote:
Fix the patching code to convert a mov instruction into an mvn instruction in the case of CONFIG_ARCH_PHYS_ADDR_T_64BIT and CONFIG_ARM_PATCH_PHYS_VIRT.

In the BE case, store the proper bits into r0 so that the byte swapped instruction can be modified correctly.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Looks fine to me.

Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
On Mon, Nov 04, 2013 at 06:16:02PM -0800, Victor Kamensky wrote:
- ARM: fix mov to mvn conversion in case of 64 bit phys_addr_t and BE
The conflict resolution between [5] and [6] was not entirely correct for the case when a 'mov' instruction has to be converted into an 'mvn' instruction.
This is a good reason why rushing stuff along when there's two people working in the same area is bad news. The person who gets to do the conflict resolution doesn't always know what the correct solution is, so many of them are pure guesses, and many of them won't be adequately tested.
Quite honestly, I would have preferred *not* to have pulled Ben's BE changes along with Santosh's other changes, and have Ben rebase for the _following_ merge window to prevent exactly this kind of mess.
The BE stuff has been one problem after another...
On 5 November 2013 07:20, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
On Mon, Nov 04, 2013 at 06:16:02PM -0800, Victor Kamensky wrote:
- ARM: fix mov to mvn conversion in case of 64 bit phys_addr_t and BE
The conflict resolution between [5] and [6] was not entirely correct for the case when a 'mov' instruction has to be converted into an 'mvn' instruction.
This is a good reason why rushing stuff along when there's two people working in the same area is bad news. The person who gets to do the conflict resolution doesn't always know what the correct solution is, so many of them are pure guesses, and many of them won't be adequately tested.
Quite honestly, I would have preferred *not* to have pulled Ben's BE changes along with Santosh's other changes, and have Ben rebase for the _following_ merge window to prevent exactly this kind of mess.
Sorry, I probably confused folks and pushed too hard with my tc2 "all fine" testing message. Lesson learned.
The BE stuff has been one problem after another...
In what sense? Does it create problems for something or someone? I hope not. I think it is on par with other efforts, and I think we go after issues quickly enough. Or do you mean that we keep discovering more and more BE issues to fix? I agree with that, and I think we will see more. We are working on a few known issues like BE V7 kprobes and BE V7 KVM. I think we will see BE breakages and follow-up fixes too. And IMHO that is normal: BE is a side feature, outside of the main direction, and it is easy for folks to miss a BE issue. That is why Linaro is going to set up automatic BE builds and tests, and we will monitor and fix discovered issues.
Thanks, Victor