Hi Guys,
Here is the second version of the series that enables KVM support for V7 big-endian kernels. It mostly deals with BE KVM host support. Marc Zyngier showed earlier with his patches how a BE guest could run on top of an LE host. With these patches a BE guest runs on top of a BE host. If Marc's kvmtool is used with a few additional changes, I have tested that a BE host can run an LE guest. I also verified that there are no regressions in the BE-guest-on-LE-host case.
Note that the posted series covers only kernel-side changes. The changes were tested inside a bigger setup with additional changes in qemu and kvmtool. I will post those changes separately to the proper aliases, but for completeness' sake Appendix A gives pointers to git repositories and branches with all needed changes.
Changes since V1:

1) The patch that handles the issue of including assembler.h into kvm .S files is now handled separately. It was posted earlier on the mailing list [1]. This series depends on it.

2) Incorporated most of Christoffer's V1 review comments. The only thing I did not do with respect to Christoffer's suggestions is a possible rework of the 'one_reg coproc set and get BE fixes' patch. I think I have put a better explanation of what my code does, and if it is still not good we will go from there.

3) Moved the rr_lo_hi macro into arch/arm/include/asm/kvm_asm.h along the lines Christoffer suggested.

4) Split the 'fix KVM assembler files to work in BE case' patch into 4 smaller ones, each handling one logical issue.
Thanks, Victor
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-February/231432.h...
Victor Kamensky (7):
  ARM: KVM: switch hypervisor into BE mode in case of BE host
  ARM: KVM: fix vgic V7 assembler code to work in BE image
  ARM: KVM: handle 64bit values passed to mcrr or from mrrc instructions in BE case
  ARM: KVM: __kvm_vcpu_run function return result fix in BE case
  ARM: KVM: one_reg coproc set and get BE fixes
  ARM: KVM: vgic mmio should hold data as LE bytes array in BE case
  ARM: KVM: MMIO support BE host running LE code
 arch/arm/include/asm/kvm_asm.h     | 23 +++++++++-
 arch/arm/include/asm/kvm_emulate.h | 22 +++++++--
 arch/arm/kvm/coproc.c              | 94 ++++++++++++++++++++++++++++----------
 arch/arm/kvm/init.S                |  7 ++-
 arch/arm/kvm/interrupts.S          |  9 +++-
 arch/arm/kvm/interrupts_head.S     | 32 +++++++++----
 virt/kvm/arm/vgic.c                |  4 +-
 7 files changed, 145 insertions(+), 46 deletions(-)
Switch hypervisor to run in BE mode if image is compiled with CONFIG_CPU_BIG_ENDIAN.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
---
 arch/arm/kvm/init.S | 3 +++
 1 file changed, 3 insertions(+)
diff --git a/arch/arm/kvm/init.S b/arch/arm/kvm/init.S
index 1b9844d..74f0718 100644
--- a/arch/arm/kvm/init.S
+++ b/arch/arm/kvm/init.S
@@ -22,6 +22,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_arm.h>
 #include <asm/kvm_mmu.h>
+#include <asm/assembler.h>

 /********************************************************************
  * Hypervisor initialization
@@ -70,6 +71,8 @@ __do_hyp_init:
	cmp	r0, #0			@ We have a SP?
	bne	phase2			@ Yes, second stage init

+ARM_BE8(setend	be) @ Switch to Big Endian mode if needed
+
	@ Set the HTTBR to point to the hypervisor PGD pointer passed
	mcrr	p15, 4, r2, r3, c2
On Tue, Feb 11, 2014 at 09:41:27PM -0800, Victor Kamensky wrote:
Won't splitting up the patches this way break bisectability?
-Christoffer
On 18 March 2014 15:23, Christoffer Dall christoffer.dall@linaro.org wrote:
On Tue, Feb 11, 2014 at 09:41:27PM -0800, Victor Kamensky wrote:
Switch hypervisor to run in BE mode if image is compiled with CONFIG_CPU_BIG_ENDIAN.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
arch/arm/kvm/init.S | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/arm/kvm/init.S b/arch/arm/kvm/init.S index 1b9844d..74f0718 100644 --- a/arch/arm/kvm/init.S +++ b/arch/arm/kvm/init.S @@ -22,6 +22,7 @@ #include <asm/kvm_asm.h> #include <asm/kvm_arm.h> #include <asm/kvm_mmu.h> +#include <asm/assembler.h>
/********************************************************************
- Hypervisor initialization
@@ -70,6 +71,8 @@ __do_hyp_init: cmp r0, #0 @ We have a SP? bne phase2 @ Yes, second stage init
+ARM_BE8(setend be) @ Switch to Big Endian mode if needed
@ Set the HTTBR to point to the hypervisor PGD pointer passed mcrr p15, 4, r2, r3, c2
-- 1.8.1.4
Won't splitting up the patches this way break bisectability?
I don't think so. I think that with only this patch applied, a BE image built with CONFIG_VIRTUALIZATION=y and CONFIG_KVM=y will at least boot. KVM will not work, of course, without the remaining patches.

Currently, for a BE image, CONFIG_VIRTUALIZATION and CONFIG_KVM have to be explicitly turned off - otherwise the image won't boot.

As far as the LE image is concerned, it is a NOP change.

Or do you have any specific idea of how it could break bisectability?
Thanks, Victor
On Tue, Mar 18, 2014 at 03:23:35PM -0700, Christoffer Dall wrote:
On Tue, Feb 11, 2014 at 09:41:27PM -0800, Victor Kamensky wrote:
Won't splitting up the patches this way break bisectability?
On second thought, scratch that - BE support is not supposed to work before applying these patches anyhow, I guess.
Reviewed-by: Christoffer Dall christoffer.dall@linaro.org
-Christoffer
The vgic h/w registers are little endian; when asm code reads/writes from/to them, it needs to do a byteswap after/before the access. The byteswap code uses the ARM_BE8 wrapper to add the swap only if CONFIG_CPU_BIG_ENDIAN is configured.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
---
 arch/arm/kvm/interrupts_head.S | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index 6f18695..1e9be2f 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -1,4 +1,5 @@
 #include <linux/irqchip/arm-gic.h>
+#include <asm/assembler.h>

 #define VCPU_USR_REG(_reg_nr)	(VCPU_USR_REGS + (_reg_nr * 4))
 #define VCPU_USR_SP		(VCPU_USR_REG(13))
@@ -412,6 +413,14 @@ vcpu	.req	r0		@ vcpu pointer always in r0
	ldr	r8, [r2, #GICH_ELRSR0]
	ldr	r9, [r2, #GICH_ELRSR1]
	ldr	r10, [r2, #GICH_APR]
+ARM_BE8(rev	r3, r3	)
+ARM_BE8(rev	r4, r4	)
+ARM_BE8(rev	r5, r5	)
+ARM_BE8(rev	r6, r6	)
+ARM_BE8(rev	r7, r7	)
+ARM_BE8(rev	r8, r8	)
+ARM_BE8(rev	r9, r9	)
+ARM_BE8(rev	r10, r10	)

	str	r3, [r11, #VGIC_CPU_HCR]
	str	r4, [r11, #VGIC_CPU_VMCR]
@@ -431,6 +440,7 @@ vcpu	.req	r0		@ vcpu pointer always in r0
	add	r3, r11, #VGIC_CPU_LR
	ldr	r4, [r11, #VGIC_CPU_NR_LR]
1:	ldr	r6, [r2], #4
+ARM_BE8(rev	r6, r6	)
	str	r6, [r3], #4
	subs	r4, r4, #1
	bne	1b
@@ -458,6 +468,9 @@ vcpu	.req	r0		@ vcpu pointer always in r0
	ldr	r3, [r11, #VGIC_CPU_HCR]
	ldr	r4, [r11, #VGIC_CPU_VMCR]
	ldr	r8, [r11, #VGIC_CPU_APR]
+ARM_BE8(rev	r3, r3	)
+ARM_BE8(rev	r4, r4	)
+ARM_BE8(rev	r8, r8	)

	str	r3, [r2, #GICH_HCR]
	str	r4, [r2, #GICH_VMCR]
@@ -468,6 +481,7 @@ vcpu	.req	r0		@ vcpu pointer always in r0
	add	r3, r11, #VGIC_CPU_LR
	ldr	r4, [r11, #VGIC_CPU_NR_LR]
1:	ldr	r6, [r3], #4
+ARM_BE8(rev	r6, r6	)
	str	r6, [r2], #4
	subs	r4, r4, #1
	bne	1b
On Tue, Feb 11, 2014 at 09:41:28PM -0800, Victor Kamensky wrote:
The vgic h/w registers are little endian; when asm code reads/writes
when BE asm code...
from/to them, it needs to do byteswap after/before. Byteswap code uses ARM_BE8 wrapper to add swap only if CONFIG_CPU_BIG_ENDIAN is configured.
Reviewed-by: Christoffer Dall christoffer.dall@linaro.org
In some cases the mcrr and mrrc instructions, in combination with the ldrd and strd instructions, need to deal with a 64-bit value in memory. The ldrd and strd instructions already handle endianness within word (register) boundaries, but to get the whole 64-bit value represented correctly, the rr_lo_hi macro is introduced; it is used to swap the register positions when the mcrr and mrrc instructions are used. That has the effect of swapping the two words.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
---
 arch/arm/include/asm/kvm_asm.h | 23 +++++++++++++++++++++--
 arch/arm/kvm/init.S            |  4 ++--
 arch/arm/kvm/interrupts.S      |  4 ++--
 arch/arm/kvm/interrupts_head.S | 18 +++++++++---------
 4 files changed, 34 insertions(+), 15 deletions(-)
diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index 661da11..c6ae937 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -26,9 +26,9 @@
 #define c1_ACTLR	4	/* Auxilliary Control Register */
 #define c1_CPACR	5	/* Coprocessor Access Control */
 #define c2_TTBR0	6	/* Translation Table Base Register 0 */
-#define c2_TTBR0_high	7	/* TTBR0 top 32 bits */
+#define c2_TTBR0_hilo	7	/* TTBR0 top 32 bits in LE case, low 32 bits in BE case */
 #define c2_TTBR1	8	/* Translation Table Base Register 1 */
-#define c2_TTBR1_high	9	/* TTBR1 top 32 bits */
+#define c2_TTBR1_hilo	9	/* TTBR1 top 32 bits in LE case, low 32 bits in BE case */
 #define c2_TTBCR	10	/* Translation Table Base Control R. */
 #define c3_DACR	11	/* Domain Access Control Register */
 #define c5_DFSR	12	/* Data Fault Status Register */
@@ -59,6 +59,25 @@
 #define ARM_EXCEPTION_FIQ	6
 #define ARM_EXCEPTION_HVC	7

+/*
+ * The rr_lo_hi macro swap pair of registers positions depending on
+ * current endianness. It is used in conjunction with ldrd and strd
+ * instructions that loads/store 64 bit value from/to memory to/from
+ * pair of registers which are used with the mrrc and mcrr instructions.
+ * The a1 parameter is register that typically holds lower address
+ * word (least significant word in LE, most significant in BE). The
+ * a2 parameter is register that holds higher address word. Note
+ * within word (single register) the ldrd/strd instruction already
+ * swap data correctly, only additional manipulation required with order
+ * of register to have effect of 64 bit value beeing effectively
+ * swapped.
+ */
+#ifdef CONFIG_CPU_ENDIAN_BE8
+#define rr_lo_hi(a1, a2) a2, a1
+#else
+#define rr_lo_hi(a1, a2) a1, a2
+#endif
+
 #ifndef __ASSEMBLY__
 struct kvm;
 struct kvm_vcpu;
diff --git a/arch/arm/kvm/init.S b/arch/arm/kvm/init.S
index 74f0718..2d10b2d 100644
--- a/arch/arm/kvm/init.S
+++ b/arch/arm/kvm/init.S
@@ -74,7 +74,7 @@ __do_hyp_init:
 ARM_BE8(setend	be) @ Switch to Big Endian mode if needed

	@ Set the HTTBR to point to the hypervisor PGD pointer passed
-	mcrr	p15, 4, r2, r3, c2
+	mcrr	p15, 4, rr_lo_hi(r2, r3), c2

	@ Set the HTCR and VTCR to the same shareability and cacheability
	@ settings as the non-secure TTBCR and with T0SZ == 0.
@@ -140,7 +140,7 @@ phase2:
	mov	pc, r0

target:	@ We're now in the trampoline code, switch page tables
-	mcrr	p15, 4, r2, r3, c2
+	mcrr	p15, 4, rr_lo_hi(r2, r3), c2
	isb

	@ Invalidate the old TLBs
diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
index ddc1553..f0696bd 100644
--- a/arch/arm/kvm/interrupts.S
+++ b/arch/arm/kvm/interrupts.S
@@ -52,7 +52,7 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
	dsb	ishst
	add	r0, r0, #KVM_VTTBR
	ldrd	r2, r3, [r0]
-	mcrr	p15, 6, r2, r3, c2	@ Write VTTBR
+	mcrr	p15, 6, rr_lo_hi(r2, r3), c2	@ Write VTTBR
	isb
	mcr	p15, 0, r0, c8, c3, 0	@ TLBIALLIS (rt ignored)
	dsb	ish
@@ -135,7 +135,7 @@ ENTRY(__kvm_vcpu_run)
	ldr	r1, [vcpu, #VCPU_KVM]
	add	r1, r1, #KVM_VTTBR
	ldrd	r2, r3, [r1]
-	mcrr	p15, 6, r2, r3, c2	@ Write VTTBR
+	mcrr	p15, 6, rr_lo_hi(r2, r3), c2	@ Write VTTBR

	@ We're all done, just restore the GPRs and go to the guest
	restore_guest_regs
diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index 1e9be2f..3409ed6 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -252,8 +252,8 @@ vcpu	.req	r0		@ vcpu pointer always in r0
	mrc	p15, 0, r3, c1, c0, 2	@ CPACR
	mrc	p15, 0, r4, c2, c0, 2	@ TTBCR
	mrc	p15, 0, r5, c3, c0, 0	@ DACR
-	mrrc	p15, 0, r6, r7, c2	@ TTBR 0
-	mrrc	p15, 1, r8, r9, c2	@ TTBR 1
+	mrrc	p15, 0, rr_lo_hi(r6, r7), c2	@ TTBR 0
+	mrrc	p15, 1, rr_lo_hi(r8, r9), c2	@ TTBR 1
	mrc	p15, 0, r10, c10, c2, 0	@ PRRR
	mrc	p15, 0, r11, c10, c2, 1	@ NMRR
	mrc	p15, 2, r12, c0, c0, 0	@ CSSELR
@@ -303,7 +303,7 @@ vcpu	.req	r0		@ vcpu pointer always in r0
	.endif

	mrc	p15, 0, r2, c14, c1, 0	@ CNTKCTL
-	mrrc	p15, 0, r4, r5, c7	@ PAR
+	mrrc	p15, 0, rr_lo_hi(r4, r5), c7	@ PAR

	.if \store_to_vcpu == 0
	push	{r2,r4-r5}
@@ -331,7 +331,7 @@ vcpu	.req	r0		@ vcpu pointer always in r0
	.endif

	mcr	p15, 0, r2, c14, c1, 0	@ CNTKCTL
-	mcrr	p15, 0, r4, r5, c7	@ PAR
+	mcrr	p15, 0, rr_lo_hi(r4, r5), c7	@ PAR

	.if \read_from_vcpu == 0
	pop	{r2-r12}
@@ -381,8 +381,8 @@ vcpu	.req	r0		@ vcpu pointer always in r0
	mcr	p15, 0, r3, c1, c0, 2	@ CPACR
	mcr	p15, 0, r4, c2, c0, 2	@ TTBCR
	mcr	p15, 0, r5, c3, c0, 0	@ DACR
-	mcrr	p15, 0, r6, r7, c2	@ TTBR 0
-	mcrr	p15, 1, r8, r9, c2	@ TTBR 1
+	mcrr	p15, 0, rr_lo_hi(r6, r7), c2	@ TTBR 0
+	mcrr	p15, 1, rr_lo_hi(r8, r9), c2	@ TTBR 1
	mcr	p15, 0, r10, c10, c2, 0	@ PRRR
	mcr	p15, 0, r11, c10, c2, 1	@ NMRR
	mcr	p15, 2, r12, c0, c0, 0	@ CSSELR
@@ -512,7 +512,7 @@ ARM_BE8(rev	r6, r6	)
	mcr	p15, 0, r2, c14, c3, 1	@ CNTV_CTL
	isb

-	mrrc	p15, 3, r2, r3, c14	@ CNTV_CVAL
+	mrrc	p15, 3, rr_lo_hi(r2, r3), c14	@ CNTV_CVAL
	ldr	r4, =VCPU_TIMER_CNTV_CVAL
	add	r5, vcpu, r4
	strd	r2, r3, [r5]
@@ -552,12 +552,12 @@ ARM_BE8(rev	r6, r6	)

	ldr	r2, [r4, #KVM_TIMER_CNTVOFF]
	ldr	r3, [r4, #(KVM_TIMER_CNTVOFF + 4)]
-	mcrr	p15, 4, r2, r3, c14	@ CNTVOFF
+	mcrr	p15, 4, rr_lo_hi(r2, r3), c14	@ CNTVOFF

	ldr	r4, =VCPU_TIMER_CNTV_CVAL
	add	r5, vcpu, r4
	ldrd	r2, r3, [r5]
-	mcrr	p15, 3, r2, r3, c14	@ CNTV_CVAL
+	mcrr	p15, 3, rr_lo_hi(r2, r3), c14	@ CNTV_CVAL
	isb

	ldr	r2, [vcpu, #VCPU_TIMER_CNTV_CTL]
On Tue, Feb 11, 2014 at 09:41:29PM -0800, Victor Kamensky wrote:
In some cases the mcrr and mrrc instructions in combination with the ldrd and strd instructions need to deal with 64bit value in memory. The ldrd and strd instructions already handle endianness within word (register) boundaries but to get effect of the whole 64bit value represented correctly, rr_lo_hi macro is introduced and is used to swap registers positions when the mcrr and mrrc instructions are used. That has the effect of swapping two words.
Thanks for writing this comment below, a few language corrections:
+/*
- The rr_lo_hi macro swap pair of registers positions depending on
swaps a pair of registers depending on
- current endianness. It is used in conjunction with ldrd and strd
- instructions that loads/store 64 bit value from/to memory to/from
that load/store a 64-bit value
- pair of registers which are used with the mrrc and mcrr instructions.
a pair of
- The a1 parameter is register that typically holds lower address
- word (least significant word in LE, most significant in BE). The
- a2 parameter is register that holds higher address word.
I would rewrite this as:
If used with the ldrd/strd instructions, the a1 parameter is the first source/destination register and the a2 parameter is the second source/destination register.
Note
- within word (single register) the ldrd/strd instruction already
- swap data correctly, only additional manipulation required with order
- of register to have effect of 64 bit value beeing effectively
- swapped.
Note that the ldrd/strd instructions already swap the bytes within the words correctly according to the endianness setting, but the order of the registers needs to be effectively swapped when used with the mrrc/mcrr instructions.
- */
+#ifdef CONFIG_CPU_ENDIAN_BE8 +#define rr_lo_hi(a1, a2) a2, a1 +#else +#define rr_lo_hi(a1, a2) a1, a2 +#endif
Reviewed-by: Christoffer Dall christoffer.dall@linaro.org
The __kvm_vcpu_run function returns a 64-bit result in two registers, which has to be adjusted for the BE case.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
---
 arch/arm/kvm/interrupts.S | 5 +++++
 1 file changed, 5 insertions(+)
diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
index f0696bd..5d27f7f 100644
--- a/arch/arm/kvm/interrupts.S
+++ b/arch/arm/kvm/interrupts.S
@@ -199,8 +199,13 @@ after_vfp_restore:

	restore_host_regs
	clrex				@ Clear exclusive monitor
+#ifndef __ARMEB__
	mov	r0, r1			@ Return the return code
	mov	r1, #0			@ Clear upper bits in return value
+#else
+	@ r1 already has return code
+	mov	r0, #0			@ Clear upper bits in return value
+#endif /* __ARMEB__ */
	bx	lr			@ return to IOCTL
/********************************************************************
On Tue, Feb 11, 2014 at 09:41:30PM -0800, Victor Kamensky wrote:
The __kvm_vcpu_run function returns a 64-bit result in two registers, which has to be adjusted for BE case.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
arch/arm/kvm/interrupts.S | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S index f0696bd..5d27f7f 100644 --- a/arch/arm/kvm/interrupts.S +++ b/arch/arm/kvm/interrupts.S @@ -199,8 +199,13 @@ after_vfp_restore: restore_host_regs clrex @ Clear exclusive monitor +#ifndef __ARMEB__ mov r0, r1 @ Return the return code mov r1, #0 @ Clear upper bits in return value +#else
- @ r1 already has return code
- mov r0, #0 @ Clear upper bits in return value
+#endif /* __ARMEB__ */ bx lr @ return to IOCTL /******************************************************************** -- 1.8.1.4
Acked-by: Christoffer Dall christoffer.dall@linaro.org
Fix the code that handles the KVM_SET_ONE_REG and KVM_GET_ONE_REG ioctls to work in a BE image. Before this fix the get/set_one_reg functions worked correctly only in the LE case - reg_from_user took a 'void *' kernel address that could actually be target/source memory of either 4 or 8 bytes in size, and the code copied from/to user memory that could hold either one 4-byte register, one 8-byte register, or a pair of 4-byte registers.

For example, note that there was a case where a 4-byte register was read from user-land into a kernel target address of an 8-byte value. Because it was working in LE, the least significant word was memcpy(ied) and it just worked. In BE code with 'void *' as the target/source 'val' type, it is impossible to tell whether a 4-byte register from user-land should be copied to the 'val' address itself (a 4-byte target) or to 'val' + 4 (the least significant word of an 8-byte value). So the first change was to introduce strongly typed functions, where the type of the target/source 'val' is strongly defined:

reg_from_user64 - reads a register from user-land to a kernel 'u64 *val' address; the register size can be 4 or 8 bytes
reg_from_user32 - reads register(s) from user-land to a kernel 'u32 *val' address; note it can be one or two 4-byte registers
reg_to_user64 - writes a register from a kernel 'u64 *val' address to user-land register memory; the register size can be 4 or 8 bytes
reg_to_user32 - writes register(s) from a kernel 'u32 *val' address to user-land register(s) memory; note it can be one or two 4-byte registers

All places where the reg_from_user and reg_to_user functions were used were changed to use either the corresponding 64- or 32-bit variant, depending on the type of the source/target kernel memory variable.

In the case of 'u64 *val' and a register size of 4 bytes, reg_from_user64 and reg_to_user64 work only with the least significant word of the target/source kernel value.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
---
 arch/arm/kvm/coproc.c | 94 +++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 69 insertions(+), 25 deletions(-)
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 78c0885..64b2b94 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -634,17 +634,61 @@ static struct coproc_reg invariant_cp15[] = {
	{ CRn( 0), CRm( 0), Op1( 1), Op2( 7), is32, NULL, get_AIDR },
 };

-static int reg_from_user(void *val, const void __user *uaddr, u64 id)
+static int reg_from_user64(u64 *val, const void __user *uaddr, u64 id)
+{
+	unsigned long regsize = KVM_REG_SIZE(id);
+	union {
+		u32 word;
+		u64 dword;
+	} tmp = {0};
+
+	if (copy_from_user(&tmp, uaddr, regsize) != 0)
+		return -EFAULT;
+
+	switch (regsize) {
+	case 4:
+		*val = tmp.word;
+		break;
+	case 8:
+		*val = tmp.dword;
+		break;
+	}
+	return 0;
+}
+
+/* Note it may really copy two u32 registers */
+static int reg_from_user32(u32 *val, const void __user *uaddr, u64 id)
 {
-	/* This Just Works because we are little endian. */
	if (copy_from_user(val, uaddr, KVM_REG_SIZE(id)) != 0)
		return -EFAULT;
	return 0;
 }

-static int reg_to_user(void __user *uaddr, const void *val, u64 id)
+static int reg_to_user64(void __user *uaddr, const u64 *val, u64 id)
+{
+	unsigned long regsize = KVM_REG_SIZE(id);
+	union {
+		u32 word;
+		u64 dword;
+	} tmp;
+
+	switch (regsize) {
+	case 4:
+		tmp.word = *val;
+		break;
+	case 8:
+		tmp.dword = *val;
+		break;
+	}
+
+	if (copy_to_user(uaddr, &tmp, regsize) != 0)
+		return -EFAULT;
+	return 0;
+}
+
+/* Note it may really copy two u32 registers */
+static int reg_to_user32(void __user *uaddr, const u32 *val, u64 id)
 {
-	/* This Just Works because we are little endian. */
	if (copy_to_user(uaddr, val, KVM_REG_SIZE(id)) != 0)
		return -EFAULT;
	return 0;
@@ -662,7 +706,7 @@ static int get_invariant_cp15(u64 id, void __user *uaddr)
	if (!r)
		return -ENOENT;

-	return reg_to_user(uaddr, &r->val, id);
+	return reg_to_user64(uaddr, &r->val, id);
 }

 static int set_invariant_cp15(u64 id, void __user *uaddr)
@@ -678,7 +722,7 @@ static int set_invariant_cp15(u64 id, void __user *uaddr)
	if (!r)
		return -ENOENT;

-	err = reg_from_user(&val, uaddr, id);
+	err = reg_from_user64(&val, uaddr, id);
	if (err)
		return err;

@@ -846,7 +890,7 @@ static int vfp_get_reg(const struct kvm_vcpu *vcpu, u64 id, void __user *uaddr)
	if (vfpid < num_fp_regs()) {
		if (KVM_REG_SIZE(id) != 8)
			return -ENOENT;
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpregs[vfpid],
+		return reg_to_user64(uaddr, &vcpu->arch.vfp_guest.fpregs[vfpid],
				   id);
	}

@@ -856,22 +900,22 @@ static int vfp_get_reg(const struct kvm_vcpu *vcpu, u64 id, void __user *uaddr)

	switch (vfpid) {
	case KVM_REG_ARM_VFP_FPEXC:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpexc, id);
+		return reg_to_user32(uaddr, &vcpu->arch.vfp_guest.fpexc, id);
	case KVM_REG_ARM_VFP_FPSCR:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpscr, id);
+		return reg_to_user32(uaddr, &vcpu->arch.vfp_guest.fpscr, id);
	case KVM_REG_ARM_VFP_FPINST:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpinst, id);
+		return reg_to_user32(uaddr, &vcpu->arch.vfp_guest.fpinst, id);
	case KVM_REG_ARM_VFP_FPINST2:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpinst2, id);
+		return reg_to_user32(uaddr, &vcpu->arch.vfp_guest.fpinst2, id);
	case KVM_REG_ARM_VFP_MVFR0:
		val = fmrx(MVFR0);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user32(uaddr, &val, id);
	case KVM_REG_ARM_VFP_MVFR1:
		val = fmrx(MVFR1);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user32(uaddr, &val, id);
	case KVM_REG_ARM_VFP_FPSID:
		val = fmrx(FPSID);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user32(uaddr, &val, id);
	default:
		return -ENOENT;
	}
@@ -890,8 +934,8 @@ static int vfp_set_reg(struct kvm_vcpu *vcpu, u64 id, const void __user *uaddr)
	if (vfpid < num_fp_regs()) {
		if (KVM_REG_SIZE(id) != 8)
			return -ENOENT;
-		return reg_from_user(&vcpu->arch.vfp_guest.fpregs[vfpid],
-				     uaddr, id);
+		return reg_from_user64(&vcpu->arch.vfp_guest.fpregs[vfpid],
+				       uaddr, id);
	}

	/* FP control registers are all 32 bit. */
@@ -900,28 +944,28 @@ static int vfp_set_reg(struct kvm_vcpu *vcpu, u64 id, const void __user *uaddr)

	switch (vfpid) {
	case KVM_REG_ARM_VFP_FPEXC:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpexc, uaddr, id);
+		return reg_from_user32(&vcpu->arch.vfp_guest.fpexc, uaddr, id);
	case KVM_REG_ARM_VFP_FPSCR:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpscr, uaddr, id);
+		return reg_from_user32(&vcpu->arch.vfp_guest.fpscr, uaddr, id);
	case KVM_REG_ARM_VFP_FPINST:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpinst, uaddr, id);
+		return reg_from_user32(&vcpu->arch.vfp_guest.fpinst, uaddr, id);
	case KVM_REG_ARM_VFP_FPINST2:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpinst2, uaddr, id);
+		return reg_from_user32(&vcpu->arch.vfp_guest.fpinst2, uaddr, id);
	/* These are invariant. */
	case KVM_REG_ARM_VFP_MVFR0:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user32(&val, uaddr, id))
			return -EFAULT;
		if (val != fmrx(MVFR0))
			return -EINVAL;
		return 0;
	case KVM_REG_ARM_VFP_MVFR1:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user32(&val, uaddr, id))
			return -EFAULT;
		if (val != fmrx(MVFR1))
			return -EINVAL;
		return 0;
	case KVM_REG_ARM_VFP_FPSID:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user32(&val, uaddr, id))
			return -EFAULT;
		if (val != fmrx(FPSID))
			return -EINVAL;
@@ -968,7 +1012,7 @@ int kvm_arm_coproc_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
		return get_invariant_cp15(reg->id, uaddr);

	/* Note: copies two regs if size is 64 bit. */
-	return reg_to_user(uaddr, &vcpu->arch.cp15[r->reg], reg->id);
+	return reg_to_user32(uaddr, &vcpu->arch.cp15[r->reg], reg->id);
 }

 int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
@@ -987,7 +1031,7 @@ int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
		return set_invariant_cp15(reg->id, uaddr);

	/* Note: copies two regs if size is 64 bit */
-	return reg_from_user(&vcpu->arch.cp15[r->reg], uaddr, reg->id);
+	return reg_from_user32(&vcpu->arch.cp15[r->reg], uaddr, reg->id);
 }

 static unsigned int num_demux_regs(void)
On 12.02.2014, at 06:41, Victor Kamensky victor.kamensky@linaro.org wrote:
Fix code that handles KVM_SET_ONE_REG, KVM_GET_ONE_REG ioctls to work in BE image. Before this fix get/set_one_reg functions worked correctly only in LE case - reg_from_user was taking 'void *' kernel address that actually could be target/source memory of either 4 bytes size or 8 bytes size, and code copied from/to user memory that could hold either 4 bytes register, 8 byte register or pair of 4 bytes registers.
For example note that there was a case when 4 bytes register was read from user-land to kernel target address of 8 bytes value. Because it was working in LE, least significant word was memcpy(ied) and it just worked. In BE code with 'void *' as target/source 'val' type it is impossible to tell whether 4 bytes register from user-land should be copied to 'val' address itself (4 bytes target) or it should be copied to 'val' + 4 (least significant word of 8 bytes value). So first change was to introduce strongly typed functions, where type of target/source 'val' is strongly defined:
reg_from_user64 - reads register from user-land to kernel 'u64 *val' address; register size could be 4 or 8 bytes
reg_from_user32 - reads register(s) from user-land to kernel 'u32 *val' address; note it could be one or two 4 bytes registers
reg_to_user64 - writes register from kernel 'u64 *val' address to user-land register memory; register size could be 4 or 8 bytes
reg_to_user32 - writes register(s) from kernel 'u32 *val' address to user-land register(s) memory; note it could be one or two 4 bytes registers
All places where reg_from_user, reg_to_user functions were used, were changed to use either corresponding 64 or 32 bit variant of functions depending on type of source/target kernel memory variable.
In case of 'u64 *val' and register size equals 4 bytes, reg_from_user64 and reg_to_user64 work only with least significant word of target/source kernel value.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
Do you think it'd be possible to converge to a single solution across PPC and ARM? On PPC we currently have "get_reg_val" and "set_reg_val" helpers that encode/decode reg structs for us. The actual copy_from_user and copy_to_user can be generic because we know the size of the access from the ONE_REG constant.
Alex
On Tue, Feb 11, 2014 at 09:41:31PM -0800, Victor Kamensky wrote:
Fix code that handles KVM_SET_ONE_REG, KVM_GET_ONE_REG ioctls to work in BE image. Before this fix get/set_one_reg functions worked correctly only in LE case - reg_from_user was taking 'void *' kernel address that actually could be target/source memory of either 4 bytes size or 8 bytes size, and code copied from/to user memory that could hold either 4 bytes register, 8 byte register or pair of 4 bytes registers.
For example note that there was a case when 4 bytes register was read from user-land to kernel target address of 8 bytes value. Because it was working in LE, least significant word was memcpy(ied) and it just worked. In BE code with 'void *' as target/source 'val' type it is impossible to tell whether 4 bytes register from user-land should be copied to 'val' address itself (4 bytes target) or it should be copied to 'val' + 4 (least significant word of 8 bytes value). So first change was to introduce strongly typed functions, where type of target/source 'val' is strongly defined:
reg_from_user64 - reads register from user-land to kernel 'u64 *val' address; register size could be 4 or 8 bytes reg_from_user32 - reads register(s) from user-land to kernel 'u32 *val' address; note it could be one or two 4 bytes registers reg_to_user64 - writes reigster from kernel 'u64 *val' address to
s/reigster/register/
user-land register memory; register size could be 4 or 8 bytes
ret_to_user32 - writes register(s) from kernel 'u32 *val' address to user-land register(s) memory; note it could be one or two 4 bytes registers
If we are really going to keep this scheme, then please add these comments to functions themselves.
All places where reg_from_user, reg_to_user functions were used, were changed to use either corresponding 64 or 32 bit variant of functions depending on type of source/target kernel memory variable.
In case of 'u64 *val' and register size equals 4 bytes, reg_from_user64 and reg_to_user64 work only with least significant word of target/source kernel value.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
arch/arm/kvm/coproc.c | 94 +++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 69 insertions(+), 25 deletions(-)
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 78c0885..64b2b94 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -634,17 +634,61 @@ static struct coproc_reg invariant_cp15[] = {
 	{ CRn( 0), CRm( 0), Op1( 1), Op2( 7), is32, NULL, get_AIDR },
 };

-static int reg_from_user(void *val, const void __user *uaddr, u64 id)
+static int reg_from_user64(u64 *val, const void __user *uaddr, u64 id)
+{
+	unsigned long regsize = KVM_REG_SIZE(id);
+	union {
+		u32 word;
+		u64 dword;
+	} tmp = {0};
+
+	if (copy_from_user(&tmp, uaddr, regsize) != 0)
+		return -EFAULT;
+
+	switch (regsize) {
+	case 4:
+		*val = tmp.word;
+		break;
+	case 8:
+		*val = tmp.dword;
+		break;
+	}
+	return 0;
+}
instead of allowing this 'packing 32 bit values in 64 bit values and packing two 32 bit values in 64 bit values' kernel implementation'y stuff to be passed to these user-space ABI catering functions, I really think you want to make reg_from_user64 deal with 64-bit registers, that is, BUG_ON(KVM_REG_SIZE(id) != 8).
If the kernel stores what user space sees as a 32-bit register in a 64-bit register, then let the caller deal with this mess.
If the kernel stores what user space sees as a 64-bit register in two 32-bit registers, then let the caller deal with that mess.
Imagine someone who hasn't been through the development of this code seeing these two functions for the first time without having read your commit message, I think the margin for error here is too large.
If you can share these functions like Alex suggests, then that would make for a much cleaner function API as well.
+
+/* Note it may really copy two u32 registers */
+static int reg_from_user32(u32 *val, const void __user *uaddr, u64 id)
 {
-	/* This Just Works because we are little endian. */
 	if (copy_from_user(val, uaddr, KVM_REG_SIZE(id)) != 0)
 		return -EFAULT;
 	return 0;
 }

-static int reg_to_user(void __user *uaddr, const void *val, u64 id)
+static int reg_to_user64(void __user *uaddr, const u64 *val, u64 id)
+{
+	unsigned long regsize = KVM_REG_SIZE(id);
+	union {
+		u32 word;
+		u64 dword;
+	} tmp;
+
+	switch (regsize) {
+	case 4:
+		tmp.word = *val;
+		break;
+	case 8:
+		tmp.dword = *val;
+		break;
+	}
+	if (copy_to_user(uaddr, &tmp, regsize) != 0)
+		return -EFAULT;
+	return 0;
+}
+
+/* Note it may really copy two u32 registers */
+static int reg_to_user32(void __user *uaddr, const u32 *val, u64 id)
 {
-	/* This Just Works because we are little endian. */
 	if (copy_to_user(uaddr, val, KVM_REG_SIZE(id)) != 0)
 		return -EFAULT;
 	return 0;
 }
@@ -662,7 +706,7 @@ static int get_invariant_cp15(u64 id, void __user *uaddr)
 	if (!r)
 		return -ENOENT;

-	return reg_to_user(uaddr, &r->val, id);
+	return reg_to_user64(uaddr, &r->val, id);
 }

 static int set_invariant_cp15(u64 id, void __user *uaddr)
@@ -678,7 +722,7 @@ static int set_invariant_cp15(u64 id, void __user *uaddr)
 	if (!r)
 		return -ENOENT;

-	err = reg_from_user(&val, uaddr, id);
+	err = reg_from_user64(&val, uaddr, id);
 	if (err)
 		return err;

@@ -846,7 +890,7 @@ static int vfp_get_reg(const struct kvm_vcpu *vcpu, u64 id, void __user *uaddr)
 	if (vfpid < num_fp_regs()) {
 		if (KVM_REG_SIZE(id) != 8)
 			return -ENOENT;
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpregs[vfpid], id);
+		return reg_to_user64(uaddr, &vcpu->arch.vfp_guest.fpregs[vfpid], id);
 	}

@@ -856,22 +900,22 @@ static int vfp_get_reg(const struct kvm_vcpu *vcpu, u64 id, void __user *uaddr)
 	switch (vfpid) {
 	case KVM_REG_ARM_VFP_FPEXC:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpexc, id);
+		return reg_to_user32(uaddr, &vcpu->arch.vfp_guest.fpexc, id);
 	case KVM_REG_ARM_VFP_FPSCR:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpscr, id);
+		return reg_to_user32(uaddr, &vcpu->arch.vfp_guest.fpscr, id);
 	case KVM_REG_ARM_VFP_FPINST:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpinst, id);
+		return reg_to_user32(uaddr, &vcpu->arch.vfp_guest.fpinst, id);
 	case KVM_REG_ARM_VFP_FPINST2:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpinst2, id);
+		return reg_to_user32(uaddr, &vcpu->arch.vfp_guest.fpinst2, id);
 	case KVM_REG_ARM_VFP_MVFR0:
 		val = fmrx(MVFR0);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user32(uaddr, &val, id);
 	case KVM_REG_ARM_VFP_MVFR1:
 		val = fmrx(MVFR1);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user32(uaddr, &val, id);
 	case KVM_REG_ARM_VFP_FPSID:
 		val = fmrx(FPSID);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user32(uaddr, &val, id);
 	default:
 		return -ENOENT;
 	}
@@ -890,8 +934,8 @@ static int vfp_set_reg(struct kvm_vcpu *vcpu, u64 id, const void __user *uaddr)
 	if (vfpid < num_fp_regs()) {
 		if (KVM_REG_SIZE(id) != 8)
 			return -ENOENT;
-		return reg_from_user(&vcpu->arch.vfp_guest.fpregs[vfpid],
-				     uaddr, id);
+		return reg_from_user64(&vcpu->arch.vfp_guest.fpregs[vfpid],
+				       uaddr, id);
 	}

 	/* FP control registers are all 32 bit. */
@@ -900,28 +944,28 @@ static int vfp_set_reg(struct kvm_vcpu *vcpu, u64 id, const void __user *uaddr)
 	switch (vfpid) {
 	case KVM_REG_ARM_VFP_FPEXC:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpexc, uaddr, id);
+		return reg_from_user32(&vcpu->arch.vfp_guest.fpexc, uaddr, id);
 	case KVM_REG_ARM_VFP_FPSCR:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpscr, uaddr, id);
+		return reg_from_user32(&vcpu->arch.vfp_guest.fpscr, uaddr, id);
 	case KVM_REG_ARM_VFP_FPINST:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpinst, uaddr, id);
+		return reg_from_user32(&vcpu->arch.vfp_guest.fpinst, uaddr, id);
 	case KVM_REG_ARM_VFP_FPINST2:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpinst2, uaddr, id);
+		return reg_from_user32(&vcpu->arch.vfp_guest.fpinst2, uaddr, id);
 	/* These are invariant. */
 	case KVM_REG_ARM_VFP_MVFR0:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user32(&val, uaddr, id))
 			return -EFAULT;
 		if (val != fmrx(MVFR0))
 			return -EINVAL;
 		return 0;
 	case KVM_REG_ARM_VFP_MVFR1:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user32(&val, uaddr, id))
 			return -EFAULT;
 		if (val != fmrx(MVFR1))
 			return -EINVAL;
 		return 0;
 	case KVM_REG_ARM_VFP_FPSID:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user32(&val, uaddr, id))
 			return -EFAULT;
 		if (val != fmrx(FPSID))
 			return -EINVAL;
@@ -968,7 +1012,7 @@ int kvm_arm_coproc_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		return get_invariant_cp15(reg->id, uaddr);

 	/* Note: copies two regs if size is 64 bit. */
-	return reg_to_user(uaddr, &vcpu->arch.cp15[r->reg], reg->id);
+	return reg_to_user32(uaddr, &vcpu->arch.cp15[r->reg], reg->id);
 }

 int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
@@ -987,7 +1031,7 @@ int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		return set_invariant_cp15(reg->id, uaddr);

 	/* Note: copies two regs if size is 64 bit */
-	return reg_from_user(&vcpu->arch.cp15[r->reg], uaddr, reg->id);
+	return reg_from_user32(&vcpu->arch.cp15[r->reg], uaddr, reg->id);
 }

 static unsigned int num_demux_regs(void)
--
1.8.1.4
Thanks,
Hi Christoffer,
Appreciate your time and review comments. Please see response inline.
On 19 March 2014 18:11, Christoffer Dall christoffer.dall@linaro.org wrote:
On Tue, Feb 11, 2014 at 09:41:31PM -0800, Victor Kamensky wrote:
Fix code that handles KVM_SET_ONE_REG, KVM_GET_ONE_REG ioctls to work in BE image. Before this fix get/set_one_reg functions worked correctly only in LE case - reg_from_user was taking 'void *' kernel address that actually could be target/source memory of either 4 bytes size or 8 bytes size, and code copied from/to user memory that could hold either 4 bytes register, 8 byte register or pair of 4 bytes registers.
For example note that there was a case when 4 bytes register was read from user-land to kernel target address of 8 bytes value. Because it was working in LE, least significant word was memcpy(ied) and it just worked. In BE code with 'void *' as target/source 'val' type it is impossible to tell whether 4 bytes register from user-land should be copied to 'val' address itself (4 bytes target) or it should be copied to 'val' + 4 (least significant word of 8 bytes value). So first change was to introduce strongly typed functions, where type of target/source 'val' is strongly defined:
reg_from_user64 - reads register from user-land to kernel 'u64 *val' address; register size could be 4 or 8 bytes reg_from_user32 - reads register(s) from user-land to kernel 'u32 *val' address; note it could be one or two 4 bytes registers reg_to_user64 - writes reigster from kernel 'u64 *val' address to
s/reigster/register/
user-land register memory; register size could be 4 or 8 bytes
ret_to_user32 - writes register(s) from kernel 'u32 *val' address to user-land register(s) memory; note it could be one or two 4 bytes registers
If we are really going to keep this scheme, then please add these comments to functions themselves.
All places where reg_from_user, reg_to_user functions were used, were changed to use either corresponding 64 or 32 bit variant of functions depending on type of source/target kernel memory variable.
In case of 'u64 *val' and register size equals 4 bytes, reg_from_user64 and reg_to_user64 work only with least significant word of target/source kernel value.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
arch/arm/kvm/coproc.c | 94 +++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 69 insertions(+), 25 deletions(-)
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 78c0885..64b2b94 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -634,17 +634,61 @@ static struct coproc_reg invariant_cp15[] = {
 	{ CRn( 0), CRm( 0), Op1( 1), Op2( 7), is32, NULL, get_AIDR },
 };

-static int reg_from_user(void *val, const void __user *uaddr, u64 id)
+static int reg_from_user64(u64 *val, const void __user *uaddr, u64 id)
+{
+	unsigned long regsize = KVM_REG_SIZE(id);
+	union {
+		u32 word;
+		u64 dword;
+	} tmp = {0};
+
+	if (copy_from_user(&tmp, uaddr, regsize) != 0)
+		return -EFAULT;
+
+	switch (regsize) {
+	case 4:
+		*val = tmp.word;
+		break;
+	case 8:
+		*val = tmp.dword;
+		break;
+	}
+	return 0;
+}
instead of allowing this 'packing 32 bit values in 64 bit values and packing two 32 bit values in 64 bit values' kernel implementation'y stuff to be passed to these user-space ABI catering functions, I really think you want to make reg_from_user64 deal with 64-bit registers, that is, BUG_ON(KVM_REG_SIZE(id) != 8).
If the kernel stores what user space sees as a 32-bit register in a 64-bit register, then let the caller deal with this mess.
If the kernel stores what user space sees as a 64-bit register in two 32-bit registers, then let the caller deal with that mess.
Imagine someone who hasn't been through the development of this code seeing these two functions for the first time without having read your commit message, I think the margin for error here is too large.
If you can share these functions like Alex suggests, then that would make for a much cleaner function API as well.
During LCA14 I had a discussion with Alex about changing and sharing the one_reg endianness functions with the ppc code and with the V8 side as well ... Alex outlined how it should be done and showed me the ppc side of this code. He asked me to check with you before I start moving in this direction. I could not catch you during the remaining LCA14 days. But now it looks like you are on the same page ...
Let me try to rework the code along Alex's suggestion and see how it will look. I will take into account your suggestions above and will try to clean up the 64/32 bit overload. Since it will have a bigger scope and be shared with the ppc side, I wonder whether I should pull this patch out of this series and handle it as a later addition.
Thanks, Victor
On Wed, Mar 19, 2014 at 06:48:07PM -0700, Victor Kamensky wrote:
Hi Christoffer,
Appreciate your time and review comments. Please see response inline.
On 19 March 2014 18:11, Christoffer Dall christoffer.dall@linaro.org wrote:
On Tue, Feb 11, 2014 at 09:41:31PM -0800, Victor Kamensky wrote:
Fix code that handles KVM_SET_ONE_REG, KVM_GET_ONE_REG ioctls to work in BE image. Before this fix get/set_one_reg functions worked correctly only in LE case - reg_from_user was taking 'void *' kernel address that actually could be target/source memory of either 4 bytes size or 8 bytes size, and code copied from/to user memory that could hold either 4 bytes register, 8 byte register or pair of 4 bytes registers.
For example note that there was a case when 4 bytes register was read from user-land to kernel target address of 8 bytes value. Because it was working in LE, least significant word was memcpy(ied) and it just worked. In BE code with 'void *' as target/source 'val' type it is impossible to tell whether 4 bytes register from user-land should be copied to 'val' address itself (4 bytes target) or it should be copied to 'val' + 4 (least significant word of 8 bytes value). So first change was to introduce strongly typed functions, where type of target/source 'val' is strongly defined:
reg_from_user64 - reads register from user-land to kernel 'u64 *val' address; register size could be 4 or 8 bytes reg_from_user32 - reads register(s) from user-land to kernel 'u32 *val' address; note it could be one or two 4 bytes registers reg_to_user64 - writes reigster from kernel 'u64 *val' address to
s/reigster/register/
user-land register memory; register size could be 4 or 8 bytes
ret_to_user32 - writes register(s) from kernel 'u32 *val' address to user-land register(s) memory; note it could be one or two 4 bytes registers
If we are really going to keep this scheme, then please add these comments to functions themselves.
All places where reg_from_user, reg_to_user functions were used, were changed to use either corresponding 64 or 32 bit variant of functions depending on type of source/target kernel memory variable.
In case of 'u64 *val' and register size equals 4 bytes, reg_from_user64 and reg_to_user64 work only with least significant word of target/source kernel value.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
arch/arm/kvm/coproc.c | 94 +++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 69 insertions(+), 25 deletions(-)
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 78c0885..64b2b94 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -634,17 +634,61 @@ static struct coproc_reg invariant_cp15[] = {
 	{ CRn( 0), CRm( 0), Op1( 1), Op2( 7), is32, NULL, get_AIDR },
 };

-static int reg_from_user(void *val, const void __user *uaddr, u64 id)
+static int reg_from_user64(u64 *val, const void __user *uaddr, u64 id)
+{
+	unsigned long regsize = KVM_REG_SIZE(id);
+	union {
+		u32 word;
+		u64 dword;
+	} tmp = {0};
+
+	if (copy_from_user(&tmp, uaddr, regsize) != 0)
+		return -EFAULT;
+
+	switch (regsize) {
+	case 4:
+		*val = tmp.word;
+		break;
+	case 8:
+		*val = tmp.dword;
+		break;
+	}
+	return 0;
+}
instead of allowing this 'packing 32 bit values in 64 bit values and packing two 32 bit values in 64 bit values' kernel implementation'y stuff to be passed to these user-space ABI catering functions, I really think you want to make reg_from_user64 deal with 64-bit registers, that is, BUG_ON(KVM_REG_SIZE(id) != 8).
If the kernel stores what user space sees as a 32-bit register in a 64-bit register, then let the caller deal with this mess.
If the kernel stores what user space sees as a 64-bit register in two 32-bit registers, then let the caller deal with that mess.
Imagine someone who hasn't been through the development of this code seeing these two functions for the first time without having read your commit message, I think the margin for error here is too large.
If you can share these functions like Alex suggests, then that would make for a much cleaner function API as well.
During LCA14 I had discussion with Alex about changing and sharing one_reg endianness functions with ppc code and with V8 side as well ... Alex outlined how it should be done and showed me ppc side of this code. He asked me to check with you before I start moving into this direction. I could not catch you during remaining LCA14 days. But now it looks you are on the same page ...
yeah, sorry, lots of stuff happening at Linaro Connect always. I am all for sharing this code.
Let me try to rework code along Alex's suggestion and see how it will look like. I will take in account your above suggestions and will try to clean 64/32 bit overload. Since it will have bigger scope and shared with ppc side I wonder maybe I should pull this patch out of this series and handle it as later additions.
I think you should have that as a preparatory patch series on which this series depends.
Thanks, -Christoffer
Hi Christoffer,
In the just posted update of the ARM BE KVM series I posted a new version of this patch [1], in which I could not address all of the comments below. Please see my responses inline.
On 19 March 2014 18:11, Christoffer Dall christoffer.dall@linaro.org wrote:
On Tue, Feb 11, 2014 at 09:41:31PM -0800, Victor Kamensky wrote:
Fix code that handles KVM_SET_ONE_REG, KVM_GET_ONE_REG ioctls to work in BE image. Before this fix get/set_one_reg functions worked correctly only in LE case - reg_from_user was taking 'void *' kernel address that actually could be target/source memory of either 4 bytes size or 8 bytes size, and code copied from/to user memory that could hold either 4 bytes register, 8 byte register or pair of 4 bytes registers.
For example note that there was a case when 4 bytes register was read from user-land to kernel target address of 8 bytes value. Because it was working in LE, least significant word was memcpy(ied) and it just worked. In BE code with 'void *' as target/source 'val' type it is impossible to tell whether 4 bytes register from user-land should be copied to 'val' address itself (4 bytes target) or it should be copied to 'val' + 4 (least significant word of 8 bytes value). So first change was to introduce strongly typed functions, where type of target/source 'val' is strongly defined:
reg_from_user64 - reads register from user-land to kernel 'u64 *val' address; register size could be 4 or 8 bytes reg_from_user32 - reads register(s) from user-land to kernel 'u32 *val' address; note it could be one or two 4 bytes registers reg_to_user64 - writes reigster from kernel 'u64 *val' address to
s/reigster/register/
Done
user-land register memory; register size could be 4 or 8 bytes
ret_to_user32 - writes register(s) from kernel 'u32 *val' address to user-land register(s) memory; note it could be one or two 4 bytes registers
If we are really going to keep this scheme, then please add these comments to functions themselves.
Done
All places where reg_from_user, reg_to_user functions were used, were changed to use either corresponding 64 or 32 bit variant of functions depending on type of source/target kernel memory variable.
In case of 'u64 *val' and register size equals 4 bytes, reg_from_user64 and reg_to_user64 work only with least significant word of target/source kernel value.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
arch/arm/kvm/coproc.c | 94 +++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 69 insertions(+), 25 deletions(-)
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 78c0885..64b2b94 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -634,17 +634,61 @@ static struct coproc_reg invariant_cp15[] = {
 	{ CRn( 0), CRm( 0), Op1( 1), Op2( 7), is32, NULL, get_AIDR },
 };

-static int reg_from_user(void *val, const void __user *uaddr, u64 id)
+static int reg_from_user64(u64 *val, const void __user *uaddr, u64 id)
+{
+	unsigned long regsize = KVM_REG_SIZE(id);
+	union {
+		u32 word;
+		u64 dword;
+	} tmp = {0};
+
+	if (copy_from_user(&tmp, uaddr, regsize) != 0)
+		return -EFAULT;
+
+	switch (regsize) {
+	case 4:
+		*val = tmp.word;
+		break;
+	case 8:
+		*val = tmp.dword;
+		break;
+	}
+	return 0;
+}
instead of allowing this 'packing 32 bit values in 64 bit values and packing two 32 bit values in 64 bit values' kernel implementation'y stuff to be passed to these user-space ABI catering functions, I really think you want to make reg_from_user64 deal with 64-bit registers, that is, BUG_ON(KVM_REG_SIZE(id) != 8).
An example of the kernel storing a value in a u64 while the register is 32 bit is set/get_invariant_cp15. Note this function passes the address of the struct coproc_reg val field into the reg_to_user_xxx function. Since struct coproc_reg is a generic structure used to describe both 32-bit and 64-bit coproc registers, the type of val cannot be changed to u32. One could try to describe val as a union of u32 and u64, but my concern is that it would create a bunch of 'if (KVM_REG_SIZE(id) == 4) { ... } else { ... }' pieces all over the place. I think it is better to have a function that can simply read/write a 32-bit register from/to a u64 value in the kernel.
If the kernel stores what user space sees as a 32-bit register in a 64-bit register, then let the caller deal with this mess.
If the kernel stores what user space sees as a 64-bit register in two 32-bit registers, then let the caller deal with that mess.
An example of the above is the TTBR0 register. It is accessed by user land as a 64-bit register, but in the kernel it is stored as two u32 registers, TTBR0_low and TTBR0_high. Note the code that saves and restores those does not treat them as one 64-bit register; instead it is a pair.
Imagine someone who hasn't been through the development of this code seeing these two functions for the first time without having read your commit message, I think the margin for error here is too large.
I understand the concern. I added comments around the reg_to/from_user_xx functions and I think I picked better names for the functions. Hopefully that is enough if this approach is followed.
Also, I could be missing a better way to address your comments; if you have ideas how it should be implemented, please let me know.
If you can share these functions like Alex suggests, then that would make for a much cleaner function API as well.
I looked at the approach that Alex suggested; I will describe the issues with it in a separate email.
Thanks, Victor
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-May/255856.html
+/* Note it may really copy two u32 registers */
+static int reg_from_user32(u32 *val, const void __user *uaddr, u64 id)
 {
 	/* This Just Works because we are little endian. */
 	if (copy_from_user(val, uaddr, KVM_REG_SIZE(id)) != 0)
 		return -EFAULT;
 	return 0;
 }
-static int reg_to_user(void __user *uaddr, const void *val, u64 id)
+static int reg_to_user64(void __user *uaddr, const u64 *val, u64 id)
+{
+	unsigned long regsize = KVM_REG_SIZE(id);
+	union {
+		u32 word;
+		u64 dword;
+	} tmp;
+
+	switch (regsize) {
+	case 4:
+		tmp.word = *val;
+		break;
+	case 8:
+		tmp.dword = *val;
+		break;
+	}
+
+	if (copy_to_user(uaddr, &tmp, regsize) != 0)
+		return -EFAULT;
+
+	return 0;
+}
+/* Note it may really copy two u32 registers */
+static int reg_to_user32(void __user *uaddr, const u32 *val, u64 id)
 {
 	/* This Just Works because we are little endian. */
 	if (copy_to_user(uaddr, val, KVM_REG_SIZE(id)) != 0)
 		return -EFAULT;
 	return 0;
 }
@@ -662,7 +706,7 @@ static int get_invariant_cp15(u64 id, void __user *uaddr)
 	if (!r)
 		return -ENOENT;
-	return reg_to_user(uaddr, &r->val, id);
+	return reg_to_user64(uaddr, &r->val, id);
}
 static int set_invariant_cp15(u64 id, void __user *uaddr)

@@ -678,7 +722,7 @@ static int set_invariant_cp15(u64 id, void __user *uaddr)
 	if (!r)
 		return -ENOENT;
-	err = reg_from_user(&val, uaddr, id);
+	err = reg_from_user64(&val, uaddr, id);
 	if (err)
 		return err;
@@ -846,7 +890,7 @@ static int vfp_get_reg(const struct kvm_vcpu *vcpu, u64 id, void __user *uaddr)
 	if (vfpid < num_fp_regs()) {
 		if (KVM_REG_SIZE(id) != 8)
 			return -ENOENT;
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpregs[vfpid],
+		return reg_to_user64(uaddr, &vcpu->arch.vfp_guest.fpregs[vfpid],
 				   id);
 	}
@@ -856,22 +900,22 @@ static int vfp_get_reg(const struct kvm_vcpu *vcpu, u64 id, void __user *uaddr)

 	switch (vfpid) {
 	case KVM_REG_ARM_VFP_FPEXC:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpexc, id);
+		return reg_to_user32(uaddr, &vcpu->arch.vfp_guest.fpexc, id);
 	case KVM_REG_ARM_VFP_FPSCR:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpscr, id);
+		return reg_to_user32(uaddr, &vcpu->arch.vfp_guest.fpscr, id);
 	case KVM_REG_ARM_VFP_FPINST:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpinst, id);
+		return reg_to_user32(uaddr, &vcpu->arch.vfp_guest.fpinst, id);
 	case KVM_REG_ARM_VFP_FPINST2:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpinst2, id);
+		return reg_to_user32(uaddr, &vcpu->arch.vfp_guest.fpinst2, id);
 	case KVM_REG_ARM_VFP_MVFR0:
 		val = fmrx(MVFR0);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user32(uaddr, &val, id);
 	case KVM_REG_ARM_VFP_MVFR1:
 		val = fmrx(MVFR1);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user32(uaddr, &val, id);
 	case KVM_REG_ARM_VFP_FPSID:
 		val = fmrx(FPSID);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user32(uaddr, &val, id);
 	default:
 		return -ENOENT;
 	}
@@ -890,8 +934,8 @@ static int vfp_set_reg(struct kvm_vcpu *vcpu, u64 id, const void __user *uaddr)
 	if (vfpid < num_fp_regs()) {
 		if (KVM_REG_SIZE(id) != 8)
 			return -ENOENT;
-		return reg_from_user(&vcpu->arch.vfp_guest.fpregs[vfpid],
-				     uaddr, id);
+		return reg_from_user64(&vcpu->arch.vfp_guest.fpregs[vfpid],
+				       uaddr, id);
 	}

 	/* FP control registers are all 32 bit. */
@@ -900,28 +944,28 @@ static int vfp_set_reg(struct kvm_vcpu *vcpu, u64 id, const void __user *uaddr)

 	switch (vfpid) {
 	case KVM_REG_ARM_VFP_FPEXC:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpexc, uaddr, id);
+		return reg_from_user32(&vcpu->arch.vfp_guest.fpexc, uaddr, id);
 	case KVM_REG_ARM_VFP_FPSCR:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpscr, uaddr, id);
+		return reg_from_user32(&vcpu->arch.vfp_guest.fpscr, uaddr, id);
 	case KVM_REG_ARM_VFP_FPINST:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpinst, uaddr, id);
+		return reg_from_user32(&vcpu->arch.vfp_guest.fpinst, uaddr, id);
 	case KVM_REG_ARM_VFP_FPINST2:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpinst2, uaddr, id);
+		return reg_from_user32(&vcpu->arch.vfp_guest.fpinst2, uaddr, id);
 	/* These are invariant. */
 	case KVM_REG_ARM_VFP_MVFR0:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user32(&val, uaddr, id))
 			return -EFAULT;
 		if (val != fmrx(MVFR0))
 			return -EINVAL;
 		return 0;
 	case KVM_REG_ARM_VFP_MVFR1:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user32(&val, uaddr, id))
 			return -EFAULT;
 		if (val != fmrx(MVFR1))
 			return -EINVAL;
 		return 0;
 	case KVM_REG_ARM_VFP_FPSID:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user32(&val, uaddr, id))
 			return -EFAULT;
 		if (val != fmrx(FPSID))
 			return -EINVAL;
@@ -968,7 +1012,7 @@ int kvm_arm_coproc_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		return get_invariant_cp15(reg->id, uaddr);

 	/* Note: copies two regs if size is 64 bit. */
-	return reg_to_user(uaddr, &vcpu->arch.cp15[r->reg], reg->id);
+	return reg_to_user32(uaddr, &vcpu->arch.cp15[r->reg], reg->id);
}
 int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)

@@ -987,7 +1031,7 @@ int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		return set_invariant_cp15(reg->id, uaddr);

 	/* Note: copies two regs if size is 64 bit */
-	return reg_from_user(&vcpu->arch.cp15[r->reg], uaddr, reg->id);
+	return reg_from_user32(&vcpu->arch.cp15[r->reg], uaddr, reg->id);
}
static unsigned int num_demux_regs(void)
1.8.1.4
Thanks,
Christoffer
On Tue, May 13, 2014 at 01:11:34PM -0700, Victor Kamensky wrote:
Hi Christoffer,
[...]
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index 78c0885..64b2b94 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -634,17 +634,61 @@ static struct coproc_reg invariant_cp15[] = {
 	{ CRn( 0), CRm( 0), Op1( 1), Op2( 7), is32, NULL, get_AIDR },
 };
-static int reg_from_user(void *val, const void __user *uaddr, u64 id)
+static int reg_from_user64(u64 *val, const void __user *uaddr, u64 id)
+{
+	unsigned long regsize = KVM_REG_SIZE(id);
+	union {
+		u32 word;
+		u64 dword;
+	} tmp = {0};
+
+	if (copy_from_user(&tmp, uaddr, regsize) != 0)
+		return -EFAULT;
+
+	switch (regsize) {
+	case 4:
+		*val = tmp.word;
+		break;
+	case 8:
+		*val = tmp.dword;
+		break;
+	}
+
+	return 0;
+}
Instead of allowing this 'packing 32-bit values in 64-bit values and packing two 32-bit values in 64-bit values' kernel-implementation detail to be passed into these user-space ABI catering functions, I really think you want to make reg_from_user64 deal only with 64-bit registers, that is, BUG_ON(KVM_REG_SIZE(id) != 8).
An example of the kernel storing a value in a u64 while the register is 32-bit is set/get_invariant_cp15. Note this function passes the address of the struct coproc_reg val field into the reg_to_user_xxx function. Since struct coproc_reg is a generic structure used to describe both 32-bit and 64-bit coproc registers, the type of val cannot be changed to u32. One could try to describe val as a union of u32 and u64, but my concern is that it would create a bunch of 'if (KVM_REG_SIZE(id) == 4) { ... } else { ... }' pieces all over the place. I think it is better to have a function that can read/write a 32-bit register from/to a u64 value in the kernel.
I know what the kernel does internally. I just think these functions become conceptually very complicated to cater to one special case. You could deal with this in the caller. Consider get_invariant_cp15:
static int get_invariant_cp15(u64 id, void __user *uaddr)
{
	struct coproc_params params;
	const struct coproc_reg *r;
	int ret = -ENOENT;

	if (!index_to_params(id, &params))
		return -ENOENT;

	r = find_reg(&params, invariant_cp15, ARRAY_SIZE(invariant_cp15));
	if (!r)
		return -ENOENT;

	if (KVM_REG_SIZE(id) == 4) {
		u32 val = r->val;

		ret = reg_to_user32(uaddr, &val, id);
	} else if (KVM_REG_SIZE(id) == 8) {
		ret = reg_to_user64(uaddr, &r->val, id);
	}
	return ret;
}
Did you actually try this approach and see how the code looks?
-Christoffer
According to recent clarifications of the mmio.data array meaning, the mmio.data array should hold bytes as they would appear in memory. The vgic is a little endian device, and in the BE image case the kernel side that emulates the vgic holds data in BE form. So we need to byteswap cpu<->le32 vgic registers when we read/write them from mmio.data[].

The change has no effect in the LE case because the cpu already runs in le32.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
---
 virt/kvm/arm/vgic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 685fc72..7e11458 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -236,12 +236,12 @@ static void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq)

 static u32 mmio_data_read(struct kvm_exit_mmio *mmio, u32 mask)
 {
-	return *((u32 *)mmio->data) & mask;
+	return le32_to_cpu(*((u32 *)mmio->data)) & mask;
 }

 static void mmio_data_write(struct kvm_exit_mmio *mmio, u32 mask, u32 value)
 {
-	*((u32 *)mmio->data) = value & mask;
+	*((u32 *)mmio->data) = cpu_to_le32(value) & mask;
 }
/**
On Tue, Feb 11, 2014 at 09:41:32PM -0800, Victor Kamensky wrote:
According to recent clarifications of the mmio.data array meaning, the mmio.data array should hold bytes as they would appear in memory. The vgic is a little endian device, and in the BE image case the kernel side that emulates the vgic holds data in BE form. So we need to byteswap cpu<->le32 vgic registers when we read/write them from mmio.data[].

The change has no effect in the LE case because the cpu already runs in le32.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
 virt/kvm/arm/vgic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 685fc72..7e11458 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -236,12 +236,12 @@ static void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq)

 static u32 mmio_data_read(struct kvm_exit_mmio *mmio, u32 mask)
 {
-	return *((u32 *)mmio->data) & mask;
+	return le32_to_cpu(*((u32 *)mmio->data)) & mask;
 }

 static void mmio_data_write(struct kvm_exit_mmio *mmio, u32 mask, u32 value)
 {
-	*((u32 *)mmio->data) = value & mask;
+	*((u32 *)mmio->data) = cpu_to_le32(value) & mask;
 }

 /**
--
1.8.1.4
I hate the fact that we have endianness handling code inside the vgic emulation; I would strongly prefer that the interface to the vgic code be a typed union. That being said, I haven't looked at how the changes to the code would look to accomplish that. You didn't reply to my same comment last time around, but did you have a look?
The code here, with the ABI specification does look correct, so assuming that ABI specification gets merged:
Reviewed-by: Christoffer Dall christoffer.dall@linaro.org
In case the status register E bit is not set (LE mode) and the host runs in BE mode, we need to byteswap the data so the read/write is emulated correctly.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
---
 arch/arm/include/asm/kvm_emulate.h | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)
diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
index 0fa90c9..69b7469 100644
--- a/arch/arm/include/asm/kvm_emulate.h
+++ b/arch/arm/include/asm/kvm_emulate.h
@@ -185,9 +185,16 @@ static inline unsigned long vcpu_data_guest_to_host(struct kvm_vcpu *vcpu,
 		default:
 			return be32_to_cpu(data);
 		}
+	} else {
+		switch (len) {
+		case 1:
+			return data & 0xff;
+		case 2:
+			return le16_to_cpu(data & 0xffff);
+		default:
+			return le32_to_cpu(data);
+		}
 	}
-
-	return data;	/* Leave LE untouched */
 }

 static inline unsigned long vcpu_data_host_to_guest(struct kvm_vcpu *vcpu,
@@ -203,9 +210,16 @@ static inline unsigned long vcpu_data_host_to_guest(struct kvm_vcpu *vcpu,
 		default:
 			return cpu_to_be32(data);
 		}
+	} else {
+		switch (len) {
+		case 1:
+			return data & 0xff;
+		case 2:
+			return cpu_to_le16(data & 0xffff);
+		default:
+			return cpu_to_le32(data);
+		}
 	}
-
-	return data;	/* Leave LE untouched */
 }
#endif /* __ARM_KVM_EMULATE_H__ */
On Tue, Feb 11, 2014 at 09:41:33PM -0800, Victor Kamensky wrote:
In case the status register E bit is not set (LE mode) and the host runs in BE mode, we need to byteswap the data so the read/write is emulated correctly.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
 arch/arm/include/asm/kvm_emulate.h | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)
diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
index 0fa90c9..69b7469 100644
--- a/arch/arm/include/asm/kvm_emulate.h
+++ b/arch/arm/include/asm/kvm_emulate.h
@@ -185,9 +185,16 @@ static inline unsigned long vcpu_data_guest_to_host(struct kvm_vcpu *vcpu,
 		default:
 			return be32_to_cpu(data);
 		}
+	} else {
+		switch (len) {
+		case 1:
+			return data & 0xff;
+		case 2:
+			return le16_to_cpu(data & 0xffff);
+		default:
+			return le32_to_cpu(data);
+		}
 	}
-
-	return data;	/* Leave LE untouched */
 }

 static inline unsigned long vcpu_data_host_to_guest(struct kvm_vcpu *vcpu,
@@ -203,9 +210,16 @@ static inline unsigned long vcpu_data_host_to_guest(struct kvm_vcpu *vcpu,
 		default:
 			return cpu_to_be32(data);
 		}
+	} else {
+		switch (len) {
+		case 1:
+			return data & 0xff;
+		case 2:
+			return cpu_to_le16(data & 0xffff);
+		default:
+			return cpu_to_le32(data);
+		}
 	}
-
-	return data;	/* Leave LE untouched */
}
#endif /* __ARM_KVM_EMULATE_H__ */
1.8.1.4
Somehow in my head the way I think about this is
if (guest_be != host_be) {
	return swab(data);
}
but it's probably not more clear in terms of the actual code...
Anyway, given that the ABI clarification text gets merged:
Reviewed-by: Christoffer Dall christoffer.dall@linaro.org