Hi,
Here is an updated patch series addressing KVM host support on big-endian ARM V7/V8.
Changes since the previous versions [1] and [2]:
x) joined both the v7 [1] and v8 [2] series into one
x) addressed nearly all of Christoffer's comments on the previous patch series
x) fixed a few issues in 'ARM: KVM: handle 64bit values passed to mrrc or from mcrr instructions in BE case' that were discovered after moving to the latest v3.15-rc5 kernel
x) fixed an issue with 32-bit guest support on a V8 BE KVM host
All patches except 'ARM: KVM: one_reg coproc set and get BE fixes' are, as far as I can see, the best possible proposals. The 'ARM: KVM: one_reg coproc set and get BE fixes' changes still need more discussion. The patch included with this series covers one possible solution; I'll post notes about the other approach, previously suggested by Alex Graf, separately.
The patches were tested on top of v3.15-rc5 on TC2 (V7) and fastmodels (V8). I tested all possible combinations of KVM host and guest (V7/V8, BE/LE, 64-bit/32-bit).
Thanks, Victor
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-February/231881.h...
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-February/231889.h...
Victor Kamensky (14):
  ARM: KVM: switch hypervisor into BE mode in case of BE host
  ARM: KVM: fix vgic V7 assembler code to work in BE image
  ARM: KVM: handle 64bit values passed to mrrc or from mcrr instructions in BE case
  ARM: KVM: __kvm_vcpu_run function return result fix in BE case
  ARM: KVM: vgic mmio should hold data as LE bytes array in BE case
  ARM: KVM: MMIO support BE host running LE code
  ARM: KVM: one_reg coproc set and get BE fixes
  ARM: KVM: enable KVM in Kconfig on big-endian systems
  ARM64: KVM: MMIO support BE host running LE code
  ARM64: KVM: store kvm_vcpu_fault_info esr_el2 as word
  ARM64: KVM: fix vgic_bitmap_get_reg function for BE 64bit case
  ARM64: KVM: vgic_elrsr and vgic_eisr need to be byteswapped in BE case
  ARM64: KVM: set and get of sys registers in BE case
  ARM64: KVM: fix big endian issue in access_vm_reg for 32bit guest
 arch/arm/include/asm/kvm_asm.h       |  18 ++++++
 arch/arm/include/asm/kvm_emulate.h   |  22 +++++--
 arch/arm/kvm/Kconfig                 |   2 +-
 arch/arm/kvm/coproc.c                | 118 +++++++++++++++++++++++++++--------
 arch/arm/kvm/init.S                  |   7 ++-
 arch/arm/kvm/interrupts.S            |   9 ++-
 arch/arm/kvm/interrupts_head.S       |  20 +++++-
 arch/arm64/include/asm/kvm_emulate.h |  22 +++++++
 arch/arm64/kvm/hyp.S                 |   9 ++-
 arch/arm64/kvm/sys_regs.c            |  33 +++++++---
 virt/kvm/arm/vgic.c                  |  28 +++++++-
 11 files changed, 237 insertions(+), 51 deletions(-)
Switch the hypervisor to run in BE mode if the image is compiled with CONFIG_CPU_BIG_ENDIAN.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 arch/arm/kvm/init.S | 3 +++
 1 file changed, 3 insertions(+)
diff --git a/arch/arm/kvm/init.S b/arch/arm/kvm/init.S
index 1b9844d..74f0718 100644
--- a/arch/arm/kvm/init.S
+++ b/arch/arm/kvm/init.S
@@ -22,6 +22,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_arm.h>
 #include <asm/kvm_mmu.h>
+#include <asm/assembler.h>

 /********************************************************************
  * Hypervisor initialization
@@ -70,6 +71,8 @@ __do_hyp_init:
	cmp	r0, #0			@ We have a SP?
	bne	phase2			@ Yes, second stage init

+ARM_BE8(setend	be) @ Switch to Big Endian mode if needed
+
	@ Set the HTTBR to point to the hypervisor PGD pointer passed
	mcrr	p15, 4, r2, r3, c2
On 13/05/14 17:13, Victor Kamensky wrote:
Switch hypervisor to run in BE mode if image is compiled with CONFIG_CPU_BIG_ENDIAN.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org Reviewed-by: Christoffer Dall christoffer.dall@linaro.org
arch/arm/kvm/init.S | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/arm/kvm/init.S b/arch/arm/kvm/init.S index 1b9844d..74f0718 100644 --- a/arch/arm/kvm/init.S +++ b/arch/arm/kvm/init.S @@ -22,6 +22,7 @@ #include <asm/kvm_asm.h> #include <asm/kvm_arm.h> #include <asm/kvm_mmu.h> +#include <asm/assembler.h> /********************************************************************
- Hypervisor initialization
@@ -70,6 +71,8 @@ __do_hyp_init: cmp r0, #0 @ We have a SP? bne phase2 @ Yes, second stage init +ARM_BE8(setend be) @ Switch to Big Endian mode if needed
- @ Set the HTTBR to point to the hypervisor PGD pointer passed mcrr p15, 4, r2, r3, c2
Acked-by: Marc Zyngier marc.zyngier@arm.com
M.
The vgic h/w registers are little endian; when assembler code in a BE kernel reads from or writes to them, it needs to byteswap the values after reads and before writes. The byteswap code uses the ARM_BE8 wrapper so that the swap is only added when CONFIG_CPU_BIG_ENDIAN is configured.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 arch/arm/kvm/interrupts_head.S | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index 76af9302..e627858 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -1,4 +1,5 @@
 #include <linux/irqchip/arm-gic.h>
+#include <asm/assembler.h>

 #define VCPU_USR_REG(_reg_nr)	(VCPU_USR_REGS + (_reg_nr * 4))
 #define VCPU_USR_SP		(VCPU_USR_REG(13))
@@ -420,6 +421,14 @@ vcpu	.req	r0	@ vcpu pointer always in r0
	ldr	r8, [r2, #GICH_ELRSR0]
	ldr	r9, [r2, #GICH_ELRSR1]
	ldr	r10, [r2, #GICH_APR]
+ARM_BE8(rev	r3, r3	)
+ARM_BE8(rev	r4, r4	)
+ARM_BE8(rev	r5, r5	)
+ARM_BE8(rev	r6, r6	)
+ARM_BE8(rev	r7, r7	)
+ARM_BE8(rev	r8, r8	)
+ARM_BE8(rev	r9, r9	)
+ARM_BE8(rev	r10, r10	)

	str	r3, [r11, #VGIC_CPU_HCR]
	str	r4, [r11, #VGIC_CPU_VMCR]
@@ -439,6 +448,7 @@ vcpu	.req	r0	@ vcpu pointer always in r0
	add	r3, r11, #VGIC_CPU_LR
	ldr	r4, [r11, #VGIC_CPU_NR_LR]
1:	ldr	r6, [r2], #4
+ARM_BE8(rev	r6, r6	)
	str	r6, [r3], #4
	subs	r4, r4, #1
	bne	1b
@@ -466,6 +476,9 @@ vcpu	.req	r0	@ vcpu pointer always in r0
	ldr	r3, [r11, #VGIC_CPU_HCR]
	ldr	r4, [r11, #VGIC_CPU_VMCR]
	ldr	r8, [r11, #VGIC_CPU_APR]
+ARM_BE8(rev	r3, r3	)
+ARM_BE8(rev	r4, r4	)
+ARM_BE8(rev	r8, r8	)

	str	r3, [r2, #GICH_HCR]
	str	r4, [r2, #GICH_VMCR]
@@ -476,6 +489,7 @@ vcpu	.req	r0	@ vcpu pointer always in r0
	add	r3, r11, #VGIC_CPU_LR
	ldr	r4, [r11, #VGIC_CPU_NR_LR]
1:	ldr	r6, [r3], #4
+ARM_BE8(rev	r6, r6	)
	str	r6, [r2], #4
	subs	r4, r4, #1
	bne	1b
On 13/05/14 17:13, Victor Kamensky wrote:
The vgic h/w registers are little endian; when BE asm code reads/writes from/to them, it needs to do byteswap after/before. Byteswap code uses ARM_BE8 wrapper to add swap only if CONFIG_CPU_BIG_ENDIAN is configured.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org Reviewed-by: Christoffer Dall christoffer.dall@linaro.org
arch/arm/kvm/interrupts_head.S | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S index 76af9302..e627858 100644 --- a/arch/arm/kvm/interrupts_head.S +++ b/arch/arm/kvm/interrupts_head.S @@ -1,4 +1,5 @@ #include <linux/irqchip/arm-gic.h> +#include <asm/assembler.h> #define VCPU_USR_REG(_reg_nr) (VCPU_USR_REGS + (_reg_nr * 4)) #define VCPU_USR_SP (VCPU_USR_REG(13)) @@ -420,6 +421,14 @@ vcpu .req r0 @ vcpu pointer always in r0 ldr r8, [r2, #GICH_ELRSR0] ldr r9, [r2, #GICH_ELRSR1] ldr r10, [r2, #GICH_APR] +ARM_BE8(rev r3, r3 ) +ARM_BE8(rev r4, r4 ) +ARM_BE8(rev r5, r5 ) +ARM_BE8(rev r6, r6 ) +ARM_BE8(rev r7, r7 ) +ARM_BE8(rev r8, r8 ) +ARM_BE8(rev r9, r9 ) +ARM_BE8(rev r10, r10 ) str r3, [r11, #VGIC_CPU_HCR] str r4, [r11, #VGIC_CPU_VMCR] @@ -439,6 +448,7 @@ vcpu .req r0 @ vcpu pointer always in r0 add r3, r11, #VGIC_CPU_LR ldr r4, [r11, #VGIC_CPU_NR_LR] 1: ldr r6, [r2], #4 +ARM_BE8(rev r6, r6 ) str r6, [r3], #4 subs r4, r4, #1 bne 1b @@ -466,6 +476,9 @@ vcpu .req r0 @ vcpu pointer always in r0 ldr r3, [r11, #VGIC_CPU_HCR] ldr r4, [r11, #VGIC_CPU_VMCR] ldr r8, [r11, #VGIC_CPU_APR] +ARM_BE8(rev r3, r3 ) +ARM_BE8(rev r4, r4 ) +ARM_BE8(rev r8, r8 ) str r3, [r2, #GICH_HCR] str r4, [r2, #GICH_VMCR] @@ -476,6 +489,7 @@ vcpu .req r0 @ vcpu pointer always in r0 add r3, r11, #VGIC_CPU_LR ldr r4, [r11, #VGIC_CPU_NR_LR] 1: ldr r6, [r3], #4 +ARM_BE8(rev r6, r6 ) str r6, [r2], #4 subs r4, r4, #1 bne 1b
Acked-by: Marc Zyngier marc.zyngier@arm.com
M.
In some cases the mcrr and mrrc instructions, in combination with the ldrd and strd instructions, need to deal with a 64-bit value in memory. The ldrd and strd instructions already handle endianness within word (register) boundaries, but for the whole 64-bit value to be represented correctly, the rr_lo_hi macro is introduced; it swaps the register positions when the mcrr and mrrc instructions are used, which has the effect of swapping the two words.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
---
 arch/arm/include/asm/kvm_asm.h | 18 ++++++++++++++++++
 arch/arm/kvm/init.S            |  4 ++--
 arch/arm/kvm/interrupts.S      |  4 ++--
 arch/arm/kvm/interrupts_head.S |  6 +++---
 4 files changed, 25 insertions(+), 7 deletions(-)
diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index 53b3c4a..3a67bec 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -61,6 +61,24 @@
 #define ARM_EXCEPTION_FIQ	6
 #define ARM_EXCEPTION_HVC	7

+/*
+ * The rr_lo_hi macro swaps a pair of registers depending on
+ * current endianness. It is used in conjunction with ldrd and strd
+ * instructions that load/store a 64-bit value from/to memory to/from
+ * a pair of registers which are used with the mrrc and mcrr instructions.
+ * If used with the ldrd/strd instructions, the a1 parameter is the first
+ * source/destination register and the a2 parameter is the second
+ * source/destination register. Note that the ldrd/strd instructions
+ * already swap the bytes within the words correctly according to the
+ * endianness setting, but the order of the registers need to be effectively
+ * swapped when used with the mrrc/mcrr instructions.
+ */
+#ifdef CONFIG_CPU_ENDIAN_BE8
+#define rr_lo_hi(a1, a2) a2, a1
+#else
+#define rr_lo_hi(a1, a2) a1, a2
+#endif
+
 #ifndef __ASSEMBLY__
 struct kvm;
 struct kvm_vcpu;
diff --git a/arch/arm/kvm/init.S b/arch/arm/kvm/init.S
index 74f0718..2d10b2d 100644
--- a/arch/arm/kvm/init.S
+++ b/arch/arm/kvm/init.S
@@ -74,7 +74,7 @@ __do_hyp_init:
 ARM_BE8(setend	be) @ Switch to Big Endian mode if needed

	@ Set the HTTBR to point to the hypervisor PGD pointer passed
-	mcrr	p15, 4, r2, r3, c2
+	mcrr	p15, 4, rr_lo_hi(r2, r3), c2

	@ Set the HTCR and VTCR to the same shareability and cacheability
	@ settings as the non-secure TTBCR and with T0SZ == 0.
@@ -140,7 +140,7 @@ phase2:
	mov	pc, r0

target:	@ We're now in the trampoline code, switch page tables
-	mcrr	p15, 4, r2, r3, c2
+	mcrr	p15, 4, rr_lo_hi(r2, r3), c2
	isb

	@ Invalidate the old TLBs
diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
index 0d68d40..24d4e65 100644
--- a/arch/arm/kvm/interrupts.S
+++ b/arch/arm/kvm/interrupts.S
@@ -52,7 +52,7 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
	dsb	ishst
	add	r0, r0, #KVM_VTTBR
	ldrd	r2, r3, [r0]
-	mcrr	p15, 6, r2, r3, c2	@ Write VTTBR
+	mcrr	p15, 6, rr_lo_hi(r2, r3), c2	@ Write VTTBR
	isb
	mcr	p15, 0, r0, c8, c3, 0	@ TLBIALLIS (rt ignored)
	dsb	ish
@@ -135,7 +135,7 @@ ENTRY(__kvm_vcpu_run)
	ldr	r1, [vcpu, #VCPU_KVM]
	add	r1, r1, #KVM_VTTBR
	ldrd	r2, r3, [r1]
-	mcrr	p15, 6, r2, r3, c2	@ Write VTTBR
+	mcrr	p15, 6, rr_lo_hi(r2, r3), c2	@ Write VTTBR

	@ We're all done, just restore the GPRs and go to the guest
	restore_guest_regs
diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S
index e627858..104977f 100644
--- a/arch/arm/kvm/interrupts_head.S
+++ b/arch/arm/kvm/interrupts_head.S
@@ -520,7 +520,7 @@ ARM_BE8(rev	r6, r6	)
	mcr	p15, 0, r2, c14, c3, 1	@ CNTV_CTL
	isb

-	mrrc	p15, 3, r2, r3, c14	@ CNTV_CVAL
+	mrrc	p15, 3, rr_lo_hi(r2, r3), c14	@ CNTV_CVAL
	ldr	r4, =VCPU_TIMER_CNTV_CVAL
	add	r5, vcpu, r4
	strd	r2, r3, [r5]
@@ -560,12 +560,12 @@ ARM_BE8(rev	r6, r6	)

	ldr	r2, [r4, #KVM_TIMER_CNTVOFF]
	ldr	r3, [r4, #(KVM_TIMER_CNTVOFF + 4)]
-	mcrr	p15, 4, r2, r3, c14	@ CNTVOFF
+	mcrr	p15, 4, rr_lo_hi(r2, r3), c14	@ CNTVOFF

	ldr	r4, =VCPU_TIMER_CNTV_CVAL
	add	r5, vcpu, r4
	ldrd	r2, r3, [r5]
-	mcrr	p15, 3, r2, r3, c14	@ CNTV_CVAL
+	mcrr	p15, 3, rr_lo_hi(r2, r3), c14	@ CNTV_CVAL
	isb

	ldr	r2, [vcpu, #VCPU_TIMER_CNTV_CTL]
On Tue, May 13, 2014 at 09:13:55AM -0700, Victor Kamensky wrote:
In some cases the mcrr and mrrc instructions in combination with the ldrd and strd instructions need to deal with 64bit value in memory. The ldrd and strd instructions already handle endianness within word (register) boundaries but to get effect of the whole 64bit value represented correctly, rr_lo_hi macro is introduced and is used to swap registers positions when the mcrr and mrrc instructions are used. That has the effect of swapping two words.
So in this version you decided to keep the ordering of the TTBRX and TTBRX_high bits (high being most significant). I assume this is because of the trapping of VM registers introduced by Marc and to maintain access_vm_reg()?
Any other changes since the last version?
If none, then for the content of this patch:
Reviewed-by: Christoffer Dall christoffer.dall@linaro.org
On 26 May 2014 08:28, Christoffer Dall christoffer.dall@linaro.org wrote:
On Tue, May 13, 2014 at 09:13:55AM -0700, Victor Kamensky wrote:
In some cases the mcrr and mrrc instructions in combination with the ldrd and strd instructions need to deal with 64bit value in memory. The ldrd and strd instructions already handle endianness within word (register) boundaries but to get effect of the whole 64bit value represented correctly, rr_lo_hi macro is introduced and is used to swap registers positions when the mcrr and mrrc instructions are used. That has the effect of swapping two words.
So in this version you decided to keep the ordering of the TTBRX and TTBRX_high bits (high being most significant). I assume this is because of the trapping of VM registers introduced by mark and to maintain access_vm_reg()?
Yes, that is correct.
Any other changes since the last version?
No
If none, then for the content of this patch:
Reviewed-by: Christoffer Dall christoffer.dall@linaro.org
Thanks, Victor
On 13/05/14 17:13, Victor Kamensky wrote:
In some cases the mcrr and mrrc instructions in combination with the ldrd and strd instructions need to deal with 64bit value in memory. The ldrd and strd instructions already handle endianness within word (register) boundaries but to get effect of the whole 64bit value represented correctly, rr_lo_hi macro is introduced and is used to swap registers positions when the mcrr and mrrc instructions are used. That has the effect of swapping two words.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
arch/arm/include/asm/kvm_asm.h | 18 ++++++++++++++++++ arch/arm/kvm/init.S | 4 ++-- arch/arm/kvm/interrupts.S | 4 ++-- arch/arm/kvm/interrupts_head.S | 6 +++--- 4 files changed, 25 insertions(+), 7 deletions(-)
diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h index 53b3c4a..3a67bec 100644 --- a/arch/arm/include/asm/kvm_asm.h +++ b/arch/arm/include/asm/kvm_asm.h @@ -61,6 +61,24 @@ #define ARM_EXCEPTION_FIQ 6 #define ARM_EXCEPTION_HVC 7 +/*
- The rr_lo_hi macro swaps a pair of registers depending on
- current endianness. It is used in conjunction with ldrd and strd
- instructions that load/store a 64-bit value from/to memory to/from
- a pair of registers which are used with the mrrc and mcrr instructions.
- If used with the ldrd/strd instructions, the a1 parameter is the first
- source/destination register and the a2 parameter is the second
- source/destination register. Note that the ldrd/strd instructions
- already swap the bytes within the words correctly according to the
- endianness setting, but the order of the registers need to be effectively
- swapped when used with the mrrc/mcrr instructions.
- */
+#ifdef CONFIG_CPU_ENDIAN_BE8 +#define rr_lo_hi(a1, a2) a2, a1 +#else +#define rr_lo_hi(a1, a2) a1, a2 +#endif
#ifndef __ASSEMBLY__ struct kvm; struct kvm_vcpu; diff --git a/arch/arm/kvm/init.S b/arch/arm/kvm/init.S index 74f0718..2d10b2d 100644 --- a/arch/arm/kvm/init.S +++ b/arch/arm/kvm/init.S @@ -74,7 +74,7 @@ __do_hyp_init: ARM_BE8(setend be) @ Switch to Big Endian mode if needed @ Set the HTTBR to point to the hypervisor PGD pointer passed
- mcrr p15, 4, r2, r3, c2
- mcrr p15, 4, rr_lo_hi(r2, r3), c2
@ Set the HTCR and VTCR to the same shareability and cacheability @ settings as the non-secure TTBCR and with T0SZ == 0. @@ -140,7 +140,7 @@ phase2: mov pc, r0 target: @ We're now in the trampoline code, switch page tables
- mcrr p15, 4, r2, r3, c2
- mcrr p15, 4, rr_lo_hi(r2, r3), c2 isb
@ Invalidate the old TLBs diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S index 0d68d40..24d4e65 100644 --- a/arch/arm/kvm/interrupts.S +++ b/arch/arm/kvm/interrupts.S @@ -52,7 +52,7 @@ ENTRY(__kvm_tlb_flush_vmid_ipa) dsb ishst add r0, r0, #KVM_VTTBR ldrd r2, r3, [r0]
- mcrr p15, 6, r2, r3, c2 @ Write VTTBR
- mcrr p15, 6, rr_lo_hi(r2, r3), c2 @ Write VTTBR isb mcr p15, 0, r0, c8, c3, 0 @ TLBIALLIS (rt ignored) dsb ish
@@ -135,7 +135,7 @@ ENTRY(__kvm_vcpu_run) ldr r1, [vcpu, #VCPU_KVM] add r1, r1, #KVM_VTTBR ldrd r2, r3, [r1]
- mcrr p15, 6, r2, r3, c2 @ Write VTTBR
- mcrr p15, 6, rr_lo_hi(r2, r3), c2 @ Write VTTBR
@ We're all done, just restore the GPRs and go to the guest restore_guest_regs diff --git a/arch/arm/kvm/interrupts_head.S b/arch/arm/kvm/interrupts_head.S index e627858..104977f 100644 --- a/arch/arm/kvm/interrupts_head.S +++ b/arch/arm/kvm/interrupts_head.S @@ -520,7 +520,7 @@ ARM_BE8(rev r6, r6 ) mcr p15, 0, r2, c14, c3, 1 @ CNTV_CTL isb
- mrrc p15, 3, r2, r3, c14 @ CNTV_CVAL
- mrrc p15, 3, rr_lo_hi(r2, r3), c14 @ CNTV_CVAL ldr r4, =VCPU_TIMER_CNTV_CVAL add r5, vcpu, r4 strd r2, r3, [r5]
@@ -560,12 +560,12 @@ ARM_BE8(rev r6, r6 ) ldr r2, [r4, #KVM_TIMER_CNTVOFF] ldr r3, [r4, #(KVM_TIMER_CNTVOFF + 4)]
- mcrr p15, 4, r2, r3, c14 @ CNTVOFF
- mcrr p15, 4, rr_lo_hi(r2, r3), c14 @ CNTVOFF
ldr r4, =VCPU_TIMER_CNTV_CVAL add r5, vcpu, r4 ldrd r2, r3, [r5]
- mcrr p15, 3, r2, r3, c14 @ CNTV_CVAL
- mcrr p15, 3, rr_lo_hi(r2, r3), c14 @ CNTV_CVAL isb
ldr r2, [vcpu, #VCPU_TIMER_CNTV_CTL]
As much as I hate the macro, I can't find a better solution. As such:
Acked-by: Marc Zyngier marc.zyngier@arm.com
M.
The __kvm_vcpu_run function returns a 64-bit result in two registers, which has to be adjusted for the BE case.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 arch/arm/kvm/interrupts.S | 5 +++++
 1 file changed, 5 insertions(+)
diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
index 24d4e65..7dfe9e4 100644
--- a/arch/arm/kvm/interrupts.S
+++ b/arch/arm/kvm/interrupts.S
@@ -199,8 +199,13 @@ after_vfp_restore:

	restore_host_regs
	clrex				@ Clear exclusive monitor
+#ifndef __ARMEB__
	mov	r0, r1			@ Return the return code
	mov	r1, #0			@ Clear upper bits in return value
+#else
+	@ r1 already has return code
+	mov	r0, #0			@ Clear upper bits in return value
+#endif /* __ARMEB__ */
	bx	lr			@ return to IOCTL
/********************************************************************
On 13/05/14 17:13, Victor Kamensky wrote:
The __kvm_vcpu_run function returns a 64-bit result in two registers, which has to be adjusted for BE case.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org Acked-by: Christoffer Dall christoffer.dall@linaro.org
arch/arm/kvm/interrupts.S | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S index 24d4e65..7dfe9e4 100644 --- a/arch/arm/kvm/interrupts.S +++ b/arch/arm/kvm/interrupts.S @@ -199,8 +199,13 @@ after_vfp_restore: restore_host_regs clrex @ Clear exclusive monitor +#ifndef __ARMEB__ mov r0, r1 @ Return the return code mov r1, #0 @ Clear upper bits in return value +#else
- @ r1 already has return code
- mov r0, #0 @ Clear upper bits in return value
+#endif /* __ARMEB__ */
Why using __ARMEB__ while the rest of the series is using CONFIG_CPU_ENDIAN_BE8?
bx lr @ return to IOCTL /********************************************************************
Aside from the above nit:
Acked-by: Marc Zyngier marc.zyngier@arm.com
M.
On 27 May 2014 08:02, Marc Zyngier marc.zyngier@arm.com wrote:
On 13/05/14 17:13, Victor Kamensky wrote:
The __kvm_vcpu_run function returns a 64-bit result in two registers, which has to be adjusted for BE case.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org Acked-by: Christoffer Dall christoffer.dall@linaro.org
arch/arm/kvm/interrupts.S | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S index 24d4e65..7dfe9e4 100644 --- a/arch/arm/kvm/interrupts.S +++ b/arch/arm/kvm/interrupts.S @@ -199,8 +199,13 @@ after_vfp_restore:
restore_host_regs clrex @ Clear exclusive monitor
+#ifndef __ARMEB__ mov r0, r1 @ Return the return code mov r1, #0 @ Clear upper bits in return value +#else
@ r1 already has return code
mov r0, #0 @ Clear upper bits in return value
+#endif /* __ARMEB__ */
Why using __ARMEB__ while the rest of the series is using CONFIG_CPU_ENDIAN_BE8?
It felt like a big-endian ABI related issue, applicable to both BE8 and BE32. But ARM KVM itself is applicable only to V7 BE8, so yes, it would not matter. I'll change it to CONFIG_CPU_ENDIAN_BE8 for consistency.
Thanks, Victor
bx lr @ return to IOCTL
/********************************************************************
Aside from the above nit:
Acked-by: Marc Zyngier marc.zyngier@arm.com
M.
-- Jazz is not dead. It just smells funny...
According to recent clarifications of the mmio.data array's meaning, the mmio.data array should hold bytes as they would appear in memory. The vgic is a little-endian device. In the case of a BE image, the kernel side that emulates the vgic holds data in BE form. So we need to byteswap (cpu<->le32) the vgic registers when we read/write them from/to mmio.data[].
The change has no effect in the LE case because the CPU already runs in le32.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 virt/kvm/arm/vgic.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 56ff9be..529c336 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -241,12 +241,12 @@ static void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq)

 static u32 mmio_data_read(struct kvm_exit_mmio *mmio, u32 mask)
 {
-	return *((u32 *)mmio->data) & mask;
+	return le32_to_cpu(*((u32 *)mmio->data)) & mask;
 }

 static void mmio_data_write(struct kvm_exit_mmio *mmio, u32 mask, u32 value)
 {
-	*((u32 *)mmio->data) = value & mask;
+	*((u32 *)mmio->data) = cpu_to_le32(value) & mask;
 }
/**
On 13/05/14 17:13, Victor Kamensky wrote:
According to recent clarifications of mmio.data array meaning - the mmio.data array should hold bytes as they would appear in memory. Vgic is little endian device. And in case of BE image kernel side that emulates vgic, holds data in BE form. So we need to byteswap cpu<->le32 vgic registers when we read/write them from mmio.data[].
Change has no effect in LE case because cpu already runs in le32.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org Reviewed-by: Christoffer Dall christoffer.dall@linaro.org
virt/kvm/arm/vgic.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c index 56ff9be..529c336 100644 --- a/virt/kvm/arm/vgic.c +++ b/virt/kvm/arm/vgic.c @@ -241,12 +241,12 @@ static void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq) static u32 mmio_data_read(struct kvm_exit_mmio *mmio, u32 mask) {
- return *((u32 *)mmio->data) & mask;
- return le32_to_cpu(*((u32 *)mmio->data)) & mask;
} static void mmio_data_write(struct kvm_exit_mmio *mmio, u32 mask, u32 value) {
- *((u32 *)mmio->data) = value & mask;
- *((u32 *)mmio->data) = cpu_to_le32(value) & mask;
} /**
Acked-by: Marc Zyngier marc.zyngier@arm.com
M.
In case the status register E bit is not set (LE mode) and the host runs in BE mode, we need to byteswap the data so that reads/writes are emulated correctly.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
---
 arch/arm/include/asm/kvm_emulate.h | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)
diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h
index 0fa90c9..69b7469 100644
--- a/arch/arm/include/asm/kvm_emulate.h
+++ b/arch/arm/include/asm/kvm_emulate.h
@@ -185,9 +185,16 @@ static inline unsigned long vcpu_data_guest_to_host(struct kvm_vcpu *vcpu,
 		default:
 			return be32_to_cpu(data);
 		}
+	} else {
+		switch (len) {
+		case 1:
+			return data & 0xff;
+		case 2:
+			return le16_to_cpu(data & 0xffff);
+		default:
+			return le32_to_cpu(data);
+		}
 	}
-
-	return data;	/* Leave LE untouched */
 }

 static inline unsigned long vcpu_data_host_to_guest(struct kvm_vcpu *vcpu,
@@ -203,9 +210,16 @@ static inline unsigned long vcpu_data_host_to_guest(struct kvm_vcpu *vcpu,
 		default:
 			return cpu_to_be32(data);
 		}
+	} else {
+		switch (len) {
+		case 1:
+			return data & 0xff;
+		case 2:
+			return cpu_to_le16(data & 0xffff);
+		default:
+			return cpu_to_le32(data);
+		}
 	}
-
-	return data;	/* Leave LE untouched */
 }
#endif /* __ARM_KVM_EMULATE_H__ */
On 13/05/14 17:13, Victor Kamensky wrote:
In case of status register E bit is not set (LE mode) and host runs in BE mode we need byteswap data, so read/write is emulated correctly.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org Reviewed-by: Christoffer Dall christoffer.dall@linaro.org
arch/arm/include/asm/kvm_emulate.h | 22 ++++++++++++++++++---- 1 file changed, 18 insertions(+), 4 deletions(-)
diff --git a/arch/arm/include/asm/kvm_emulate.h b/arch/arm/include/asm/kvm_emulate.h index 0fa90c9..69b7469 100644 --- a/arch/arm/include/asm/kvm_emulate.h +++ b/arch/arm/include/asm/kvm_emulate.h @@ -185,9 +185,16 @@ static inline unsigned long vcpu_data_guest_to_host(struct kvm_vcpu *vcpu, default: return be32_to_cpu(data); }
- } else {
switch (len) {
case 1:
return data & 0xff;
case 2:
return le16_to_cpu(data & 0xffff);
default:
return le32_to_cpu(data);
}}
- return data; /* Leave LE untouched */
} static inline unsigned long vcpu_data_host_to_guest(struct kvm_vcpu *vcpu, @@ -203,9 +210,16 @@ static inline unsigned long vcpu_data_host_to_guest(struct kvm_vcpu *vcpu, default: return cpu_to_be32(data); }
- } else {
switch (len) {
case 1:
return data & 0xff;
case 2:
return cpu_to_le16(data & 0xffff);
default:
return cpu_to_le32(data);
}}
- return data; /* Leave LE untouched */
} #endif /* __ARM_KVM_EMULATE_H__ */
Reviewed-by: Marc Zyngier marc.zyngier@arm.com
M.
Fix the code that handles the KVM_SET_ONE_REG and KVM_GET_ONE_REG ioctls to work in a BE image. Before this fix, the get/set_one_reg functions worked correctly only in the LE case - reg_from_user took a 'void *' kernel address that could actually be target/source memory of either 4 bytes or 8 bytes in size, and the code copied from/to user memory that could hold either a 4-byte register, an 8-byte register, or a pair of 4-byte registers.
For example, note that there was a case where a 4-byte register was read from user-land into a kernel target address of an 8-byte value. Because it was working in LE, the least significant word was memcpy(ied) and it just worked. In BE code with 'void *' as the target/source 'val' type, it is impossible to tell whether a 4-byte register from user-land should be copied to the 'val' address itself (4-byte target) or to 'val' + 4 (the least significant word of an 8-byte value). So the first change was to introduce strongly typed functions, where the type of the target/source 'val' is explicit:
reg_from_user_to_u64 - reads a register from user-land to a kernel 'u64 *val' address; the register size can be 4 or 8 bytes
reg_from_user_to_u32 - reads register(s) from user-land to a kernel 'u32 *val' address; note it can be one or two 4-byte registers
reg_to_user_from_u64 - writes a register from a kernel 'u64 *val' address to user-land register memory; the register size can be 4 or 8 bytes
reg_to_user_from_u32 - writes register(s) from a kernel 'u32 *val' address to user-land register(s) memory; note it can be one or two 4-byte registers
All places where the reg_from_user and reg_to_user functions were used were changed to use either the corresponding u64 or u32 variant, depending on the type of the source/target kernel memory variable.
In the case of 'u64 *val' and a register size of 4 bytes, reg_from_user_to_u64 and reg_to_user_from_u64 work only with the least significant word of the target/source kernel value. It would be nice to deal only with u32 kernel values in the _u32 functions and with u64 kernel values in the _u64 functions; however, that is not always possible, because functions like set_invariant_cp15 and get_invariant_cp15 store values in u64 types even though the registers are 32-bit.
Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>
---
 arch/arm/kvm/coproc.c | 118 +++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 92 insertions(+), 26 deletions(-)
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c index c58a351..5ca6582 100644 --- a/arch/arm/kvm/coproc.c +++ b/arch/arm/kvm/coproc.c @@ -682,17 +682,83 @@ static struct coproc_reg invariant_cp15[] = { { CRn( 0), CRm( 0), Op1( 1), Op2( 7), is32, NULL, get_AIDR }, };
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index c58a351..5ca6582 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -682,17 +682,83 @@ static struct coproc_reg invariant_cp15[] = {
 	{ CRn( 0), CRm( 0), Op1( 1), Op2( 7), is32, NULL, get_AIDR },
 };
 
-static int reg_from_user(void *val, const void __user *uaddr, u64 id)
+/*
+ * Reads register value from user-land uaddr address into kernel u64 value
+ * given by val address. Note register size could be 4 bytes, so user-land
+ * 4 bytes value will be copied into least significant word. Or register
+ * size could be 8 bytes, so function works as regular copy.
+ */
+static int reg_from_user_to_u64(u64 *val, const void __user *uaddr, u64 id)
+{
+	unsigned long regsize = KVM_REG_SIZE(id);
+	union {
+		u32 word;
+		u64 dword;
+	} tmp = {0};
+
+	if (copy_from_user(&tmp, uaddr, regsize) != 0)
+		return -EFAULT;
+
+	switch (regsize) {
+	case 4:
+		*val = tmp.word;
+		break;
+	case 8:
+		*val = tmp.dword;
+		break;
+	}
+	return 0;
+}
+
+/*
+ * Reads register value from user-land uaddr address to kernel u32 value
+ * or array of two u32 values. I.e. it may really copy two u32 registers
+ * when used with register which size is 8 bytes (for example V7 64
+ * bit registers like TTBR0/TTBR1).
+ */
+static int reg_from_user_to_u32(u32 *val, const void __user *uaddr, u64 id)
 {
-	/* This Just Works because we are little endian. */
 	if (copy_from_user(val, uaddr, KVM_REG_SIZE(id)) != 0)
 		return -EFAULT;
 	return 0;
 }
 
-static int reg_to_user(void __user *uaddr, const void *val, u64 id)
+/*
+ * Writes register value to user-land uaddr address from kernel u64 value
+ * given by val address. Note register size could be 4 bytes, so only 4
+ * bytes from least significant word will be copied into uaddr address.
+ * In case of 8 bytes sized register it works as regular copy.
+ */
+static int reg_to_user_from_u64(void __user *uaddr, const u64 *val, u64 id)
+{
+	unsigned long regsize = KVM_REG_SIZE(id);
+	union {
+		u32 word;
+		u64 dword;
+	} tmp;
+
+	switch (regsize) {
+	case 4:
+		tmp.word = *val;
+		break;
+	case 8:
+		tmp.dword = *val;
+		break;
+	}
+
+	if (copy_to_user(uaddr, &tmp, regsize) != 0)
+		return -EFAULT;
+	return 0;
+}
+
+/*
+ * Writes register value to user-land uaddr address from kernel u32 value
+ * or array of two u32 values. I.e. it may really copy two u32 registers
+ * when used with register which size is 8 bytes (for example V7 64
+ * bit registers like TTBR0/TTBR1).
+ */
+static int reg_to_user_from_u32(void __user *uaddr, const u32 *val, u64 id)
 {
-	/* This Just Works because we are little endian. */
 	if (copy_to_user(uaddr, val, KVM_REG_SIZE(id)) != 0)
 		return -EFAULT;
 	return 0;
@@ -710,7 +776,7 @@ static int get_invariant_cp15(u64 id, void __user *uaddr)
 	if (!r)
 		return -ENOENT;
 
-	return reg_to_user(uaddr, &r->val, id);
+	return reg_to_user_from_u64(uaddr, &r->val, id);
 }
 
 static int set_invariant_cp15(u64 id, void __user *uaddr)
@@ -726,7 +792,7 @@ static int set_invariant_cp15(u64 id, void __user *uaddr)
 	if (!r)
 		return -ENOENT;
 
-	err = reg_from_user(&val, uaddr, id);
+	err = reg_from_user_to_u64(&val, uaddr, id);
 	if (err)
 		return err;
 
@@ -894,8 +960,8 @@ static int vfp_get_reg(const struct kvm_vcpu *vcpu, u64 id, void __user *uaddr)
 	if (vfpid < num_fp_regs()) {
 		if (KVM_REG_SIZE(id) != 8)
 			return -ENOENT;
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpregs[vfpid],
-				   id);
+		return reg_to_user_from_u64(uaddr, &vcpu->arch.vfp_guest.fpregs[vfpid],
+					    id);
 	}
 
 	/* FP control registers are all 32 bit. */
@@ -904,22 +970,22 @@ static int vfp_get_reg(const struct kvm_vcpu *vcpu, u64 id, void __user *uaddr)
 
 	switch (vfpid) {
 	case KVM_REG_ARM_VFP_FPEXC:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpexc, id);
+		return reg_to_user_from_u32(uaddr, &vcpu->arch.vfp_guest.fpexc, id);
 	case KVM_REG_ARM_VFP_FPSCR:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpscr, id);
+		return reg_to_user_from_u32(uaddr, &vcpu->arch.vfp_guest.fpscr, id);
 	case KVM_REG_ARM_VFP_FPINST:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpinst, id);
+		return reg_to_user_from_u32(uaddr, &vcpu->arch.vfp_guest.fpinst, id);
 	case KVM_REG_ARM_VFP_FPINST2:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpinst2, id);
+		return reg_to_user_from_u32(uaddr, &vcpu->arch.vfp_guest.fpinst2, id);
 	case KVM_REG_ARM_VFP_MVFR0:
 		val = fmrx(MVFR0);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user_from_u32(uaddr, &val, id);
 	case KVM_REG_ARM_VFP_MVFR1:
 		val = fmrx(MVFR1);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user_from_u32(uaddr, &val, id);
 	case KVM_REG_ARM_VFP_FPSID:
 		val = fmrx(FPSID);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user_from_u32(uaddr, &val, id);
 	default:
 		return -ENOENT;
 	}
@@ -938,8 +1004,8 @@ static int vfp_set_reg(struct kvm_vcpu *vcpu, u64 id, const void __user *uaddr)
 	if (vfpid < num_fp_regs()) {
 		if (KVM_REG_SIZE(id) != 8)
 			return -ENOENT;
-		return reg_from_user(&vcpu->arch.vfp_guest.fpregs[vfpid],
-				     uaddr, id);
+		return reg_from_user_to_u64(&vcpu->arch.vfp_guest.fpregs[vfpid],
+					    uaddr, id);
 	}
 
 	/* FP control registers are all 32 bit. */
@@ -948,28 +1014,28 @@ static int vfp_set_reg(struct kvm_vcpu *vcpu, u64 id, const void __user *uaddr)
 
 	switch (vfpid) {
 	case KVM_REG_ARM_VFP_FPEXC:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpexc, uaddr, id);
+		return reg_from_user_to_u32(&vcpu->arch.vfp_guest.fpexc, uaddr, id);
 	case KVM_REG_ARM_VFP_FPSCR:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpscr, uaddr, id);
+		return reg_from_user_to_u32(&vcpu->arch.vfp_guest.fpscr, uaddr, id);
 	case KVM_REG_ARM_VFP_FPINST:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpinst, uaddr, id);
+		return reg_from_user_to_u32(&vcpu->arch.vfp_guest.fpinst, uaddr, id);
 	case KVM_REG_ARM_VFP_FPINST2:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpinst2, uaddr, id);
+		return reg_from_user_to_u32(&vcpu->arch.vfp_guest.fpinst2, uaddr, id);
 	/* These are invariant. */
 	case KVM_REG_ARM_VFP_MVFR0:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user_to_u32(&val, uaddr, id))
 			return -EFAULT;
 		if (val != fmrx(MVFR0))
 			return -EINVAL;
 		return 0;
 	case KVM_REG_ARM_VFP_MVFR1:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user_to_u32(&val, uaddr, id))
 			return -EFAULT;
 		if (val != fmrx(MVFR1))
 			return -EINVAL;
 		return 0;
 	case KVM_REG_ARM_VFP_FPSID:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user_to_u32(&val, uaddr, id))
 			return -EFAULT;
 		if (val != fmrx(FPSID))
 			return -EINVAL;
@@ -1016,7 +1082,7 @@ int kvm_arm_coproc_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		return get_invariant_cp15(reg->id, uaddr);
 
 	/* Note: copies two regs if size is 64 bit. */
-	return reg_to_user(uaddr, &vcpu->arch.cp15[r->reg], reg->id);
+	return reg_to_user_from_u32(uaddr, &vcpu->arch.cp15[r->reg], reg->id);
 }
 
 int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
@@ -1035,7 +1101,7 @@ int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		return set_invariant_cp15(reg->id, uaddr);
 
 	/* Note: copies two regs if size is 64 bit */
-	return reg_from_user(&vcpu->arch.cp15[r->reg], uaddr, reg->id);
+	return reg_from_user_to_u32(&vcpu->arch.cp15[r->reg], uaddr, reg->id);
 }
 
 static unsigned int num_demux_regs(void)
On Tue, May 13, 2014 at 09:13:59AM -0700, Victor Kamensky wrote:
> Fix code that handles KVM_SET_ONE_REG, KVM_GET_ONE_REG ioctls to work in a BE image. Before this fix the get/set_one_reg functions worked correctly only in the LE case - reg_from_user was taking a 'void *' kernel address that could actually be target/source memory of either 4 bytes or 8 bytes size, and the code copied from/to user memory that could hold either a 4 bytes register, an 8 bytes register, or a pair of 4 bytes registers.

> For example, note that there was a case where a 4 bytes register was read from user-land to a kernel target address of an 8 bytes value. Because it was working in LE, the least significant word was memcpy(ied) and it just worked. In BE code with 'void *' as the target/source 'val' type it is impossible to tell whether a 4 bytes register from user-land should be copied to the 'val' address itself (4 bytes target) or to 'val' + 4 (least significant word of an 8 bytes value). So the first change was to introduce strongly typed functions, where the type of the target/source 'val' is strongly defined:

> reg_from_user_to_u64 - reads register from user-land to kernel 'u64 *val' address; register size could be 4 or 8 bytes
> reg_from_user_to_u32 - reads register(s) from user-land to kernel 'u32 *val' address; note it could be one or two 4 bytes registers
> reg_to_user_from_u64 - writes register from kernel 'u64 *val' address to user-land register memory; register size could be 4 or 8 bytes
> reg_to_user_from_u32 - writes register(s) from kernel 'u32 *val' address to user-land register(s) memory; note it could be one or two 4 bytes registers

> All places where the reg_from_user and reg_to_user functions were used were changed to use either the corresponding u64 or u32 variant, depending on the type of the source/target kernel memory variable.

> In the case of 'u64 *val' and a register size of 4 bytes, reg_from_user_to_u64 and reg_to_user_from_u64 work only with the least significant word of the target/source kernel value. It would be nice to deal only with u32 kernel values in the _u32 functions and with u64 kernel values in the _u64 functions, however it is not always possible because functions like set_invariant_cp15/get_invariant_cp15 store values in u64 types even though the registers are 32 bit.

> Signed-off-by: Victor Kamensky <victor.kamensky@linaro.org>

> arch/arm/kvm/coproc.c | 118 +++++++++++++++++++++++++++++++++++++++-----------
> 1 file changed, 92 insertions(+), 26 deletions(-)
> +/*
> + * Reads register value from user-land uaddr address into kernel u64 value
> + * given by val address. Note register size could be 4 bytes, so user-land
> + * 4 bytes value will be copied into least significant word. Or register
> + * size could be 8 bytes, so function works as regular copy.
> + */

Reads a register value from a userspace address to a kernel u64 variable. Note that the id may still encode a register size of 4 bytes, in which case only 4 bytes will be copied from userspace into the least significant word of *val.
> +/*
> + * Reads register value from user-land uaddr address to kernel u32 value
> + * or array of two u32 values. I.e. it may really copy two u32 registers
> + * when used with register which size is 8 bytes (for example V7 64
> + * bit registers like TTBR0/TTBR1).
> + */

Reads a register value from a userspace address to a kernel u32 variable or a kernel array of two u32 values. That is, it may really copy two u32 registers when used with registers defined by the ABI to be 8 bytes long (e.g. TTBR0/TTBR1).
> +/*
> + * Writes register value to user-land uaddr address from kernel u64 value
> + * given by val address. Note register size could be 4 bytes, so only 4
> + * bytes from least significant word will be copied into uaddr address.
> + * In case of 8 bytes sized register it works as regular copy.
> + */

Writes a register value to a userspace address from a kernel u64 value. Note that the register size could be only 4 bytes, in which case only the least significant 4 bytes will be copied to the userspace address.
> +/*
> + * Writes register value to user-land uaddr address from kernel u32 value
> + * or array of two u32 values. I.e. it may really copy two u32 registers
> + * when used with register which size is 8 bytes (for example V7 64
> + * bit registers like TTBR0/TTBR1).
> + */

Writes a register value to a userspace address from a kernel u32 variable or a kernel array of two u32 values. That is, it may really copy two u32 registers when used with registers defined by the ABI to be 8 bytes long (e.g. TTBR0/TTBR1).
It's definitely an improvement with the new naming, and I do appreciate you taking the time to write comments. (Apologies for suggesting a raw rewrite, but there were a number of language issues with the proposed versions and it felt overly burdensome on you to ask you to address all of them).
However, I'm still not crazy about this overall approach (see separate mail on the old thread).
I'm curious, how many special cases are there really, and how many places would we have to introduce a conditional on the calling side?
Thanks, -Christoffer
On 25 May 2014 12:14, Christoffer Dall christoffer.dall@linaro.org wrote:
> It's definitely an improvement with the new naming, and I do appreciate you taking the time to write comments. (Apologies for suggesting a raw rewrite, but there were a number of language issues with the proposed versions and it felt overly burdensome on you to ask you to address all of them).

> However, I'm still not crazy about this overall approach (see separate mail on the old thread).

> I'm curious, how many special cases are there really, and how many places would we have to introduce a conditional on the calling side?
I've just posted an updated version of the patch as an RFC. And yes, you are right: the u64 regsize == 4 access happens in only one place. I fixed it as you suggested and got rid of the regsize == 4 handling in the _u64 functions.

Reading/writing an array of two u32 values when regsize == 8 still remains; I don't know how to handle that case more cleanly.
Thanks, Victor
On Tue, May 27, 2014 at 11:19:59PM -0700, Victor Kamensky wrote:
On 25 May 2014 12:14, Christoffer Dall christoffer.dall@linaro.org wrote:
On Tue, May 13, 2014 at 09:13:59AM -0700, Victor Kamensky wrote:
Fix code that handles KVM_SET_ONE_REG, KVM_GET_ONE_REG ioctls to work in BE image. Before this fix get/set_one_reg functions worked correctly only in LE case - reg_from_user was taking 'void *' kernel address that actually could be target/source memory of either 4 bytes size or 8 bytes size, and code copied from/to user memory that could hold either 4 bytes register, 8 byte register or pair of 4 bytes registers.
For example note that there was a case when 4 bytes register was read from user-land to kernel target address of 8 bytes value. Because it was working in LE, least significant word was memcpy(ied) and it just worked. In BE code with 'void *' as target/source 'val' type it is impossible to tell whether 4 bytes register from user-land should be copied to 'val' address itself (4 bytes target) or it should be copied to 'val' + 4 (least significant word of 8 bytes value). So first change was to introduce strongly typed functions, where type of target/source 'val' is strongly defined:
reg_from_user_to_u64 - reads register from user-land to kernel 'u64 *val' address; register size could be 4 or 8 bytes reg_from_user_to_u32 - reads register(s) from user-land to kernel 'u32 *val' address; note it could be one or two 4 bytes registers reg_to_user_from_u64 - writes register from kernel 'u64 *val' address to user-land register memory; register size could be 4 or 8 bytes ret_to_user_from_u32 - writes register(s) from kernel 'u32 *val' address to user-land register(s) memory; note it could be one or two 4 bytes registers
All places where reg_from_user, reg_to_user functions were used, were changed to use either corresponding u64 or u32 variants of functions depending on type of source/target kernel memory variable.
In case of 'u64 *val' and register size equals 4 bytes, reg_from_user_to_u64 and reg_to_user_from_u64 work only with least siginificant word of target/source kernel value. It would be nice to deal only with u32 kernel values in _u32 functions and with u64 kernel values in _u64 functions, however it is not always possible because functions like set_invariant_cp15 get_invariant_cp15 store values in u64 types but registers are 32bit.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
arch/arm/kvm/coproc.c | 118 +++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 92 insertions(+), 26 deletions(-)
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c index c58a351..5ca6582 100644 --- a/arch/arm/kvm/coproc.c +++ b/arch/arm/kvm/coproc.c @@ -682,17 +682,83 @@ static struct coproc_reg invariant_cp15[] = { { CRn( 0), CRm( 0), Op1( 1), Op2( 7), is32, NULL, get_AIDR }, };
-static int reg_from_user(void *val, const void __user *uaddr, u64 id) +/*
- Reads register value from user-land uaddr address into kernel u64 value
- given by val address. Note register size could be 4 bytes, so user-land
- 4 bytes value will be copied into least significant word. Or register
- size could be 8 bytes, so function works as regular copy.
- */
Reads a register value from a userspace address to a kernel u64 variable. Note that the id may still encode a register size of 4 bytes, in which case only 4 bytes will be copied from userspace into the least significant word of *val.
+static int reg_from_user_to_u64(u64 *val, const void __user *uaddr, u64 id) +{
unsigned long regsize = KVM_REG_SIZE(id);
union {
u32 word;
u64 dword;
} tmp = {0};
if (copy_from_user(&tmp, uaddr, regsize) != 0)
return -EFAULT;
switch (regsize) {
case 4:
*val = tmp.word;
break;
case 8:
*val = tmp.dword;
break;
}
return 0;
+}
+/*
- Reads register value from user-land uaddr address to kernel u32 value
- or array of two u32 values. I.e. it may really copy two u32 registers
- when used with register which size is 8 bytes (for example V7 64
- bit registers like TTBR0/TTBR1).
- */
Reads a register value from a userspace address to a kernel u32 variable or a kernel array of two u32 values. That is, it may really copy two u32 registers when used with registers defined by the ABI to be 8 bytes long (e.g. TTBR0/TTBR1).
+static int reg_from_user_to_u32(u32 *val, const void __user *uaddr, u64 id) {
/* This Just Works because we are little endian. */ if (copy_from_user(val, uaddr, KVM_REG_SIZE(id)) != 0) return -EFAULT; return 0;
}
-static int reg_to_user(void __user *uaddr, const void *val, u64 id) +/*
- Writes register value to user-land uaddr address from kernel u64 value
- given by val address. Note register size could be 4 bytes, so only 4
- bytes from least significant word will by copied into uaddr address.
- In case of 8 bytes sized register it works as regular copy.
- */
Writes a register value to a userspace address from a kernel u64 value. Note that the register size could be only 4 bytes, in which case only the least significant 4 bytes will be copied into to the userspace address.
+static int reg_to_user_from_u64(void __user *uaddr, const u64 *val, u64 id) +{
unsigned long regsize = KVM_REG_SIZE(id);
union {
u32 word;
u64 dword;
} tmp;
switch (regsize) {
case 4:
tmp.word = *val;
break;
case 8:
tmp.dword = *val;
break;
}
if (copy_to_user(uaddr, &tmp, regsize) != 0)
return -EFAULT;
return 0;
+}
+/*
- Writes register value to user-land uaddr address from kernel u32 value
- or arra of two u32 values. I.e it may really copy two u32 registers
- when used with register which size is 8 bytes (for example V7 64
- bit registers like TTBR0/TTBR1).
- */
Writes a register value to a userspace address from a kernel u32 variable or a kernel array of two u32 values. That is, it may really copy two u32 registers when used with registers defined by the ABI to be 8 bytes long (e.g. TTBR0/TTBR1).
+static int reg_to_user_from_u32(void __user *uaddr, const u32 *val, u64 id) {
/* This Just Works because we are little endian. */ if (copy_to_user(uaddr, val, KVM_REG_SIZE(id)) != 0) return -EFAULT; return 0;
@@ -710,7 +776,7 @@ static int get_invariant_cp15(u64 id, void __user *uaddr)
 	if (!r)
 		return -ENOENT;

-	return reg_to_user(uaddr, &r->val, id);
+	return reg_to_user_from_u64(uaddr, &r->val, id);
 }

 static int set_invariant_cp15(u64 id, void __user *uaddr)
@@ -726,7 +792,7 @@ static int set_invariant_cp15(u64 id, void __user *uaddr)
 	if (!r)
 		return -ENOENT;

-	err = reg_from_user(&val, uaddr, id);
+	err = reg_from_user_to_u64(&val, uaddr, id);
 	if (err)
 		return err;
@@ -894,8 +960,8 @@ static int vfp_get_reg(const struct kvm_vcpu *vcpu, u64 id, void __user *uaddr)
 	if (vfpid < num_fp_regs()) {
 		if (KVM_REG_SIZE(id) != 8)
 			return -ENOENT;
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpregs[vfpid],
-				   id);
+		return reg_to_user_from_u64(uaddr, &vcpu->arch.vfp_guest.fpregs[vfpid],
+					    id);
 	}

 	/* FP control registers are all 32 bit. */
@@ -904,22 +970,22 @@ static int vfp_get_reg(const struct kvm_vcpu *vcpu, u64 id, void __user *uaddr)

 	switch (vfpid) {
 	case KVM_REG_ARM_VFP_FPEXC:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpexc, id);
+		return reg_to_user_from_u32(uaddr, &vcpu->arch.vfp_guest.fpexc, id);
 	case KVM_REG_ARM_VFP_FPSCR:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpscr, id);
+		return reg_to_user_from_u32(uaddr, &vcpu->arch.vfp_guest.fpscr, id);
 	case KVM_REG_ARM_VFP_FPINST:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpinst, id);
+		return reg_to_user_from_u32(uaddr, &vcpu->arch.vfp_guest.fpinst, id);
 	case KVM_REG_ARM_VFP_FPINST2:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpinst2, id);
+		return reg_to_user_from_u32(uaddr, &vcpu->arch.vfp_guest.fpinst2, id);
 	case KVM_REG_ARM_VFP_MVFR0:
 		val = fmrx(MVFR0);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user_from_u32(uaddr, &val, id);
 	case KVM_REG_ARM_VFP_MVFR1:
 		val = fmrx(MVFR1);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user_from_u32(uaddr, &val, id);
 	case KVM_REG_ARM_VFP_FPSID:
 		val = fmrx(FPSID);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user_from_u32(uaddr, &val, id);
 	default:
 		return -ENOENT;
 	}
@@ -938,8 +1004,8 @@ static int vfp_set_reg(struct kvm_vcpu *vcpu, u64 id, const void __user *uaddr)
 	if (vfpid < num_fp_regs()) {
 		if (KVM_REG_SIZE(id) != 8)
 			return -ENOENT;
-		return reg_from_user(&vcpu->arch.vfp_guest.fpregs[vfpid],
-				     uaddr, id);
+		return reg_from_user_to_u64(&vcpu->arch.vfp_guest.fpregs[vfpid],
+					    uaddr, id);
 	}

 	/* FP control registers are all 32 bit. */
@@ -948,28 +1014,28 @@ static int vfp_set_reg(struct kvm_vcpu *vcpu, u64 id, const void __user *uaddr)

 	switch (vfpid) {
 	case KVM_REG_ARM_VFP_FPEXC:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpexc, uaddr, id);
+		return reg_from_user_to_u32(&vcpu->arch.vfp_guest.fpexc, uaddr, id);
 	case KVM_REG_ARM_VFP_FPSCR:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpscr, uaddr, id);
+		return reg_from_user_to_u32(&vcpu->arch.vfp_guest.fpscr, uaddr, id);
 	case KVM_REG_ARM_VFP_FPINST:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpinst, uaddr, id);
+		return reg_from_user_to_u32(&vcpu->arch.vfp_guest.fpinst, uaddr, id);
 	case KVM_REG_ARM_VFP_FPINST2:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpinst2, uaddr, id);
+		return reg_from_user_to_u32(&vcpu->arch.vfp_guest.fpinst2, uaddr, id);
 	/* These are invariant. */
 	case KVM_REG_ARM_VFP_MVFR0:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user_to_u32(&val, uaddr, id))
 			return -EFAULT;
 		if (val != fmrx(MVFR0))
 			return -EINVAL;
 		return 0;
 	case KVM_REG_ARM_VFP_MVFR1:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user_to_u32(&val, uaddr, id))
 			return -EFAULT;
 		if (val != fmrx(MVFR1))
 			return -EINVAL;
 		return 0;
 	case KVM_REG_ARM_VFP_FPSID:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user_to_u32(&val, uaddr, id))
 			return -EFAULT;
 		if (val != fmrx(FPSID))
 			return -EINVAL;
@@ -1016,7 +1082,7 @@ int kvm_arm_coproc_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		return get_invariant_cp15(reg->id, uaddr);

 	/* Note: copies two regs if size is 64 bit. */
-	return reg_to_user(uaddr, &vcpu->arch.cp15[r->reg], reg->id);
+	return reg_to_user_from_u32(uaddr, &vcpu->arch.cp15[r->reg], reg->id);
 }

 int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
@@ -1035,7 +1101,7 @@ int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		return set_invariant_cp15(reg->id, uaddr);

 	/* Note: copies two regs if size is 64 bit */
-	return reg_from_user(&vcpu->arch.cp15[r->reg], uaddr, reg->id);
+	return reg_from_user_to_u32(&vcpu->arch.cp15[r->reg], uaddr, reg->id);
 }
static unsigned int num_demux_regs(void)
1.8.1.4
It's definitely an improvement with the new naming, and I do appreciate you taking the time to write comments. (Apologies for suggesting a raw rewrite, but there were a number of language issues with the proposed versions, and it felt overly burdensome to ask you to address all of them individually.)
However, I'm still not crazy about this overall approach (see separate mail on the old thread).
I'm curious, how many special cases are there really, and how many places would we have to introduce a conditional on the calling side?
I've just posted an updated version of the patch as an RFC. And yes, you are right: the u64 regsize == 4 access happens in only one place. I fixed it as you suggested and got rid of the regsize == 4 case in the _u64 functions.
Reading/writing an array of two u32 values when regsize == 8 still remains; I don't know how to handle it more cleanly.
The kernel-side caller has to put the value in a u64 (if it's a write), then call the user space 64-bit copying function, and then (if it's a read) convert that into the two u32s afterwards.
-Christoffer
On 13/05/14 17:13, Victor Kamensky wrote:
Fix code that handles KVM_SET_ONE_REG, KVM_GET_ONE_REG ioctls to work in BE image. Before this fix get/set_one_reg functions worked correctly only in LE case - reg_from_user was taking 'void *' kernel address that actually could be target/source memory of either 4 bytes size or 8 bytes size, and code copied from/to user memory that could hold either 4 bytes register, 8 byte register or pair of 4 bytes registers.
For example note that there was a case when 4 bytes register was read from user-land to kernel target address of 8 bytes value. Because it was working in LE, least significant word was memcpy(ied) and it just worked. In BE code with 'void *' as target/source 'val' type it is impossible to tell whether 4 bytes register from user-land should be copied to 'val' address itself (4 bytes target) or it should be copied to 'val' + 4 (least significant word of 8 bytes value). So first change was to introduce strongly typed functions, where type of target/source 'val' is strongly defined:
reg_from_user_to_u64 - reads register from user-land to kernel 'u64 *val'
                       address; register size could be 4 or 8 bytes
reg_from_user_to_u32 - reads register(s) from user-land to kernel 'u32 *val'
                       address; note it could be one or two 4 bytes registers
reg_to_user_from_u64 - writes register from kernel 'u64 *val' address to
                       user-land register memory; register size could be
                       4 or 8 bytes
reg_to_user_from_u32 - writes register(s) from kernel 'u32 *val' address to
                       user-land register(s) memory; note it could be one
                       or two 4 bytes registers
All places where the reg_from_user and reg_to_user functions were used were changed to use the corresponding u64 or u32 variant, depending on the type of the source/target kernel memory variable.
In the case of 'u64 *val' and a register size of 4 bytes, reg_from_user_to_u64 and reg_to_user_from_u64 work only with the least significant word of the target/source kernel value. It would be nice to deal only with u32 kernel values in the _u32 functions and with u64 kernel values in the _u64 functions; however, it is not always possible, because functions like set_invariant_cp15/get_invariant_cp15 store values in u64 types while the registers are 32-bit.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
 arch/arm/kvm/coproc.c | 118 +++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 92 insertions(+), 26 deletions(-)
diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index c58a351..5ca6582 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -682,17 +682,83 @@ static struct coproc_reg invariant_cp15[] = {
 	{ CRn( 0), CRm( 0), Op1( 1), Op2( 7), is32, NULL, get_AIDR },
 };

-static int reg_from_user(void *val, const void __user *uaddr, u64 id)
+/*
+ * Reads register value from user-land uaddr address into kernel u64 value
+ * given by val address. Note register size could be 4 bytes, so user-land
+ * 4 bytes value will be copied into least significant word. Or register
+ * size could be 8 bytes, so function works as regular copy.
+ */
+static int reg_from_user_to_u64(u64 *val, const void __user *uaddr, u64 id)
+{
+	unsigned long regsize = KVM_REG_SIZE(id);
+	union {
+		u32 word;
+		u64 dword;
+	} tmp = {0};
+
+	if (copy_from_user(&tmp, uaddr, regsize) != 0)
+		return -EFAULT;
+
+	switch (regsize) {
+	case 4:
+		*val = tmp.word;
+		break;
+	case 8:
+		*val = tmp.dword;
+		break;
+	}
+	return 0;
+}
+/*
+ * Reads register value from user-land uaddr address to kernel u32 value
+ * or array of two u32 values. I.e. it may really copy two u32 registers
+ * when used with register which size is 8 bytes (for example V7 64
+ * bit registers like TTBR0/TTBR1).
+ */
+static int reg_from_user_to_u32(u32 *val, const void __user *uaddr, u64 id)
+{
+	/* This Just Works because we are little endian. */
+	if (copy_from_user(val, uaddr, KVM_REG_SIZE(id)) != 0)
+		return -EFAULT;
+	return 0;
 }

-static int reg_to_user(void __user *uaddr, const void *val, u64 id)
+/*
+ * Writes register value to user-land uaddr address from kernel u64 value
+ * given by val address. Note register size could be 4 bytes, so only 4
+ * bytes from least significant word will by copied into uaddr address.
+ * In case of 8 bytes sized register it works as regular copy.
+ */
+static int reg_to_user_from_u64(void __user *uaddr, const u64 *val, u64 id)
+{
+	unsigned long regsize = KVM_REG_SIZE(id);
+	union {
+		u32 word;
+		u64 dword;
+	} tmp;
+
+	switch (regsize) {
+	case 4:
+		tmp.word = *val;
+		break;
+	case 8:
+		tmp.dword = *val;
+		break;
+	}
+
+	if (copy_to_user(uaddr, &tmp, regsize) != 0)
+		return -EFAULT;
+	return 0;
+}
+/*
+ * Writes register value to user-land uaddr address from kernel u32 value
+ * or arra of two u32 values. I.e it may really copy two u32 registers
+ * when used with register which size is 8 bytes (for example V7 64
+ * bit registers like TTBR0/TTBR1).
+ */
+static int reg_to_user_from_u32(void __user *uaddr, const u32 *val, u64 id)
+{
+	/* This Just Works because we are little endian. */
+	if (copy_to_user(uaddr, val, KVM_REG_SIZE(id)) != 0)
+		return -EFAULT;
+	return 0;
+}
@@ -710,7 +776,7 @@ static int get_invariant_cp15(u64 id, void __user *uaddr)
 	if (!r)
 		return -ENOENT;

-	return reg_to_user(uaddr, &r->val, id);
+	return reg_to_user_from_u64(uaddr, &r->val, id);
 }

 static int set_invariant_cp15(u64 id, void __user *uaddr)
@@ -726,7 +792,7 @@ static int set_invariant_cp15(u64 id, void __user *uaddr)
 	if (!r)
 		return -ENOENT;

-	err = reg_from_user(&val, uaddr, id);
+	err = reg_from_user_to_u64(&val, uaddr, id);
 	if (err)
 		return err;
@@ -894,8 +960,8 @@ static int vfp_get_reg(const struct kvm_vcpu *vcpu, u64 id, void __user *uaddr)
 	if (vfpid < num_fp_regs()) {
 		if (KVM_REG_SIZE(id) != 8)
 			return -ENOENT;
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpregs[vfpid],
-				   id);
+		return reg_to_user_from_u64(uaddr, &vcpu->arch.vfp_guest.fpregs[vfpid],
+					    id);
 	}
 	/* FP control registers are all 32 bit. */
@@ -904,22 +970,22 @@ static int vfp_get_reg(const struct kvm_vcpu *vcpu, u64 id, void __user *uaddr)

 	switch (vfpid) {
 	case KVM_REG_ARM_VFP_FPEXC:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpexc, id);
+		return reg_to_user_from_u32(uaddr, &vcpu->arch.vfp_guest.fpexc, id);
 	case KVM_REG_ARM_VFP_FPSCR:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpscr, id);
+		return reg_to_user_from_u32(uaddr, &vcpu->arch.vfp_guest.fpscr, id);
 	case KVM_REG_ARM_VFP_FPINST:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpinst, id);
+		return reg_to_user_from_u32(uaddr, &vcpu->arch.vfp_guest.fpinst, id);
 	case KVM_REG_ARM_VFP_FPINST2:
-		return reg_to_user(uaddr, &vcpu->arch.vfp_guest.fpinst2, id);
+		return reg_to_user_from_u32(uaddr, &vcpu->arch.vfp_guest.fpinst2, id);
 	case KVM_REG_ARM_VFP_MVFR0:
 		val = fmrx(MVFR0);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user_from_u32(uaddr, &val, id);
 	case KVM_REG_ARM_VFP_MVFR1:
 		val = fmrx(MVFR1);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user_from_u32(uaddr, &val, id);
 	case KVM_REG_ARM_VFP_FPSID:
 		val = fmrx(FPSID);
-		return reg_to_user(uaddr, &val, id);
+		return reg_to_user_from_u32(uaddr, &val, id);
 	default:
 		return -ENOENT;
 	}
@@ -938,8 +1004,8 @@ static int vfp_set_reg(struct kvm_vcpu *vcpu, u64 id, const void __user *uaddr)
 	if (vfpid < num_fp_regs()) {
 		if (KVM_REG_SIZE(id) != 8)
 			return -ENOENT;
-		return reg_from_user(&vcpu->arch.vfp_guest.fpregs[vfpid],
-				     uaddr, id);
+		return reg_from_user_to_u64(&vcpu->arch.vfp_guest.fpregs[vfpid],
+					    uaddr, id);
 	}
 	/* FP control registers are all 32 bit. */
@@ -948,28 +1014,28 @@ static int vfp_set_reg(struct kvm_vcpu *vcpu, u64 id, const void __user *uaddr)

 	switch (vfpid) {
 	case KVM_REG_ARM_VFP_FPEXC:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpexc, uaddr, id);
+		return reg_from_user_to_u32(&vcpu->arch.vfp_guest.fpexc, uaddr, id);
 	case KVM_REG_ARM_VFP_FPSCR:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpscr, uaddr, id);
+		return reg_from_user_to_u32(&vcpu->arch.vfp_guest.fpscr, uaddr, id);
 	case KVM_REG_ARM_VFP_FPINST:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpinst, uaddr, id);
+		return reg_from_user_to_u32(&vcpu->arch.vfp_guest.fpinst, uaddr, id);
 	case KVM_REG_ARM_VFP_FPINST2:
-		return reg_from_user(&vcpu->arch.vfp_guest.fpinst2, uaddr, id);
+		return reg_from_user_to_u32(&vcpu->arch.vfp_guest.fpinst2, uaddr, id);
 	/* These are invariant. */
 	case KVM_REG_ARM_VFP_MVFR0:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user_to_u32(&val, uaddr, id))
 			return -EFAULT;
 		if (val != fmrx(MVFR0))
 			return -EINVAL;
 		return 0;
 	case KVM_REG_ARM_VFP_MVFR1:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user_to_u32(&val, uaddr, id))
 			return -EFAULT;
 		if (val != fmrx(MVFR1))
 			return -EINVAL;
 		return 0;
 	case KVM_REG_ARM_VFP_FPSID:
-		if (reg_from_user(&val, uaddr, id))
+		if (reg_from_user_to_u32(&val, uaddr, id))
 			return -EFAULT;
 		if (val != fmrx(FPSID))
 			return -EINVAL;
@@ -1016,7 +1082,7 @@ int kvm_arm_coproc_get_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		return get_invariant_cp15(reg->id, uaddr);

 	/* Note: copies two regs if size is 64 bit. */
-	return reg_to_user(uaddr, &vcpu->arch.cp15[r->reg], reg->id);
+	return reg_to_user_from_u32(uaddr, &vcpu->arch.cp15[r->reg], reg->id);
 }

 int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
@@ -1035,7 +1101,7 @@ int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
 		return set_invariant_cp15(reg->id, uaddr);

 	/* Note: copies two regs if size is 64 bit */
-	return reg_from_user(&vcpu->arch.cp15[r->reg], uaddr, reg->id);
+	return reg_from_user_to_u32(&vcpu->arch.cp15[r->reg], uaddr, reg->id);
 }

 static unsigned int num_demux_regs(void)
I really dislike this patch because of the asymmetric behaviour. Why can't we always use the "_u64" version?
Actually, why don't we use a construct similar to what we have for mmio_{read,write}_buf? This would give us a unified access method.
Thanks,
M.
On 27 May 2014 11:22, Marc Zyngier marc.zyngier@arm.com wrote:
I really dislike this patch because of the asymmetric behaviour. Why can't we always use the "_u64" version?
The kernel target type could be u32. If code passes it as a pointer to u64, it will do the wrong thing.
I've sent the most recent version of the BE coproc handling patches as an RFC; let's discuss it there. I think I addressed your questions/concerns in the cover letter.
Thanks, Victor
Actually, why don't we use a construct similar to what we have for mmio_{read,write}_buf? This would give us a unified access method.
Thanks,
M.
-- Jazz is not dead. It just smells funny...
Previous patches addressed the ARMv7 big-endian virtualization and KVM-related issues, so enable KVM for big-endian systems now.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
---
 arch/arm/kvm/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index 4be5bb1..466bd29 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -23,7 +23,7 @@ config KVM
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
 	select KVM_MMIO
 	select KVM_ARM_HOST
-	depends on ARM_VIRT_EXT && ARM_LPAE && !CPU_BIG_ENDIAN
+	depends on ARM_VIRT_EXT && ARM_LPAE
 	---help---
 	  Support hosting virtualized guest machines. You will also need
 	  to select one or more of the processor modules below.
In the case of a guest CPU running in LE mode while the host runs in BE mode, we need to byteswap the data so that reads/writes are emulated correctly.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
Reviewed-by: Christoffer Dall christoffer.dall@linaro.org
---
 arch/arm64/include/asm/kvm_emulate.h | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index dd8ecfc..fdc3e21 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -213,6 +213,17 @@ static inline unsigned long vcpu_data_guest_to_host(struct kvm_vcpu *vcpu,
 		default:
 			return be64_to_cpu(data);
 		}
+	} else {
+		switch (len) {
+		case 1:
+			return data & 0xff;
+		case 2:
+			return le16_to_cpu(data & 0xffff);
+		case 4:
+			return le32_to_cpu(data & 0xffffffff);
+		default:
+			return le64_to_cpu(data);
+		}
 	}

 	return data;		/* Leave LE untouched */
@@ -233,6 +244,17 @@ static inline unsigned long vcpu_data_host_to_guest(struct kvm_vcpu *vcpu,
 		default:
 			return cpu_to_be64(data);
 		}
+	} else {
+		switch (len) {
+		case 1:
+			return data & 0xff;
+		case 2:
+			return cpu_to_le16(data & 0xffff);
+		case 4:
+			return cpu_to_le32(data & 0xffffffff);
+		default:
+			return cpu_to_le64(data);
+		}
 	}

 	return data;		/* Leave LE untouched */
On 13/05/14 17:14, Victor Kamensky wrote:
In case of guest CPU running in LE mode and host runs in BE mode we need byteswap data, so read/write is emulated correctly.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org Reviewed-by: Christoffer Dall christoffer.dall@linaro.org
Reviewed-by: Marc Zyngier marc.zyngier@arm.com
M.
The esr_el2 field of struct kvm_vcpu_fault_info has u32 type, so it should be stored as a word. The current code works in the LE case because the existing code puts the least significant word of x1 into esr_el2, and puts the most significant word of x1 into the next field, which is accidentally OK because that field is updated again by the next instruction. But the existing code breaks in the BE case.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
Acked-by: Christoffer Dall christoffer.dall@linaro.org
---
 arch/arm64/kvm/hyp.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 2c56012..0620691 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -824,7 +824,7 @@ el1_trap:
 	mrs	x2, far_el2

2:	mrs	x0, tpidr_el2
-	str	x1, [x0, #VCPU_ESR_EL2]
+	str	w1, [x0, #VCPU_ESR_EL2]
 	str	x2, [x0, #VCPU_FAR_EL2]
 	str	x3, [x0, #VCPU_HPFAR_EL2]
On 13/05/14 17:14, Victor Kamensky wrote:
esr_el2 field of struct kvm_vcpu_fault_info has u32 type. It should be stored as word. Current code works in LE case because existing puts least significant word of x1 into esr_el2, and it puts most significant work of x1 into next
word
field, which accidentally is OK because it is updated again by next instruction. But existing code breaks in BE case.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org Acked-by: Christoffer Dall christoffer.dall@linaro.org
 arch/arm64/kvm/hyp.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 2c56012..0620691 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -824,7 +824,7 @@ el1_trap:
 	mrs	x2, far_el2

2:	mrs	x0, tpidr_el2
-	str	x1, [x0, #VCPU_ESR_EL2]
+	str	w1, [x0, #VCPU_ESR_EL2]
 	str	x2, [x0, #VCPU_FAR_EL2]
 	str	x3, [x0, #VCPU_HPFAR_EL2]
Nice catch! :-)
Acked-by: Marc Zyngier marc.zyngier@arm.com
M.
Fix the vgic_bitmap_get_reg function to return the 'right' word address of an 'unsigned long' bitmap value in the case of a BE 64-bit image.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
---
 virt/kvm/arm/vgic.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 529c336..d0a1a24 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -101,14 +101,34 @@ static u32 vgic_nr_lr;

 static unsigned int vgic_maint_irq;

+/*
+ * struct vgic_bitmap contains union that provides two view of
+ * the same data. In one case it is an array of registers of
+ * u32 type (.reg), and in the other case it is a bitmape, which is
+ * an array of 'unsgined long' (.reg_ul). It all works well in
+ * case of 32-bit (u32 and 'unsigned long' have the same size).
+ * It alos works well in 64-bit LE, but breaks on 64-bit BE.
+ * In this case word sized register of even index actually resides
+ * in least significant word of 'unsigned long' which has address
+ * at offset plus 4 bytes. And word sized register of odd index
+ * resides at most significant of 'unsigned long' which has offset
+ * minus 4 bytes. Define REG_OFFSET_SWIZZLE that would help to
+ * change offset of register in case of BE 64bit system.
+ */
+#if defined(CONFIG_CPU_BIG_ENDIAN) && BITS_PER_LONG == 64
+#define REG_OFFSET_SWIZZLE	1
+#else
+#define REG_OFFSET_SWIZZLE	0
+#endif
+
 static u32 *vgic_bitmap_get_reg(struct vgic_bitmap *x,
 				int cpuid, u32 offset)
 {
 	offset >>= 2;
 	if (!offset)
-		return x->percpu[cpuid].reg;
+		return x->percpu[cpuid].reg + (offset ^ REG_OFFSET_SWIZZLE);
 	else
-		return x->shared.reg + offset - 1;
+		return x->shared.reg + ((offset - 1) ^ REG_OFFSET_SWIZZLE);
 }
static int vgic_bitmap_get_irq_val(struct vgic_bitmap *x,
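The even/odd word swizzle described in the comment above can be checked with a small standalone C sketch. This is purely illustrative; the function and values are hypothetical and are not part of the patch:

```c
#include <stdint.h>

/*
 * Illustrative only: simulate how a 64-bit BE machine lays out an
 * 'unsigned long' bitmap in memory, then read it back as u32-sized
 * registers.  'swizzle' plays the role of REG_OFFSET_SWIZZLE: XORing
 * the word index with 1 selects the other half of the u64.
 */
uint32_t be64_reg_read(uint64_t bitmap, int index, int swizzle)
{
	uint8_t mem[8];
	const uint8_t *p;
	int i;

	/* big endian: most significant byte at the lowest address */
	for (i = 0; i < 8; i++)
		mem[i] = (uint8_t)(bitmap >> (56 - 8 * i));

	/* read the 32-bit word at the (possibly swizzled) index */
	p = mem + 4 * (index ^ swizzle);
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Without the swizzle, word index 0 of the bitmap 0x1111111122222222 returns the high word 0x11111111; with the swizzle it returns 0x22222222, the word that actually holds bits 0-31.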
On Tue, May 13, 2014 at 09:14:03AM -0700, Victor Kamensky wrote:
Fix vgic_bitmap_get_reg function to return 'right' word address of 'unsigned long' bitmap value in case of BE 64bit image.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
---
 virt/kvm/arm/vgic.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 529c336..d0a1a24 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -101,14 +101,34 @@ static u32 vgic_nr_lr;

 static unsigned int vgic_maint_irq;

+/*
- struct vgic_bitmap contains union that provides two view of
it contains two unions, so:
...contains unions that provide
s/two view/two views/
- the same data. In one case it is an array of registers of
- u32 type (.reg), and in the other case it is a bitmape, which is
s/bitmape/bitmap/
- an array of 'unsgined long' (.reg_ul). It all works well in
- case of 32-bit (u32 and 'unsigned long' have the same size).
"in case of 32-bit" is too vague. Write 'on 32-bit systems' if that's what you mean.
- It alos works well in 64-bit LE, but breaks on 64-bit BE.
s/alos/also/
same with "in 64-bit" as above.
- In this case word sized register of even index actually resides
the word sized register
- in least significant word of 'unsigned long' which has address
in the least significant word of the 'unsigned long' at the register offset plus 4 bytes.
- at offset plus 4 bytes. And word sized register of odd index
the
- resides at most significant of 'unsigned long' which has offset
- minus 4 bytes. Define REG_OFFSET_SWIZZLE that would help to
- change offset of register in case of BE 64bit system.
- */
I actually think this is quite hard to understand, how about:
/*
 * struct vgic_bitmap contains unions that provide two views of
 * the same data. In one case it is an array of registers of
 * u32's, and in the other case it is a bitmap of unsigned
 * longs.
 *
 * This does not work on 64-bit BE systems, because the bitmap access
 * will store two consecutive 32-bit words with the higher-addressed
 * register's bits at the lower index and the lower-addressed register's
 * bits at the higher index.
 *
 * Therefore, swizzle the register index when accessing the 32-bit word
 * registers to access the right register's value.
 */
+#if defined(CONFIG_CPU_BIG_ENDIAN) && BITS_PER_LONG == 64
+#define REG_OFFSET_SWIZZLE	1
+#else
+#define REG_OFFSET_SWIZZLE	0
+#endif
+
 static u32 *vgic_bitmap_get_reg(struct vgic_bitmap *x,
 				int cpuid, u32 offset)
 {
 	offset >>= 2;
 	if (!offset)
-		return x->percpu[cpuid].reg;
+		return x->percpu[cpuid].reg + (offset ^ REG_OFFSET_SWIZZLE);
 	else
-		return x->shared.reg + offset - 1;
+		return x->shared.reg + ((offset - 1) ^ REG_OFFSET_SWIZZLE);
 }

 static int vgic_bitmap_get_irq_val(struct vgic_bitmap *x,
--
1.8.1.4
On arm64, 'u32 vgic_eisr[2];' and 'u32 vgic_elrsr[2]' are accessed through an 'unsigned long *', i.e. as a single 64-bit bitmap. So we need to swap the least significant word with the most significant word when the code reads those registers from the h/w.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
---
 arch/arm64/kvm/hyp.S | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 0620691..5035b41 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -415,10 +415,17 @@ CPU_BE(	rev	w11, w11 )
 	str	w4, [x3, #VGIC_CPU_HCR]
 	str	w5, [x3, #VGIC_CPU_VMCR]
 	str	w6, [x3, #VGIC_CPU_MISR]
+#ifndef CONFIG_CPU_BIG_ENDIAN
 	str	w7, [x3, #VGIC_CPU_EISR]
 	str	w8, [x3, #(VGIC_CPU_EISR + 4)]
 	str	w9, [x3, #VGIC_CPU_ELRSR]
 	str	w10, [x3, #(VGIC_CPU_ELRSR + 4)]
+#else
+	str	w7, [x3, #(VGIC_CPU_EISR + 4)]
+	str	w8, [x3, #VGIC_CPU_EISR]
+	str	w9, [x3, #(VGIC_CPU_ELRSR + 4)]
+	str	w10, [x3, #VGIC_CPU_ELRSR]
+#endif
 	str	w11, [x3, #VGIC_CPU_APR]
/* Clear GICH_HCR */
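The reason the BE case needs the swapped store offsets can be illustrated with a small endian-model sketch in C. This is hypothetical demonstration code, not the hyp assembly:

```c
#include <stdint.h>

/*
 * Illustrative model of the save path above: the two 32-bit halves of
 * ELRSR (lo = bits 0..31, hi = bits 32..63) are stored into an 8-byte
 * area that the host later reads back as one 64-bit bitmap.  'be'
 * selects big-endian memory order and, when set, the swapped store
 * offsets introduced by the patch.
 */
static void store32(uint8_t *p, uint32_t v, int be)
{
	int i;

	for (i = 0; i < 4; i++)
		p[i] = (uint8_t)(v >> (be ? 24 - 8 * i : 8 * i));
}

static uint64_t load64(const uint8_t *p, int be)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v |= (uint64_t)p[i] << (be ? 56 - 8 * i : 8 * i);
	return v;
}

uint64_t elrsr_bitmap(uint32_t lo, uint32_t hi, int be)
{
	uint8_t mem[8];

	/* LE: lo at +0, hi at +4; BE: offsets swapped, as in the patch */
	store32(mem + (be ? 4 : 0), lo, be);
	store32(mem + (be ? 0 : 4), hi, be);
	return load64(mem, be);
}
```

With the swapped offsets, both memory orders yield the same 64-bit bitmap value, so a for_each_set_bit() walk over it sees identical contents on LE and BE.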
On Tue, May 13, 2014 at 09:14:04AM -0700, Victor Kamensky wrote:
On arm64, 'u32 vgic_eisr[2];' and 'u32 vgic_elrsr[2]' are accessed through an 'unsigned long *', i.e. as a single 64-bit bitmap. So we need to swap the least significant word with the most significant word when the code reads those registers from the h/w.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
arch/arm64/kvm/hyp.S | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 0620691..5035b41 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -415,10 +415,17 @@ CPU_BE(	rev	w11, w11 )
 	str	w4, [x3, #VGIC_CPU_HCR]
 	str	w5, [x3, #VGIC_CPU_VMCR]
 	str	w6, [x3, #VGIC_CPU_MISR]
+#ifndef CONFIG_CPU_BIG_ENDIAN
 	str	w7, [x3, #VGIC_CPU_EISR]
 	str	w8, [x3, #(VGIC_CPU_EISR + 4)]
 	str	w9, [x3, #VGIC_CPU_ELRSR]
 	str	w10, [x3, #(VGIC_CPU_ELRSR + 4)]
+#else
+	str	w7, [x3, #(VGIC_CPU_EISR + 4)]
+	str	w8, [x3, #VGIC_CPU_EISR]
+	str	w9, [x3, #(VGIC_CPU_ELRSR + 4)]
+	str	w10, [x3, #VGIC_CPU_ELRSR]
+#endif
 	str	w11, [x3, #VGIC_CPU_APR]

 /* Clear GICH_HCR */
--
1.8.1.4
Hmm, the data structure doesn't actually clearly define this as bitmap, but the architecture clearly defines this as two registers. It feels like this critical code is getting over complicated to cater for the fact that we can use a convenient for_each_set_bit() function when we process a maintenance interrupt.
Marc, what are your thoughts on this? Change vgic_process_maintenance and clearly document the array on struct vgic_cpu, or go ahead with this change?
-Christoffer
On 26/05/14 18:35, Christoffer Dall wrote:
On Tue, May 13, 2014 at 09:14:04AM -0700, Victor Kamensky wrote:
On arm64, 'u32 vgic_eisr[2];' and 'u32 vgic_elrsr[2]' are accessed through an 'unsigned long *', i.e. as a single 64-bit bitmap. So we need to swap the least significant word with the most significant word when the code reads those registers from the h/w.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
arch/arm64/kvm/hyp.S | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 0620691..5035b41 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -415,10 +415,17 @@ CPU_BE(	rev	w11, w11 )
 	str	w4, [x3, #VGIC_CPU_HCR]
 	str	w5, [x3, #VGIC_CPU_VMCR]
 	str	w6, [x3, #VGIC_CPU_MISR]
+#ifndef CONFIG_CPU_BIG_ENDIAN
 	str	w7, [x3, #VGIC_CPU_EISR]
 	str	w8, [x3, #(VGIC_CPU_EISR + 4)]
 	str	w9, [x3, #VGIC_CPU_ELRSR]
 	str	w10, [x3, #(VGIC_CPU_ELRSR + 4)]
+#else
+	str	w7, [x3, #(VGIC_CPU_EISR + 4)]
+	str	w8, [x3, #VGIC_CPU_EISR]
+	str	w9, [x3, #(VGIC_CPU_ELRSR + 4)]
+	str	w10, [x3, #VGIC_CPU_ELRSR]
+#endif
 	str	w11, [x3, #VGIC_CPU_APR]

 /* Clear GICH_HCR */
--
1.8.1.4
Hmm, the data structure doesn't actually clearly define this as bitmap, but the architecture clearly defines this as two registers. It feels like this critical code is getting over complicated to cater for the fact that we can use a convenient for_each_set_bit() function when we process a maintenance interrupt.
Marc, what are your thoughts on this? Change vgic_process_maintenance and clearly document the array on struct vgic_cpu, or go ahead with this change?
So I've already started abstracting access to ELRSR and EISR in my GICv3 patches, and we could perfectly stick the code swapping code in the accessors, letting the world switch alone.
These changes could easily be lifted from the series, and adapted. This would also make the "bitmap-ification" more obvious.
M.
This patch addresses the issue of reading and writing V8 sys registers in the BE case. Since the only register size these functions deal with is 8 bytes, the existing code works in both the little and big endian cases. Removed the comment about little endian. Added a BUG_ON that the register size is always 8 bytes.
If these functions ever needed to support both 8 byte and 4 byte register sizes, to deal with them in an endian-agnostic way the code should do something along these lines:
	unsigned long regsize = KVM_REG_SIZE(id);
	union {
		u32 word;
		u64 dword;
	} tmp = {0};

	if (copy_from_user(&tmp, uaddr, regsize) != 0)
		return -EFAULT;
	switch (regsize) {
	case 4:
		*val = tmp.word;
		break;
	case 8:
		*val = tmp.dword;
		break;
	}
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
---
 arch/arm64/kvm/sys_regs.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 0324458..060c3a9 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -776,18 +776,25 @@ static struct sys_reg_desc invariant_sys_regs[] = {
 	  NULL, get_ctr_el0 },
 };

-static int reg_from_user(void *val, const void __user *uaddr, u64 id)
+static int reg_from_user(u64 *val, const void __user *uaddr, u64 id)
 {
-	/* This Just Works because we are little endian. */
-	if (copy_from_user(val, uaddr, KVM_REG_SIZE(id)) != 0)
+	unsigned long regsize = KVM_REG_SIZE(id);
+
+	BUG_ON(regsize != 8);
+
+	if (copy_from_user(val, uaddr, regsize) != 0)
 		return -EFAULT;
+
 	return 0;
 }

-static int reg_to_user(void __user *uaddr, const void *val, u64 id)
+static int reg_to_user(void __user *uaddr, const u64 *val, u64 id)
 {
-	/* This Just Works because we are little endian. */
-	if (copy_to_user(uaddr, val, KVM_REG_SIZE(id)) != 0)
+	unsigned long regsize = KVM_REG_SIZE(id);
+
+	BUG_ON(regsize != 8);
+
+	if (copy_to_user(uaddr, val, regsize) != 0)
 		return -EFAULT;
 	return 0;
 }
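The endian-agnostic union approach sketched in the commit message can be written as a standalone, testable function. Names here are illustrative, not the kernel API, and a plain memcpy stands in for copy_from_user so the sketch runs in userspace:

```c
#include <stdint.h>
#include <string.h>

/*
 * Illustrative stand-in for reg_from_user: copy a 4- or 8-byte
 * register image into a u64 without relying on host byte order.
 * The union overlays the 32-bit and 64-bit views at offset 0,
 * so reading back through the correctly sized member works the
 * same on LE and BE hosts.
 */
int reg_copy_in(uint64_t *val, const void *src, size_t regsize)
{
	union {
		uint32_t word;
		uint64_t dword;
	} tmp = {0};

	if (regsize != 4 && regsize != 8)
		return -1;	/* reject sizes the ABI does not define */

	memcpy(&tmp, src, regsize);
	*val = (regsize == 4) ? tmp.word : tmp.dword;
	return 0;
}
```

Rejecting unexpected sizes with an error return, rather than trusting the caller, matches how a kernel entry point would have to treat a userspace-supplied size.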
Hi Victor,
On Tue, May 13 2014 at 5:14:05 pm BST, Victor Kamensky victor.kamensky@linaro.org wrote:
This patch addresses the issue of reading and writing V8 sys registers in the BE case. Since the only register size these functions deal with is 8 bytes, the existing code works in both the little and big endian cases. Removed the comment about little endian. Added a BUG_ON that the register size is always 8 bytes.
If these functions ever needed to support both 8 byte and 4 byte register sizes, to deal with them in an endian-agnostic way the code should do something along these lines:
	unsigned long regsize = KVM_REG_SIZE(id);
	union {
		u32 word;
		u64 dword;
	} tmp = {0};

	if (copy_from_user(&tmp, uaddr, regsize) != 0)
		return -EFAULT;
	switch (regsize) {
	case 4:
		*val = tmp.word;
		break;
	case 8:
		*val = tmp.dword;
		break;
	}
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
arch/arm64/kvm/sys_regs.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 0324458..060c3a9 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -776,18 +776,25 @@ static struct sys_reg_desc invariant_sys_regs[] = {
 	  NULL, get_ctr_el0 },
 };

-static int reg_from_user(void *val, const void __user *uaddr, u64 id)
+static int reg_from_user(u64 *val, const void __user *uaddr, u64 id)
 {
-	/* This Just Works because we are little endian. */
-	if (copy_from_user(val, uaddr, KVM_REG_SIZE(id)) != 0)
+	unsigned long regsize = KVM_REG_SIZE(id);
+
+	BUG_ON(regsize != 8);
I haven't had time to review this series just yet, but this bit just sends shivers down my spine.
regsize is derived from id, which comes from a struct one_reg, which is directly provided by userspace. Here, you're trusting the luser to give you 8 as a size, and panic the kernel if not.
As much as I'd like to qualify this as only being a slightly undesirable effect, I think it deserves a NAK.
Thanks,
M.
On 14 May 2014 01:45, Marc Zyngier marc.zyngier@arm.com wrote:
Hi Victor,
On Tue, May 13 2014 at 5:14:05 pm BST, Victor Kamensky victor.kamensky@linaro.org wrote:
This patch addresses the issue of reading and writing V8 sys registers in the BE case. Since the only register size these functions deal with is 8 bytes, the existing code works in both the little and big endian cases. Removed the comment about little endian. Added a BUG_ON that the register size is always 8 bytes.
If these functions ever needed to support both 8 byte and 4 byte register sizes, to deal with them in an endian-agnostic way the code should do something along these lines:
	unsigned long regsize = KVM_REG_SIZE(id);
	union {
		u32 word;
		u64 dword;
	} tmp = {0};

	if (copy_from_user(&tmp, uaddr, regsize) != 0)
		return -EFAULT;
	switch (regsize) {
	case 4:
		*val = tmp.word;
		break;
	case 8:
		*val = tmp.dword;
		break;
	}
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
arch/arm64/kvm/sys_regs.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c index 0324458..060c3a9 100644 --- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -776,18 +776,25 @@ static struct sys_reg_desc invariant_sys_regs[] = { NULL, get_ctr_el0 }, };
-static int reg_from_user(void *val, const void __user *uaddr, u64 id)
+static int reg_from_user(u64 *val, const void __user *uaddr, u64 id)
 {
-	/* This Just Works because we are little endian. */
-	if (copy_from_user(val, uaddr, KVM_REG_SIZE(id)) != 0)
+	unsigned long regsize = KVM_REG_SIZE(id);
+
+	BUG_ON(regsize != 8);
I haven't had time to review this series just yet, but this bit just sends shivers down my spine.
regsize is derived from id, which comes from a struct one_reg, which is directly provided by userspace. Here, you're trusting the luser to give you 8 as a size, and panic the kernel if not.
As much as I'd like to qualify this as only being a slightly undesirable effect, I think it deserves a NAK.
Fair enough, I agree. Good catch! I was following Christoffer's comments at [1], but I had not thought it through. Please advise: should I go back to the previous version as in [2], or just ignore any sizes other than 8 without a BUG_ON?
Thanks, Victor
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-March/241815.html [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-February/231891.h...
Thanks,
M.
-- Jazz is not dead. It just smells funny.
On Wed, May 14, 2014 at 07:18:26AM -0700, Victor Kamensky wrote:
On 14 May 2014 01:45, Marc Zyngier marc.zyngier@arm.com wrote:
[...]
-static int reg_from_user(void *val, const void __user *uaddr, u64 id)
+static int reg_from_user(u64 *val, const void __user *uaddr, u64 id)
 {
-	/* This Just Works because we are little endian. */
-	if (copy_from_user(val, uaddr, KVM_REG_SIZE(id)) != 0)
+	unsigned long regsize = KVM_REG_SIZE(id);
+
+	BUG_ON(regsize != 8);
I haven't had time to review this series just yet, but this bit just sends shivers down my spine.
regsize is derived from id, which comes from a struct one_reg, which is directly provided by userspace. Here, you're trusting the luser to give you 8 as a size, and panic the kernel if not.
As much as I'd like to qualify this as only being a slightly undesirable effect, I think it deserves a NAK.
Fair enough, I agree. Good catch! I was following Christoffer's comments at [1], but I had not thought it through. Please advise: should I go back to the previous version as in [2], or just ignore any sizes other than 8 without a BUG_ON?
Thanks, Victor
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-March/241815.html [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-February/231891.h...
If the ABI doesn't define an ID for your arch (which is what I was saying in my comment), simply return -EINVAL, but don't do BUG_ON(...).
-Christoffer
Fix isssue with 32bit guests running on top of BE KVM host. Guest state is retored with double word read operations. Within the high and low word data is already byteswap. This code effectively swaps two words within 64bit value.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
---
 arch/arm64/kvm/sys_regs.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 060c3a9..4438b47 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -51,6 +51,16 @@ static u32 cache_levels;
 /* CSSELR values; used to index KVM_REG_ARM_DEMUX_ID_CCSIDR */
 #define CSSELR_MAX 12

+/* Word access indexes for 64bit vm accessor */
+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define CP15_REG_MSW_INDEX	0
+#define CP15_REG_LSW_INDEX	1
+#else
+#define CP15_REG_MSW_INDEX	1
+#define CP15_REG_LSW_INDEX	0
+#endif
+
+
 /* Which cache CCSIDR represents depends on CSSELR value. */
 static u32 get_ccsidr(u32 csselr)
 {
@@ -137,9 +147,9 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
 	if (!p->is_aarch32) {
 		vcpu_sys_reg(vcpu, r->reg) = val;
 	} else {
-		vcpu_cp15(vcpu, r->reg) = val & 0xffffffffUL;
+		vcpu_cp15(vcpu, r->reg + CP15_REG_LSW_INDEX) = val & 0xffffffffUL;
 		if (!p->is_32bit)
-			vcpu_cp15(vcpu, r->reg + 1) = val >> 32;
+			vcpu_cp15(vcpu, r->reg + CP15_REG_MSW_INDEX) = val >> 32;
 	}
 	return true;
 }
On Tue, May 13, 2014 at 09:14:06AM -0700, Victor Kamensky wrote:
Fix isssue with 32bit guests running on top of BE KVM host. Guest state is retored with double word read operations. Within the high
restored (spell check should catch this).
"Guest state is restored with double word read operations." I don't know what this sentence is supposed to tell me.
and low word data is already byteswap. This code effectively swaps
"data is already byteswap" is not English. data is already byteswapped?
two words within 64bit value.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
arch/arm64/kvm/sys_regs.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 060c3a9..4438b47 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -51,6 +51,16 @@ static u32 cache_levels;
 /* CSSELR values; used to index KVM_REG_ARM_DEMUX_ID_CCSIDR */
 #define CSSELR_MAX 12

+/* Word access indexes for 64bit vm accessor */
+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define CP15_REG_MSW_INDEX	0
+#define CP15_REG_LSW_INDEX	1
+#else
+#define CP15_REG_MSW_INDEX	1
+#define CP15_REG_LSW_INDEX	0
+#endif
+
 /* Which cache CCSIDR represents depends on CSSELR value. */
 static u32 get_ccsidr(u32 csselr)
 {
@@ -137,9 +147,9 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
 	if (!p->is_aarch32) {
 		vcpu_sys_reg(vcpu, r->reg) = val;
 	} else {
-		vcpu_cp15(vcpu, r->reg) = val & 0xffffffffUL;
+		vcpu_cp15(vcpu, r->reg + CP15_REG_LSW_INDEX) = val & 0xffffffffUL;
 		if (!p->is_32bit)
-			vcpu_cp15(vcpu, r->reg + 1) = val >> 32;
+			vcpu_cp15(vcpu, r->reg + CP15_REG_MSW_INDEX) = val >> 32;
 	}
 	return true;
 }
1.8.1.4
I really don't like this. If anything I feel like it should be abstracted inside vcpu_cp15, but wouldn't it be cleaner to do something along the lines of:
	u64 *regstore = (u64 *)vcpu->arch.ctxt.cp15[r->reg];

	if (p->is_32bit)
		val &= 0xffffffffUL;
	*regstore = val;
But I think the thing that bothers me in general with all these patches is that they deal specially with a lot of situations where the data structure was designed specifically to make the code easy to read, and now it just becomes a really complicated mess. Have you carefully considered other options, redesigning the data structure layout etc.?
-Christoffer
On 26 May 2014 10:52, Christoffer Dall christoffer.dall@linaro.org wrote:
On Tue, May 13, 2014 at 09:14:06AM -0700, Victor Kamensky wrote:
Fix isssue with 32bit guests running on top of BE KVM host. Guest state is retored with double word read operations. Within the high
restored (spell check should catch this).
"Guest state is restored with double word read operations." I don't know what this sentence is supposed to tell me.
and low word data is already byteswap. This code effectively swaps
"data is already byteswap" is not English. data is already byteswapped?
two words within 64bit value.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
arch/arm64/kvm/sys_regs.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 060c3a9..4438b47 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -51,6 +51,16 @@ static u32 cache_levels;
 /* CSSELR values; used to index KVM_REG_ARM_DEMUX_ID_CCSIDR */
 #define CSSELR_MAX 12

+/* Word access indexes for 64bit vm accessor */
+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define CP15_REG_MSW_INDEX	0
+#define CP15_REG_LSW_INDEX	1
+#else
+#define CP15_REG_MSW_INDEX	1
+#define CP15_REG_LSW_INDEX	0
+#endif
+
 /* Which cache CCSIDR represents depends on CSSELR value. */
 static u32 get_ccsidr(u32 csselr)
 {
@@ -137,9 +147,9 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
 	if (!p->is_aarch32) {
 		vcpu_sys_reg(vcpu, r->reg) = val;
 	} else {
-		vcpu_cp15(vcpu, r->reg) = val & 0xffffffffUL;
+		vcpu_cp15(vcpu, r->reg + CP15_REG_LSW_INDEX) = val & 0xffffffffUL;
 		if (!p->is_32bit)
-			vcpu_cp15(vcpu, r->reg + 1) = val >> 32;
+			vcpu_cp15(vcpu, r->reg + CP15_REG_MSW_INDEX) = val >> 32;
 	}
 	return true;
 }
1.8.1.4
I really don't like this. If anything I feel like it should be abstracted inside vcpu_cp15,
good point, please see revised proposal below
but wouldn't it be cleaner to do something along the lines of:
	u64 *regstore = (u64 *)vcpu->arch.ctxt.cp15[r->reg];

	if (p->is_32bit)
		val &= 0xffffffffUL;
	*regstore = val;
I don't think above will be correct. The way I read it the following hypothetical instructions sequence
	mcrr	p15, 0, r6, r7, c2	@ TTBR 0
	mcr	p15, 0, r6, c2, c0, 0	@ 32bit TTBR 0
will get TTBR 0 high word to 0, when mrc instruction executes, but it should be left unchanged in case of 'p->is_32bit'
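The behaviour Victor describes can be modeled in a few lines of C. This is a hypothetical sketch, not the KVM code: a 32-bit mcr-style write must update only the low word and leave the high word of the 64-bit register untouched:

```c
#include <stdint.h>

/*
 * Illustrative model: a 64-bit cp15 register is written either as a
 * whole (mcrr) or via its low 32 bits only (mcr).  The 32-bit write
 * must preserve the high word rather than zero it.
 */
void cp15_model_write(uint64_t *reg, uint64_t val, int is_32bit)
{
	if (is_32bit)
		*reg = (*reg & 0xffffffff00000000ULL) | (val & 0xffffffffULL);
	else
		*reg = val;
}
```

A 64-bit write of 0x1111111122222222 followed by a 32-bit write of 0x33333333 leaves the register at 0x1111111133333333, whereas masking val and storing the full 64 bits would incorrectly zero the high word.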
How about something like the following? Is it a move in the right direction?
From b0a7793b03d9c62f7b9c53a317cb2d19a75c935b Mon Sep 17 00:00:00 2001
From: Victor Kamensky victor.kamensky@linaro.org Date: Mon, 12 May 2014 13:57:21 -0700 Subject: [PATCH] ARM64: KVM: fix big endian issue in access_vm_reg for 32bit guest
Fix issue with 32bit guests running on top of BE KVM host. Indexes of high and low words of 64bit cp15 register are swapped in case of big endian code, since 64bit cp15 state is restored or saved with double word write or read instruction.
Define helper macros to access high low words of 64bit cp15 register.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
---
 arch/arm64/include/asm/kvm_host.h | 8 ++++++++
 arch/arm64/kvm/sys_regs.c         | 4 ++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 0a1d697..e9d2e11 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -140,6 +140,14 @@ struct kvm_vcpu_arch {
 #define vcpu_sys_reg(v,r)	((v)->arch.ctxt.sys_regs[(r)])
 #define vcpu_cp15(v,r)		((v)->arch.ctxt.cp15[(r)])

+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define vcpu_cp15_64_high(v,r)	((v)->arch.ctxt.cp15[((r) + 0)])
+#define vcpu_cp15_64_low(v,r)	((v)->arch.ctxt.cp15[((r) + 1)])
+#else
+#define vcpu_cp15_64_high(v,r)	((v)->arch.ctxt.cp15[((r) + 1)])
+#define vcpu_cp15_64_low(v,r)	((v)->arch.ctxt.cp15[((r) + 0)])
+#endif
+
 struct kvm_vm_stat {
 	u32 remote_tlb_flush;
 };
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index a13e7e7..b243e07 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -137,9 +137,9 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
 	if (!p->is_aarch32) {
 		vcpu_sys_reg(vcpu, r->reg) = val;
 	} else {
-		vcpu_cp15(vcpu, r->reg) = val & 0xffffffffUL;
+		vcpu_cp15_64_low(vcpu, r->reg) = val & 0xffffffffUL;
 		if (!p->is_32bit)
-			vcpu_cp15(vcpu, r->reg + 1) = val >> 32;
+			vcpu_cp15_64_high(vcpu, r->reg) = val >> 32;
 	}
 	return true;
 }
On Tue, May 27, 2014 at 11:11:57PM -0700, Victor Kamensky wrote:
On 26 May 2014 10:52, Christoffer Dall christoffer.dall@linaro.org wrote:
On Tue, May 13, 2014 at 09:14:06AM -0700, Victor Kamensky wrote:
Fix isssue with 32bit guests running on top of BE KVM host. Guest state is retored with double word read operations. Within the high
restored (spell check should catch this).
"Guest state is restored with double word read operations." I don't know what this sentence is supposed to tell me.
and low word data is already byteswap. This code effectively swaps
"data is already byteswap" is not English. data is already byteswapped?
two words within 64bit value.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
arch/arm64/kvm/sys_regs.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 060c3a9..4438b47 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -51,6 +51,16 @@ static u32 cache_levels;
 /* CSSELR values; used to index KVM_REG_ARM_DEMUX_ID_CCSIDR */
 #define CSSELR_MAX 12

+/* Word access indexes for 64bit vm accessor */
+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define CP15_REG_MSW_INDEX	0
+#define CP15_REG_LSW_INDEX	1
+#else
+#define CP15_REG_MSW_INDEX	1
+#define CP15_REG_LSW_INDEX	0
+#endif
+
 /* Which cache CCSIDR represents depends on CSSELR value. */
 static u32 get_ccsidr(u32 csselr)
 {
@@ -137,9 +147,9 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
 	if (!p->is_aarch32) {
 		vcpu_sys_reg(vcpu, r->reg) = val;
 	} else {
-		vcpu_cp15(vcpu, r->reg) = val & 0xffffffffUL;
+		vcpu_cp15(vcpu, r->reg + CP15_REG_LSW_INDEX) = val & 0xffffffffUL;
 		if (!p->is_32bit)
-			vcpu_cp15(vcpu, r->reg + 1) = val >> 32;
+			vcpu_cp15(vcpu, r->reg + CP15_REG_MSW_INDEX) = val >> 32;
 	}
 	return true;
 }
1.8.1.4
I really don't like this. If anything I feel like it should be abstracted inside vcpu_cp15,
good point, please see revised proposal below
but wouldn't it be cleaner to do something along the lines of:
	u64 *regstore = (u64 *)vcpu->arch.ctxt.cp15[r->reg];

	if (p->is_32bit)
		val &= 0xffffffffUL;
	*regstore = val;
I don't think above will be correct. The way I read it the following hypothetical instructions sequence
	mcrr	p15, 0, r6, r7, c2	@ TTBR 0
	mcr	p15, 0, r6, c2, c0, 0	@ 32bit TTBR 0
will get TTBR 0 high word to 0, when mrc instruction executes, but it should be left unchanged in case of 'p->is_32bit'
How about something like the following? Is it a move in the right direction?
From b0a7793b03d9c62f7b9c53a317cb2d19a75c935b Mon Sep 17 00:00:00 2001 From: Victor Kamensky victor.kamensky@linaro.org Date: Mon, 12 May 2014 13:57:21 -0700 Subject: [PATCH] ARM64: KVM: fix big endian issue in access_vm_reg for 32bit guest
Fix issue with 32bit guests running on top of BE KVM host. Indexes of high and low words of 64bit cp15 register are swapped in case of big endian code, since 64bit cp15 state is restored or saved with double word write or read instruction.
Define helper macros to access high low words of 64bit cp15 register.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
 arch/arm64/include/asm/kvm_host.h | 8 ++++++++
 arch/arm64/kvm/sys_regs.c         | 4 ++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 0a1d697..e9d2e11 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -140,6 +140,14 @@ struct kvm_vcpu_arch {
 #define vcpu_sys_reg(v,r)	((v)->arch.ctxt.sys_regs[(r)])
 #define vcpu_cp15(v,r)		((v)->arch.ctxt.cp15[(r)])

+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define vcpu_cp15_64_high(v,r)	((v)->arch.ctxt.cp15[((r) + 0)])
+#define vcpu_cp15_64_low(v,r)	((v)->arch.ctxt.cp15[((r) + 1)])
+#else
+#define vcpu_cp15_64_high(v,r)	((v)->arch.ctxt.cp15[((r) + 1)])
+#define vcpu_cp15_64_low(v,r)	((v)->arch.ctxt.cp15[((r) + 0)])
+#endif
+
 struct kvm_vm_stat {
 	u32 remote_tlb_flush;
 };
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index a13e7e7..b243e07 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -137,9 +137,9 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
 	if (!p->is_aarch32) {
 		vcpu_sys_reg(vcpu, r->reg) = val;
 	} else {
-		vcpu_cp15(vcpu, r->reg) = val & 0xffffffffUL;
+		vcpu_cp15_64_low(vcpu, r->reg) = val & 0xffffffffUL;
 		if (!p->is_32bit)
-			vcpu_cp15(vcpu, r->reg + 1) = val >> 32;
+			vcpu_cp15_64_high(vcpu, r->reg) = val >> 32;
 	}
 	return true;
 }
1.8.1.4
This is definitely much better than the original patch.
But I think the thing that bothers me in general with all these patches is that they deal specially with a lot of situations where the data structure was designed specifically to make the code easy to read, and now it just becomes a really complicated mess. Have you carefully considered other options, redesigning the data structure layout etc.?
I have considered different options and I am open to suggestions. Byteswaps can often be done in different places and achieve the same result. In several cases I initially developed a patch that deals with a BE issue in one way and then reworked it to be more compact and less intrusive. For example, in this particular case the order of the high and low words of a 64bit cp15 register could be kept the same in the BE/LE cases, but the code that saves/restores it could be changed to do a byteswap. In my opinion the proposed option is cleaner.
What I was thinking is to avoid packing 64-bit values as two 32-bit values, so for example having two separate arrays on the kvm_cpu_context struct, one for 32-bit coprocessor registers and one for 64-bit coprocessor registers.
That would sort of follow Benjamin Herrenschmidt's mantra about always using correctly typed values everywhere to avoid endianness issues.
Did you look at that approach? Does it blow up?
-Christoffer
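A minimal sketch of the separately-typed layout suggested above (all names and array sizes are hypothetical, not the actual kvm_cpu_context):

```c
#include <stdint.h>

/*
 * Illustrative sketch: keep 32-bit and 64-bit coprocessor registers
 * in separately typed arrays so every register is accessed at its
 * natural width and no endian-dependent word indexing is needed.
 */
enum { NR_CP15_32 = 8, NR_CP15_64 = 4 };	/* sizes are made up */

struct cp15_file {
	uint32_t regs32[NR_CP15_32];
	uint64_t regs64[NR_CP15_64];
};

/* a 32-bit mcr-style write only ever touches a u32 slot */
void cp15_write32(struct cp15_file *f, int idx, uint32_t val)
{
	f->regs32[idx] = val;
}

/* a 64-bit mcrr-style write only ever touches a u64 slot */
void cp15_write64(struct cp15_file *f, int idx, uint64_t val)
{
	f->regs64[idx] = val;
}
```

Because each register lives in a slot of its own width, no BE/LE index swizzling is required; the cost is that registers aliased between the 32-bit and 64-bit views would need an explicit mapping between the two arrays.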
On 28 May 2014 02:14, Christoffer Dall christoffer.dall@linaro.org wrote:
On Tue, May 27, 2014 at 11:11:57PM -0700, Victor Kamensky wrote:
On 26 May 2014 10:52, Christoffer Dall christoffer.dall@linaro.org wrote:
On Tue, May 13, 2014 at 09:14:06AM -0700, Victor Kamensky wrote:
Fix isssue with 32bit guests running on top of BE KVM host. Guest state is retored with double word read operations. Within the high
restored (spell check should catch this).
"Guest state is restored with double word read operations." I don't know what this sentence is supposed to tell me.
and low word data is already byteswap. This code effectively swaps
"data is already byteswap" is not English. data is already byteswapped?
two words within 64bit value.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
arch/arm64/kvm/sys_regs.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 060c3a9..4438b47 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -51,6 +51,16 @@ static u32 cache_levels;
 /* CSSELR values; used to index KVM_REG_ARM_DEMUX_ID_CCSIDR */
 #define CSSELR_MAX 12

+/* Word access indexes for 64bit vm accessor */
+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define CP15_REG_MSW_INDEX	0
+#define CP15_REG_LSW_INDEX	1
+#else
+#define CP15_REG_MSW_INDEX	1
+#define CP15_REG_LSW_INDEX	0
+#endif
+
 /* Which cache CCSIDR represents depends on CSSELR value. */
 static u32 get_ccsidr(u32 csselr)
 {
@@ -137,9 +147,9 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
 	if (!p->is_aarch32) {
 		vcpu_sys_reg(vcpu, r->reg) = val;
 	} else {
-		vcpu_cp15(vcpu, r->reg) = val & 0xffffffffUL;
+		vcpu_cp15(vcpu, r->reg + CP15_REG_LSW_INDEX) = val & 0xffffffffUL;
 		if (!p->is_32bit)
-			vcpu_cp15(vcpu, r->reg + 1) = val >> 32;
+			vcpu_cp15(vcpu, r->reg + CP15_REG_MSW_INDEX) = val >> 32;
 	}
 	return true;
 }
1.8.1.4
I really don't like this. If anything I feel like it should be abstracted inside vcpu_cp15,
good point, please see revised proposal below
but wouldn't it be cleaner to do something along the lines of:
	u64 *regstore = (u64 *)vcpu->arch.ctxt.cp15[r->reg];

	if (p->is_32bit)
		val &= 0xffffffffUL;
	*regstore = val;
I don't think above will be correct. The way I read it the following hypothetical instructions sequence
	mcrr	p15, 0, r6, r7, c2	@ TTBR 0
	mcr	p15, 0, r6, c2, c0, 0	@ 32bit TTBR 0
will get TTBR 0 high word to 0, when mrc instruction executes, but it should be left unchanged in case of 'p->is_32bit'
How about something like the following? Is it a move in the right direction?
From b0a7793b03d9c62f7b9c53a317cb2d19a75c935b Mon Sep 17 00:00:00 2001 From: Victor Kamensky victor.kamensky@linaro.org Date: Mon, 12 May 2014 13:57:21 -0700 Subject: [PATCH] ARM64: KVM: fix big endian issue in access_vm_reg for 32bit guest
Fix issue with 32bit guests running on top of BE KVM host. Indexes of high and low words of 64bit cp15 register are swapped in case of big endian code, since 64bit cp15 state is restored or saved with double word write or read instruction.
Define helper macros to access high low words of 64bit cp15 register.
Signed-off-by: Victor Kamensky victor.kamensky@linaro.org
 arch/arm64/include/asm/kvm_host.h | 8 ++++++++
 arch/arm64/kvm/sys_regs.c         | 4 ++--
 2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 0a1d697..e9d2e11 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -140,6 +140,14 @@ struct kvm_vcpu_arch {
 #define vcpu_sys_reg(v,r)	((v)->arch.ctxt.sys_regs[(r)])
 #define vcpu_cp15(v,r)		((v)->arch.ctxt.cp15[(r)])

+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define vcpu_cp15_64_high(v,r)	((v)->arch.ctxt.cp15[((r) + 0)])
+#define vcpu_cp15_64_low(v,r)	((v)->arch.ctxt.cp15[((r) + 1)])
+#else
+#define vcpu_cp15_64_high(v,r)	((v)->arch.ctxt.cp15[((r) + 1)])
+#define vcpu_cp15_64_low(v,r)	((v)->arch.ctxt.cp15[((r) + 0)])
+#endif
+
 struct kvm_vm_stat {
 	u32 remote_tlb_flush;
 };

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index a13e7e7..b243e07 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -137,9 +137,9 @@ static bool access_vm_reg(struct kvm_vcpu *vcpu,
 	if (!p->is_aarch32) {
 		vcpu_sys_reg(vcpu, r->reg) = val;
 	} else {
-		vcpu_cp15(vcpu, r->reg) = val & 0xffffffffUL;
+		vcpu_cp15_64_low(vcpu, r->reg) = val & 0xffffffffUL;
 		if (!p->is_32bit)
-			vcpu_cp15(vcpu, r->reg + 1) = val >> 32;
+			vcpu_cp15_64_high(vcpu, r->reg) = val >> 32;
 	}
 	return true;
 }
1.8.1.4
This is definitely much better than the original patch.
But I think the thing that bothers me in general with all these patches is that they deal specially with a lot of situations where the data structure was designed specifically to make the code easy to read, and now it just becomes a really complicated mess. Have you carefully considered other options, redesigning the data structure layout etc.?
I have considered different options and I am open to suggestions. Byte swaps can often be done in different places and achieve the same result. In several cases I initially developed a patch that dealt with a BE issue in one way, then reworked it to be more compact and less intrusive. For example, in this particular case the order of the high and low words of a 64bit cp15 register could be kept the same in the BE and LE cases, and the code that saves/restores it could be changed to do a byteswap instead. My opinion is that the proposed option is cleaner.
What I was thinking is to avoid packing 64-bit values as two 32-bit values, so for example having two separate arrays on the kvm_cpu_context struct, one for 32-bit coprocessor registers and one for 64-bit coprocessor registers.
They cannot be separate: it is the same data with two ways to access it, as a 64bit value or as the low 32bit word. Typically these are the LPAE cp15 memory related registers like TTBR0, as in the mcrr/mcr example in my previous response. mcrr updates both the high and low words of TTBR0, while mcr updates only its low word.
Thanks, Victor
That would sort of follow Benjamin Herrenschmidt's mantra about always using the correctly typed values everywhere to avoid endianness issues.
Did you look at that approach? Does it blow up?
-Christoffer
On Wed, May 28, 2014 at 06:56:44AM -0700, Victor Kamensky wrote:
On 28 May 2014 02:14, Christoffer Dall christoffer.dall@linaro.org wrote:
On Tue, May 27, 2014 at 11:11:57PM -0700, Victor Kamensky wrote:
On 26 May 2014 10:52, Christoffer Dall christoffer.dall@linaro.org wrote:
On Tue, May 13, 2014 at 09:14:06AM -0700, Victor Kamensky wrote:
[...]
But I think the thing that bothers me in general with all these patches is that they deal specially with a lot of situations where the data structure was designed specifically to make the code easy to read, and now it just becomes a really complicated mess. Have you carefully considered other options, redesigning the data structure layout etc.?
I have considered different options and I am open to suggestions. Byte swaps can often be done in different places and achieve the same result. In several cases I initially developed a patch that dealt with a BE issue in one way, then reworked it to be more compact and less intrusive. For example, in this particular case the order of the high and low words of a 64bit cp15 register could be kept the same in the BE and LE cases, and the code that saves/restores it could be changed to do a byteswap instead. My opinion is that the proposed option is cleaner.
What I was thinking is to avoid packing 64-bit values as two 32-bit values, so for example having two separate arrays on the kvm_cpu_context struct, one for 32-bit coprocessor registers and one for 64-bit coprocessor registers.
They cannot be separate: it is the same data with two ways to access it, as a 64bit value or as the low 32bit word. Typically these are the LPAE cp15 memory related registers like TTBR0, as in the mcrr/mcr example in my previous response. mcrr updates both the high and low words of TTBR0, while mcr updates only its low word.
You just have to decide whether you want to represent a 64-bit register as a concatenation of two u32's or a 32-bit register access as a part of the 64-bit register value.
I think the architecture, at least for 32-bit register views on ARMv8, defines the 32-bit accesses as a view on the 64-bit register.
You should of course only store those registers once, but instead of storing them in an array of u32's you could store the 64-bit wide registers in a separate array of u64's. 32-bit accesses to those registers would look in the array of u64's for such a value.
I didn't try it out, so not sure how it looks like, just saying it's an option worth considering.
-Christoffer
On 28/05/14 15:09, Christoffer Dall wrote:
On Wed, May 28, 2014 at 06:56:44AM -0700, Victor Kamensky wrote:
On 28 May 2014 02:14, Christoffer Dall christoffer.dall@linaro.org wrote:
They cannot be separate it is the same data with two ways to access: as 64bit value or low word 32bit value. Typically it is LPAE cp15 memory related registers like TTBR0, as in mcrr, mcr example in my previous response. mcrr will update high and low word values of TTBR0, mcr will update only low word of it.
You just have to decide whether you want to represent a 64-bit register as a concatenation of two u32's or a 32-bit register access as a part of the 64-bit register value.
I think the architecture, at least for 32-bit register views on ARMv8, defines the 32-bit accesses as a view on the 64-bit register.
Indeed, and that's how KVM implements it.
You should of course only store those registers once, but instead of storing them in an array of u32's you could store the 64-bit wide registers in a separate array of u64's. 32-bit accesses to those registers would look in the array of u64's for such a value.
I didn't try it out, so not sure how it looks like, just saying it's an option worth considering.
I think that's the best idea so far. If the access is 64bit, use the 64bit version of the union (going through the corresponding vcpu_sys_reg() interface).
This should be a very minimal refactor.
M.
On Tue, May 13, 2014 at 09:13:52AM -0700, Victor Kamensky wrote:
Hi,
Here is updated patches to address KVM host support in big endian ARM V7/V8.
Changes since previous version [1] and [2]:
x) joined both v7 [1] and v8 [2] series into one
x) addressed, nearly all, Christoffer's comments to previous patch series
x) Fixed few issues in 'ARM: KVM: handle 64bit values passed to mrcc or from mcrr instructions in BE case' that got discovered after moving to the latest v3.15-rc5 kernel
What were the issues?
x) Fixed issue in supporting 32bit guest by V8 BE KVM host
Are these changes only from v2 to v3, or?
Consider using the common format for changelogs, see here for an example: https://lists.cs.columbia.edu/pipermail/kvmarm/2014-May/009468.html
Thanks, -Christoffer
On 26 May 2014 08:49, Christoffer Dall christoffer.dall@linaro.org wrote:
On Tue, May 13, 2014 at 09:13:52AM -0700, Victor Kamensky wrote:
Hi,
Here is updated patches to address KVM host support in big endian ARM V7/V8.
Changes since previous version [1] and [2]:
x) joined both v7 [1] and v8 [2] series into one
x) addressed, nearly all, Christoffer's comments to previous patch series
x) Fixed few issues in 'ARM: KVM: handle 64bit values passed to mrcc or from mcrr instructions in BE case' that got discovered after moving to the latest v3.15-rc5 kernel
What were the issues?
You correctly spotted it: it is related to TTBR handling and the access_vm_reg function
x) Fixed issue in supporting 32bit guest by V8 BE KVM host
Are these changes only from v2 to v3, or?
Change in v3 patch 14 that deals with access_vm_reg function.
Consider using the common format for changelogs, see here for an example: https://lists.cs.columbia.edu/pipermail/kvmarm/2014-May/009468.html
Thanks. Will try to do that.
Christoffer, Marc, thank you very much for the review! I accepted a lot of the comments and updated the patches. For the remaining issues I've sent my responses. When we get a bit more closure on the remaining issues, I will repost the series.
Thanks, Victor
Thanks, -Christoffer
On Tue, May 27, 2014 at 11:31:08PM -0700, Victor Kamensky wrote:
On 26 May 2014 08:49, Christoffer Dall christoffer.dall@linaro.org wrote:
On Tue, May 13, 2014 at 09:13:52AM -0700, Victor Kamensky wrote:
Hi,
Here is updated patches to address KVM host support in big endian ARM V7/V8.
Changes since previous version [1] and [2]:
x) joined both v7 [1] and v8 [2] series into one
x) addressed, nearly all, Christoffer's comments to previous patch series
x) Fixed few issues in 'ARM: KVM: handle 64bit values passed to mrcc or from mcrr instructions in BE case' that got discovered after moving to the latest v3.15-rc5 kernel
What were the issues?
You correctly spotted it: it is related to TTBR handling and the access_vm_reg function
x) Fixed issue in supporting 32bit guest by V8 BE KVM host
Are these changes only from v2 to v3, or?
Change in v3 patch 14 that deals with access_vm_reg function.
Consider using the common format for changelogs, see here for an example: https://lists.cs.columbia.edu/pipermail/kvmarm/2014-May/009468.html
Thanks. Will try to do that.
Christoffer, Marc, thank you very much for the review! I accepted a lot of the comments and updated the patches. For the remaining issues I've sent my responses. When we get a bit more closure on the remaining issues, I will repost the series.
Thanks!