User space can use the MEM_OP ioctl to make storage key checked reads and writes to the guest, however, it has no way of performing atomic, key checked, accesses to the guest. Extend the MEM_OP ioctl in order to allow for this, by adding a cmpxchg operation. For now, support this operation for absolute accesses only.
This operation can be use, for example, to set the device-state-change indicator and the adapter-local-summary indicator atomically.
Also contains some fixes/changes for the memop selftest independent of the cmpxchg changes.
v6 -> v7 * cast to __user pointers in cmpxchg_guest_abs_with_key * dropped function for allocating buffer for mem_op * use goto if CHECK_ONLY in mem_op refactoring * changed assert message in bad address test * picked up R-b's (thanks Thomas & Janosch) * fix nits, fix typos, improve commit descriptions
v5 -> v6 * move memop selftest fixes/refactoring to front of series so they can be picked independently from the rest * use op instead of flag to indicate cmpxchg * no longer indicate success of cmpxchg to user space, which can infer it by observing a change in the old value instead * refactor functions implementing the ioctl * adjust documentation (drop R-b) * adjust selftest * rebase
v4 -> v5 * refuse cmpxchg if not write (thanks Thomas) * minor doc changes (thanks Claudio) * picked up R-b's (thanks Thomas & Claudio) * memop selftest fixes * rebased
v3 -> v4 * no functional change intended * rework documentation a bit * name extension cap cmpxchg bit * picked up R-b (thanks Thomas) * various changes (rename variable, comments, ...) see range-diff below
v2 -> v3 * rebase onto the wip/cmpxchg_user_key branch in the s390 kernel repo * use __uint128_t instead of unsigned __int128 * put moving of testlist into main into separate patch * pick up R-b's (thanks Nico)
v1 -> v2 * get rid of xrk instruction for cmpxchg byte and short implementation * pass old parameter via pointer instead of in mem_op struct * indicate failure of cmpxchg due to wrong old value by special return code * picked up R-b's (thanks Thomas)
Janis Schoetterl-Glausch (14): KVM: s390: selftest: memop: Pass mop_desc via pointer KVM: s390: selftest: memop: Replace macros by functions KVM: s390: selftest: memop: Move testlist into main KVM: s390: selftest: memop: Add bad address test KVM: s390: selftest: memop: Fix typo KVM: s390: selftest: memop: Fix wrong address being used in test KVM: s390: selftest: memop: Fix integer literal KVM: s390: Move common code of mem_op functions into function KVM: s390: Dispatch to implementing function at top level of vm mem_op KVM: s390: Refactor absolute vm mem_op function KVM: s390: Refactor vcpu mem_op function KVM: s390: Extend MEM_OP ioctl by storage key checked cmpxchg Documentation: KVM: s390: Describe KVM_S390_MEMOP_F_CMPXCHG KVM: s390: selftest: memop: Add cmpxchg tests
Documentation/virt/kvm/api.rst | 29 +- include/uapi/linux/kvm.h | 8 + arch/s390/kvm/gaccess.h | 3 + arch/s390/kvm/gaccess.c | 109 ++++ arch/s390/kvm/kvm-s390.c | 221 ++++--- tools/testing/selftests/kvm/s390x/memop.c | 675 +++++++++++++++++----- 6 files changed, 809 insertions(+), 236 deletions(-)
Range-diff against v6: 1: 512e1a3e0ae5 ! 1: fb51df0930d8 KVM: s390: selftest: memop: Pass mop_desc via pointer @@ Commit message
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Reviewed-by: Thomas Huth thuth@redhat.com + Reviewed-by: Janosch Frank frankja@linux.ibm.com
## tools/testing/selftests/kvm/s390x/memop.c ## @@ tools/testing/selftests/kvm/s390x/memop.c: struct mop_desc { 2: 47328ea64f80 ! 2: 6f09b06574f0 KVM: s390: selftest: memop: Replace macros by functions @@ Commit message KVM: s390: selftest: memop: Replace macros by functions
Replace the DEFAULT_* test helpers by functions, as they don't - need the exta flexibility. + need the extra flexibility.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Reviewed-by: Thomas Huth thuth@redhat.com + Acked-by: Janosch Frank frankja@linux.ibm.com
## tools/testing/selftests/kvm/s390x/memop.c ## @@ tools/testing/selftests/kvm/s390x/memop.c: struct mop_desc { 3: 224fe37eeec7 ! 3: 108c1af396fe KVM: s390: selftest: memop: Move testlist into main @@ Commit message
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Reviewed-by: Thomas Huth thuth@redhat.com + Reviewed-by: Janosch Frank frankja@linux.ibm.com
## tools/testing/selftests/kvm/s390x/memop.c ## @@ tools/testing/selftests/kvm/s390x/memop.c: static void test_errors(void) 4: f622d3413cf0 ! 4: 4bcad9b4bf9d KVM: s390: selftest: memop: Add bad address test @@ Metadata ## Commit message ## KVM: s390: selftest: memop: Add bad address test
- Add test that tries to access, instead of CHECK_ONLY. + Add a test that tries a real write to a bad address. + The existing CHECK_ONLY test doesn't cover all paths.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Reviewed-by: Nico Boehr nrb@linux.ibm.com @@ tools/testing/selftests/kvm/s390x/memop.c: static void _test_errors_common(struc /* Bad guest address: */ rv = ERR_MOP(info, target, WRITE, mem1, size, GADDR((void *)~0xfffUL), CHECK_ONLY); - TEST_ASSERT(rv > 0, "ioctl does not report bad guest memory access"); -+ TEST_ASSERT(rv > 0, "ioctl does not report bad guest memory address"); ++ TEST_ASSERT(rv > 0, "ioctl does not report bad guest memory address with CHECK_ONLY"); + rv = ERR_MOP(info, target, WRITE, mem1, size, GADDR((void *)~0xfffUL)); -+ TEST_ASSERT(rv > 0, "ioctl does not report bad guest memory address"); ++ TEST_ASSERT(rv > 0, "ioctl does not report bad guest memory address on write");
/* Bad host address: */ rv = ERR_MOP(info, target, WRITE, 0, size, GADDR_V(mem1)); 5: 431f191a8a57 = 5: 7af4c710045c KVM: s390: selftest: memop: Fix typo 6: 3122187435fb = 6: 2ab1854f3959 KVM: s390: selftest: memop: Fix wrong address being used in test 7: 401f51f3ef55 ! 7: 980644e1ce74 KVM: s390: selftest: memop: Fix integer literal @@ Commit message
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Fixes: 1bb873495a9e ("KVM: s390: selftests: Add more copy memop tests") + Reviewed-by: Thomas Huth thuth@redhat.com
## tools/testing/selftests/kvm/s390x/memop.c ## @@ tools/testing/selftests/kvm/s390x/memop.c: static void guest_copy_key_fetch_prot_override(void) 8: 185e5cd33df6 ! 8: a9930caadf5e KVM: s390: Move common code of mem_op functions into functions @@ Metadata Author: Janis Schoetterl-Glausch scgl@linux.ibm.com
## Commit message ## - KVM: s390: Move common code of mem_op functions into functions + KVM: s390: Move common code of mem_op functions into function
The vcpu and vm mem_op ioctl implementations share some functionality. - Move argument checking and buffer allocation into functions and call - them from both implementations. - This allows code reuse in case of additional future mem_op operations. + Move argument checking into a function and call it from both + implementations. This allows code reuse in case of additional future + mem_op operations.
Suggested-by: Janosch Frank frankja@linux.ibm.com Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com @@ arch/s390/kvm/kvm-s390.c: static int kvm_s390_handle_pv(struct kvm *kvm, struct + mop->key = 0; + } + return 0; -+} -+ -+static void *mem_op_alloc_buf(struct kvm_s390_mem_op *mop) -+{ -+ void *buf; -+ -+ if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) -+ return NULL; -+ buf = vmalloc(mop->size); -+ if (!buf) -+ return ERR_PTR(-ENOMEM); -+ return buf; }
static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) @@ arch/s390/kvm/kvm-s390.c: static int kvm_s390_vm_mem_op(struct kvm *kvm, struct - } else { - mop->key = 0; - } -- if (!(mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY)) { -- tmpbuf = vmalloc(mop->size); -- if (!tmpbuf) -- return -ENOMEM; -- } -+ tmpbuf = mem_op_alloc_buf(mop); -+ if (IS_ERR(tmpbuf)) -+ return PTR_ERR(tmpbuf); - - srcu_idx = srcu_read_lock(&kvm->srcu); - + if (!(mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY)) { + tmpbuf = vmalloc(mop->size); + if (!tmpbuf) @@ arch/s390/kvm/kvm-s390.c: static long kvm_s390_vcpu_mem_op(struct kvm_vcpu *vcpu, { void __user *uaddr = (void __user *)mop->buf; @@ arch/s390/kvm/kvm-s390.c: static long kvm_s390_vcpu_mem_op(struct kvm_vcpu *vcpu - } else { - mop->key = 0; - } -- if (!(mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY)) { -- tmpbuf = vmalloc(mop->size); -- if (!tmpbuf) -- return -ENOMEM; -- } -+ tmpbuf = mem_op_alloc_buf(mop); -+ if (IS_ERR(tmpbuf)) -+ return PTR_ERR(tmpbuf); - - switch (mop->op) { - case KVM_S390_MEMOP_LOGICAL_READ: + if (!(mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY)) { + tmpbuf = vmalloc(mop->size); + if (!tmpbuf) 9: d7c8b94cb351 ! 9: 2715c1ceabdb KVM: s390: Dispatch to implementing function at top level of vm mem_op @@ Commit message
Suggested-by: Janosch Frank frankja@linux.ibm.com Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com + Reviewed-by: Thomas Huth thuth@redhat.com
## arch/s390/kvm/kvm-s390.c ## -@@ arch/s390/kvm/kvm-s390.c: static void *mem_op_alloc_buf(struct kvm_s390_mem_op *mop) - return buf; +@@ arch/s390/kvm/kvm-s390.c: static int mem_op_validate_common(struct kvm_s390_mem_op *mop, u64 supported_fla + return 0; }
-static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) @@ arch/s390/kvm/kvm-s390.c: static int kvm_s390_vm_mem_op(struct kvm *kvm, struct - */ - if (kvm_s390_pv_get_handle(kvm)) - return -EINVAL; - tmpbuf = mem_op_alloc_buf(mop); - if (IS_ERR(tmpbuf)) - return PTR_ERR(tmpbuf); + if (!(mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY)) { + tmpbuf = vmalloc(mop->size); + if (!tmpbuf) @@ arch/s390/kvm/kvm-s390.c: static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) } break; 10: 8acb425a9e93 ! 10: 91f4e887f311 KVM: s390: Refactor absolute vm mem_op function @@ Commit message
Suggested-by: Janosch Frank frankja@linux.ibm.com Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com + Reviewed-by: Thomas Huth thuth@redhat.com
## arch/s390/kvm/kvm-s390.c ## -@@ arch/s390/kvm/kvm-s390.c: static void *mem_op_alloc_buf(struct kvm_s390_mem_op *mop) +@@ arch/s390/kvm/kvm-s390.c: static int mem_op_validate_common(struct kvm_s390_mem_op *mop, u64 supported_fla static int kvm_s390_vm_mem_op_abs(struct kvm *kvm, struct kvm_s390_mem_op *mop) { void __user *uaddr = (void __user *)mop->buf; @@ arch/s390/kvm/kvm-s390.c: static int kvm_s390_vm_mem_op_abs(struct kvm *kvm, str - } - } - break; -- } ++ acc_mode = mop->op == KVM_S390_MEMOP_ABSOLUTE_READ ? GACC_FETCH : GACC_STORE; ++ if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) { ++ r = check_gpa_range(kvm, mop->gaddr, mop->size, acc_mode, mop->key); ++ goto out_unlock; + } - case KVM_S390_MEMOP_ABSOLUTE_WRITE: { - if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) { - r = check_gpa_range(kvm, mop->gaddr, mop->size, GACC_STORE, mop->key); @@ arch/s390/kvm/kvm-s390.c: static int kvm_s390_vm_mem_op_abs(struct kvm *kvm, str - } - r = access_guest_abs_with_key(kvm, mop->gaddr, tmpbuf, - mop->size, GACC_STORE, mop->key); -+ acc_mode = mop->op == KVM_S390_MEMOP_ABSOLUTE_READ ? GACC_FETCH : GACC_STORE; -+ if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) { -+ r = check_gpa_range(kvm, mop->gaddr, mop->size, acc_mode, mop->key); -+ } else if (acc_mode == GACC_FETCH) { ++ if (acc_mode == GACC_FETCH) { + r = access_guest_abs_with_key(kvm, mop->gaddr, tmpbuf, + mop->size, GACC_FETCH, mop->key); + if (r) 11: f62dfb06ea55 ! 11: d16ac0db04c7 KVM: s390: Refactor absolute vcpu mem_op function @@ Metadata Author: Janis Schoetterl-Glausch scgl@linux.ibm.com
## Commit message ## - KVM: s390: Refactor absolute vcpu mem_op function + KVM: s390: Refactor vcpu mem_op function
Remove code duplication with regards to the CHECK_ONLY flag. Decrease the number of indents. @@ arch/s390/kvm/kvm-s390.c: static long kvm_s390_vcpu_mem_op(struct kvm_vcpu *vcpu int r;
@@ arch/s390/kvm/kvm-s390.c: static long kvm_s390_vcpu_mem_op(struct kvm_vcpu *vcpu, - if (IS_ERR(tmpbuf)) - return PTR_ERR(tmpbuf); + return -ENOMEM; + }
- switch (mop->op) { - case KVM_S390_MEMOP_LOGICAL_READ: @@ arch/s390/kvm/kvm-s390.c: static long kvm_s390_vcpu_mem_op(struct kvm_vcpu *vcpu + if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) { + r = check_gva_range(vcpu, mop->gaddr, mop->ar, mop->size, + acc_mode, mop->key); -+ } else if (acc_mode == GACC_FETCH) { ++ goto out_inject; ++ } ++ if (acc_mode == GACC_FETCH) { r = read_guest_with_key(vcpu, mop->gaddr, mop->ar, tmpbuf, mop->size, mop->key); - if (r == 0) { 12: 2199327cb484 ! 12: afec9544eec4 KVM: s390: Extend MEM_OP ioctl by storage key checked cmpxchg @@ Commit message Extend the MEM_OP ioctl in order to allow for this, by adding a cmpxchg op. For now, support this op for absolute accesses only.
- This op can be use, for example, to set the device-state-change + This op can be used, for example, to set the device-state-change indicator and the adapter-local-summary indicator atomically.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com @@ arch/s390/kvm/gaccess.c: int access_guest_real(struct kvm_vcpu *vcpu, unsigned l + * Honors storage keys. + * + * Return: * 0: successful exchange -+ * * a program interruption code indicating the reason cmpxchg could -+ * not be attempted ++ * * >0: a program interruption code indicating the reason cmpxchg could ++ * not be attempted + * * -EINVAL: address misaligned or len not power of two + * * -EAGAIN: transient failure (len 1 or 2) + * * -EOPNOTSUPP: read-only memslot (should never occur) @@ arch/s390/kvm/gaccess.c: int access_guest_real(struct kvm_vcpu *vcpu, unsigned l + __uint128_t *old_addr, __uint128_t new, + u8 access_key, bool *success) +{ -+ gfn_t gfn = gpa >> PAGE_SHIFT; ++ gfn_t gfn = gpa_to_gfn(gpa); + struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn); + bool writable; + hva_t hva; @@ arch/s390/kvm/gaccess.c: int access_guest_real(struct kvm_vcpu *vcpu, unsigned l + return -EOPNOTSUPP; + + hva += offset_in_page(gpa); ++ /* ++ * The cmpxchg_user_key macro depends on the type of "old", so we need ++ * a case for each valid length and get some code duplication as long ++ * as we don't introduce a new macro. ++ */ + switch (len) { + case 1: { + u8 old; + -+ ret = cmpxchg_user_key((u8 *)hva, &old, *old_addr, new, access_key); ++ ret = cmpxchg_user_key((u8 __user *)hva, &old, *old_addr, new, access_key); + *success = !ret && old == *old_addr; + *old_addr = old; + break; @@ arch/s390/kvm/gaccess.c: int access_guest_real(struct kvm_vcpu *vcpu, unsigned l + case 2: { + u16 old; + -+ ret = cmpxchg_user_key((u16 *)hva, &old, *old_addr, new, access_key); ++ ret = cmpxchg_user_key((u16 __user *)hva, &old, *old_addr, new, access_key); + *success = !ret && old == *old_addr; + *old_addr = old; + break; @@ arch/s390/kvm/gaccess.c: int access_guest_real(struct kvm_vcpu *vcpu, unsigned l + case 4: { + u32 old; + -+ ret = cmpxchg_user_key((u32 *)hva, &old, *old_addr, new, access_key); ++ ret = cmpxchg_user_key((u32 __user *)hva, &old, *old_addr, new, access_key); + *success = !ret && old == *old_addr; + *old_addr = old; + break; @@ arch/s390/kvm/gaccess.c: int access_guest_real(struct kvm_vcpu *vcpu, unsigned l + case 8: { + u64 old; + -+ ret = cmpxchg_user_key((u64 *)hva, &old, *old_addr, new, access_key); ++ ret = cmpxchg_user_key((u64 __user *)hva, &old, *old_addr, new, access_key); + *success = !ret && old == *old_addr; + *old_addr = old; + break; @@ arch/s390/kvm/gaccess.c: int access_guest_real(struct kvm_vcpu *vcpu, unsigned l + case 16: { + __uint128_t old; + -+ ret = cmpxchg_user_key((__uint128_t *)hva, &old, *old_addr, new, access_key); ++ ret = cmpxchg_user_key((__uint128_t __user *)hva, &old, *old_addr, new, access_key); + *success = !ret && old == *old_addr; + *old_addr = old; + break; @@ arch/s390/kvm/gaccess.c: int access_guest_real(struct kvm_vcpu *vcpu, unsigned l + default: + return -EINVAL; + } -+ mark_page_dirty_in_slot(kvm, slot, gfn); ++ if (*success) ++ mark_page_dirty_in_slot(kvm, slot, gfn); + /* + * Assume that the fault is caused by protection, either key protection + * or user page write protection. 13: 05d7682a90ba = 13: 2c006a4015c0 Documentation: KVM: s390: Describe KVM_S390_MEMOP_F_CMPXCHG 14: 993293dcdf13 = 14: ac16991fbe27 KVM: s390: selftest: memop: Add cmpxchg tests
base-commit: 4ec5183ec48656cec489c49f989c508b68b518e3 prerequisite-patch-id: 02346a556c2af3340683322ef71816f813599c21 prerequisite-patch-id: 770ae375220957f1db8eb2657bf2fb7868a501af prerequisite-patch-id: fba46548bd1b60742eed8669a15cadad3c5ec19c prerequisite-patch-id: 8df365bc587d8768618759d77e3a6806bc36130d prerequisite-patch-id: 0fc74ea6b8342b1dd69b034be308647d47d31746 prerequisite-patch-id: c5cdc3ce7cdffc18c5e56abfb657c84141fb623a prerequisite-patch-id: 837715a7a75d9175027343d727c98a398b927c9b prerequisite-patch-id: 6ccea6ff976ae44347f266d7a9e19e09a2ae9063
The struct is quite large, so this seems nicer.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Reviewed-by: Thomas Huth thuth@redhat.com Reviewed-by: Janosch Frank frankja@linux.ibm.com --- tools/testing/selftests/kvm/s390x/memop.c | 44 +++++++++++------------ 1 file changed, 22 insertions(+), 22 deletions(-)
diff --git a/tools/testing/selftests/kvm/s390x/memop.c b/tools/testing/selftests/kvm/s390x/memop.c index 3fd81e58f40c..9c05d1205114 100644 --- a/tools/testing/selftests/kvm/s390x/memop.c +++ b/tools/testing/selftests/kvm/s390x/memop.c @@ -48,53 +48,53 @@ struct mop_desc { uint8_t key; };
-static struct kvm_s390_mem_op ksmo_from_desc(struct mop_desc desc) +static struct kvm_s390_mem_op ksmo_from_desc(const struct mop_desc *desc) { struct kvm_s390_mem_op ksmo = { - .gaddr = (uintptr_t)desc.gaddr, - .size = desc.size, - .buf = ((uintptr_t)desc.buf), + .gaddr = (uintptr_t)desc->gaddr, + .size = desc->size, + .buf = ((uintptr_t)desc->buf), .reserved = "ignored_ignored_ignored_ignored" };
- switch (desc.target) { + switch (desc->target) { case LOGICAL: - if (desc.mode == READ) + if (desc->mode == READ) ksmo.op = KVM_S390_MEMOP_LOGICAL_READ; - if (desc.mode == WRITE) + if (desc->mode == WRITE) ksmo.op = KVM_S390_MEMOP_LOGICAL_WRITE; break; case SIDA: - if (desc.mode == READ) + if (desc->mode == READ) ksmo.op = KVM_S390_MEMOP_SIDA_READ; - if (desc.mode == WRITE) + if (desc->mode == WRITE) ksmo.op = KVM_S390_MEMOP_SIDA_WRITE; break; case ABSOLUTE: - if (desc.mode == READ) + if (desc->mode == READ) ksmo.op = KVM_S390_MEMOP_ABSOLUTE_READ; - if (desc.mode == WRITE) + if (desc->mode == WRITE) ksmo.op = KVM_S390_MEMOP_ABSOLUTE_WRITE; break; case INVALID: ksmo.op = -1; } - if (desc.f_check) + if (desc->f_check) ksmo.flags |= KVM_S390_MEMOP_F_CHECK_ONLY; - if (desc.f_inject) + if (desc->f_inject) ksmo.flags |= KVM_S390_MEMOP_F_INJECT_EXCEPTION; - if (desc._set_flags) - ksmo.flags = desc.set_flags; - if (desc.f_key) { + if (desc->_set_flags) + ksmo.flags = desc->set_flags; + if (desc->f_key) { ksmo.flags |= KVM_S390_MEMOP_F_SKEY_PROTECTION; - ksmo.key = desc.key; + ksmo.key = desc->key; } - if (desc._ar) - ksmo.ar = desc.ar; + if (desc->_ar) + ksmo.ar = desc->ar; else ksmo.ar = 0; - if (desc._sida_offset) - ksmo.sida_offset = desc.sida_offset; + if (desc->_sida_offset) + ksmo.sida_offset = desc->sida_offset;
return ksmo; } @@ -183,7 +183,7 @@ static int err_memop_ioctl(struct test_info info, struct kvm_s390_mem_op *ksmo) else \ __desc.gaddr = __desc.gaddr_v; \ } \ - __ksmo = ksmo_from_desc(__desc); \ + __ksmo = ksmo_from_desc(&__desc); \ print_memop(__info.vcpu, &__ksmo); \ err##memop_ioctl(__info, &__ksmo); \ })
Replace the DEFAULT_* test helpers by functions, as they don't need the extra flexibility.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Reviewed-by: Thomas Huth thuth@redhat.com Acked-by: Janosch Frank frankja@linux.ibm.com --- tools/testing/selftests/kvm/s390x/memop.c | 82 +++++++++++------------ 1 file changed, 39 insertions(+), 43 deletions(-)
diff --git a/tools/testing/selftests/kvm/s390x/memop.c b/tools/testing/selftests/kvm/s390x/memop.c index 9c05d1205114..df1c726294b2 100644 --- a/tools/testing/selftests/kvm/s390x/memop.c +++ b/tools/testing/selftests/kvm/s390x/memop.c @@ -48,6 +48,8 @@ struct mop_desc { uint8_t key; };
+const uint8_t NO_KEY = 0xff; + static struct kvm_s390_mem_op ksmo_from_desc(const struct mop_desc *desc) { struct kvm_s390_mem_op ksmo = { @@ -85,7 +87,7 @@ static struct kvm_s390_mem_op ksmo_from_desc(const struct mop_desc *desc) ksmo.flags |= KVM_S390_MEMOP_F_INJECT_EXCEPTION; if (desc->_set_flags) ksmo.flags = desc->set_flags; - if (desc->f_key) { + if (desc->f_key && desc->key != NO_KEY) { ksmo.flags |= KVM_S390_MEMOP_F_SKEY_PROTECTION; ksmo.key = desc->key; } @@ -268,34 +270,28 @@ static void prepare_mem12(void) #define ASSERT_MEM_EQ(p1, p2, size) \ TEST_ASSERT(!memcmp(p1, p2, size), "Memory contents do not match!")
-#define DEFAULT_WRITE_READ(copy_cpu, mop_cpu, mop_target_p, size, ...) \ -({ \ - struct test_info __copy_cpu = (copy_cpu), __mop_cpu = (mop_cpu); \ - enum mop_target __target = (mop_target_p); \ - uint32_t __size = (size); \ - \ - prepare_mem12(); \ - CHECK_N_DO(MOP, __mop_cpu, __target, WRITE, mem1, __size, \ - GADDR_V(mem1), ##__VA_ARGS__); \ - HOST_SYNC(__copy_cpu, STAGE_COPIED); \ - CHECK_N_DO(MOP, __mop_cpu, __target, READ, mem2, __size, \ - GADDR_V(mem2), ##__VA_ARGS__); \ - ASSERT_MEM_EQ(mem1, mem2, __size); \ -}) +static void default_write_read(struct test_info copy_cpu, struct test_info mop_cpu, + enum mop_target mop_target, uint32_t size, uint8_t key) +{ + prepare_mem12(); + CHECK_N_DO(MOP, mop_cpu, mop_target, WRITE, mem1, size, + GADDR_V(mem1), KEY(key)); + HOST_SYNC(copy_cpu, STAGE_COPIED); + CHECK_N_DO(MOP, mop_cpu, mop_target, READ, mem2, size, + GADDR_V(mem2), KEY(key)); + ASSERT_MEM_EQ(mem1, mem2, size); +}
-#define DEFAULT_READ(copy_cpu, mop_cpu, mop_target_p, size, ...) \ -({ \ - struct test_info __copy_cpu = (copy_cpu), __mop_cpu = (mop_cpu); \ - enum mop_target __target = (mop_target_p); \ - uint32_t __size = (size); \ - \ - prepare_mem12(); \ - CHECK_N_DO(MOP, __mop_cpu, __target, WRITE, mem1, __size, \ - GADDR_V(mem1)); \ - HOST_SYNC(__copy_cpu, STAGE_COPIED); \ - CHECK_N_DO(MOP, __mop_cpu, __target, READ, mem2, __size, ##__VA_ARGS__);\ - ASSERT_MEM_EQ(mem1, mem2, __size); \ -}) +static void default_read(struct test_info copy_cpu, struct test_info mop_cpu, + enum mop_target mop_target, uint32_t size, uint8_t key) +{ + prepare_mem12(); + CHECK_N_DO(MOP, mop_cpu, mop_target, WRITE, mem1, size, GADDR_V(mem1)); + HOST_SYNC(copy_cpu, STAGE_COPIED); + CHECK_N_DO(MOP, mop_cpu, mop_target, READ, mem2, size, + GADDR_V(mem2), KEY(key)); + ASSERT_MEM_EQ(mem1, mem2, size); +}
static void guest_copy(void) { @@ -310,7 +306,7 @@ static void test_copy(void)
HOST_SYNC(t.vcpu, STAGE_INITED);
- DEFAULT_WRITE_READ(t.vcpu, t.vcpu, LOGICAL, t.size); + default_write_read(t.vcpu, t.vcpu, LOGICAL, t.size, NO_KEY);
kvm_vm_free(t.kvm_vm); } @@ -357,26 +353,26 @@ static void test_copy_key(void) HOST_SYNC(t.vcpu, STAGE_SKEYS_SET);
/* vm, no key */ - DEFAULT_WRITE_READ(t.vcpu, t.vm, ABSOLUTE, t.size); + default_write_read(t.vcpu, t.vm, ABSOLUTE, t.size, NO_KEY);
/* vm/vcpu, machting key or key 0 */ - DEFAULT_WRITE_READ(t.vcpu, t.vcpu, LOGICAL, t.size, KEY(0)); - DEFAULT_WRITE_READ(t.vcpu, t.vcpu, LOGICAL, t.size, KEY(9)); - DEFAULT_WRITE_READ(t.vcpu, t.vm, ABSOLUTE, t.size, KEY(0)); - DEFAULT_WRITE_READ(t.vcpu, t.vm, ABSOLUTE, t.size, KEY(9)); + default_write_read(t.vcpu, t.vcpu, LOGICAL, t.size, 0); + default_write_read(t.vcpu, t.vcpu, LOGICAL, t.size, 9); + default_write_read(t.vcpu, t.vm, ABSOLUTE, t.size, 0); + default_write_read(t.vcpu, t.vm, ABSOLUTE, t.size, 9); /* * There used to be different code paths for key handling depending on * if the region crossed a page boundary. * There currently are not, but the more tests the merrier. */ - DEFAULT_WRITE_READ(t.vcpu, t.vcpu, LOGICAL, 1, KEY(0)); - DEFAULT_WRITE_READ(t.vcpu, t.vcpu, LOGICAL, 1, KEY(9)); - DEFAULT_WRITE_READ(t.vcpu, t.vm, ABSOLUTE, 1, KEY(0)); - DEFAULT_WRITE_READ(t.vcpu, t.vm, ABSOLUTE, 1, KEY(9)); + default_write_read(t.vcpu, t.vcpu, LOGICAL, 1, 0); + default_write_read(t.vcpu, t.vcpu, LOGICAL, 1, 9); + default_write_read(t.vcpu, t.vm, ABSOLUTE, 1, 0); + default_write_read(t.vcpu, t.vm, ABSOLUTE, 1, 9);
/* vm/vcpu, mismatching keys on read, but no fetch protection */ - DEFAULT_READ(t.vcpu, t.vcpu, LOGICAL, t.size, GADDR_V(mem2), KEY(2)); - DEFAULT_READ(t.vcpu, t.vm, ABSOLUTE, t.size, GADDR_V(mem1), KEY(2)); + default_read(t.vcpu, t.vcpu, LOGICAL, t.size, 2); + default_read(t.vcpu, t.vm, ABSOLUTE, t.size, 2);
kvm_vm_free(t.kvm_vm); } @@ -409,7 +405,7 @@ static void test_copy_key_storage_prot_override(void) HOST_SYNC(t.vcpu, STAGE_SKEYS_SET);
/* vcpu, mismatching keys, storage protection override in effect */ - DEFAULT_WRITE_READ(t.vcpu, t.vcpu, LOGICAL, t.size, KEY(2)); + default_write_read(t.vcpu, t.vcpu, LOGICAL, t.size, 2);
kvm_vm_free(t.kvm_vm); } @@ -422,8 +418,8 @@ static void test_copy_key_fetch_prot(void) HOST_SYNC(t.vcpu, STAGE_SKEYS_SET);
/* vm/vcpu, matching key, fetch protection in effect */ - DEFAULT_READ(t.vcpu, t.vcpu, LOGICAL, t.size, GADDR_V(mem2), KEY(9)); - DEFAULT_READ(t.vcpu, t.vm, ABSOLUTE, t.size, GADDR_V(mem2), KEY(9)); + default_read(t.vcpu, t.vcpu, LOGICAL, t.size, 9); + default_read(t.vcpu, t.vm, ABSOLUTE, t.size, 9);
kvm_vm_free(t.kvm_vm); }
This allows checking if the necessary requirements for a test case are met via an arbitrary expression. In particular, it is easy to check if certain bits are set in the memop extension capability.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Reviewed-by: Thomas Huth thuth@redhat.com Reviewed-by: Janosch Frank frankja@linux.ibm.com --- tools/testing/selftests/kvm/s390x/memop.c | 131 +++++++++++----------- 1 file changed, 66 insertions(+), 65 deletions(-)
diff --git a/tools/testing/selftests/kvm/s390x/memop.c b/tools/testing/selftests/kvm/s390x/memop.c index df1c726294b2..bbc191a13760 100644 --- a/tools/testing/selftests/kvm/s390x/memop.c +++ b/tools/testing/selftests/kvm/s390x/memop.c @@ -690,85 +690,86 @@ static void test_errors(void) kvm_vm_free(t.kvm_vm); }
-struct testdef { - const char *name; - void (*test)(void); - int extension; -} testlist[] = { - { - .name = "simple copy", - .test = test_copy, - }, - { - .name = "generic error checks", - .test = test_errors, - }, - { - .name = "copy with storage keys", - .test = test_copy_key, - .extension = 1, - }, - { - .name = "copy with key storage protection override", - .test = test_copy_key_storage_prot_override, - .extension = 1, - }, - { - .name = "copy with key fetch protection", - .test = test_copy_key_fetch_prot, - .extension = 1, - }, - { - .name = "copy with key fetch protection override", - .test = test_copy_key_fetch_prot_override, - .extension = 1, - }, - { - .name = "error checks with key", - .test = test_errors_key, - .extension = 1, - }, - { - .name = "termination", - .test = test_termination, - .extension = 1, - }, - { - .name = "error checks with key storage protection override", - .test = test_errors_key_storage_prot_override, - .extension = 1, - }, - { - .name = "error checks without key fetch prot override", - .test = test_errors_key_fetch_prot_override_not_enabled, - .extension = 1, - }, - { - .name = "error checks with key fetch prot override", - .test = test_errors_key_fetch_prot_override_enabled, - .extension = 1, - }, -};
int main(int argc, char *argv[]) { int extension_cap, idx;
TEST_REQUIRE(kvm_has_cap(KVM_CAP_S390_MEM_OP)); + extension_cap = kvm_check_cap(KVM_CAP_S390_MEM_OP_EXTENSION);
- ksft_print_header(); + struct testdef { + const char *name; + void (*test)(void); + bool requirements_met; + } testlist[] = { + { + .name = "simple copy", + .test = test_copy, + .requirements_met = true, + }, + { + .name = "generic error checks", + .test = test_errors, + .requirements_met = true, + }, + { + .name = "copy with storage keys", + .test = test_copy_key, + .requirements_met = extension_cap > 0, + }, + { + .name = "copy with key storage protection override", + .test = test_copy_key_storage_prot_override, + .requirements_met = extension_cap > 0, + }, + { + .name = "copy with key fetch protection", + .test = test_copy_key_fetch_prot, + .requirements_met = extension_cap > 0, + }, + { + .name = "copy with key fetch protection override", + .test = test_copy_key_fetch_prot_override, + .requirements_met = extension_cap > 0, + }, + { + .name = "error checks with key", + .test = test_errors_key, + .requirements_met = extension_cap > 0, + }, + { + .name = "termination", + .test = test_termination, + .requirements_met = extension_cap > 0, + }, + { + .name = "error checks with key storage protection override", + .test = test_errors_key_storage_prot_override, + .requirements_met = extension_cap > 0, + }, + { + .name = "error checks without key fetch prot override", + .test = test_errors_key_fetch_prot_override_not_enabled, + .requirements_met = extension_cap > 0, + }, + { + .name = "error checks with key fetch prot override", + .test = test_errors_key_fetch_prot_override_enabled, + .requirements_met = extension_cap > 0, + }, + };
+ ksft_print_header(); ksft_set_plan(ARRAY_SIZE(testlist));
- extension_cap = kvm_check_cap(KVM_CAP_S390_MEM_OP_EXTENSION); for (idx = 0; idx < ARRAY_SIZE(testlist); idx++) { - if (extension_cap >= testlist[idx].extension) { + if (testlist[idx].requirements_met) { testlist[idx].test(); ksft_test_result_pass("%s\n", testlist[idx].name); } else { - ksft_test_result_skip("%s - extension level %d not supported\n", - testlist[idx].name, - testlist[idx].extension); + ksft_test_result_skip("%s - requirements not met (kernel has extension cap %#x)\n", + testlist[idx].name, extension_cap); } }
Add a test that tries a real write to a bad address. The existing CHECK_ONLY test doesn't cover all paths.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Reviewed-by: Nico Boehr nrb@linux.ibm.com --- tools/testing/selftests/kvm/s390x/memop.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/s390x/memop.c b/tools/testing/selftests/kvm/s390x/memop.c index bbc191a13760..00737cceacda 100644 --- a/tools/testing/selftests/kvm/s390x/memop.c +++ b/tools/testing/selftests/kvm/s390x/memop.c @@ -641,7 +641,9 @@ static void _test_errors_common(struct test_info info, enum mop_target target, i
/* Bad guest address: */ rv = ERR_MOP(info, target, WRITE, mem1, size, GADDR((void *)~0xfffUL), CHECK_ONLY); - TEST_ASSERT(rv > 0, "ioctl does not report bad guest memory access"); + TEST_ASSERT(rv > 0, "ioctl does not report bad guest memory address with CHECK_ONLY"); + rv = ERR_MOP(info, target, WRITE, mem1, size, GADDR((void *)~0xfffUL)); + TEST_ASSERT(rv > 0, "ioctl does not report bad guest memory address on write");
/* Bad host address: */ rv = ERR_MOP(info, target, WRITE, 0, size, GADDR_V(mem1));
On 2/6/23 17:45, Janis Schoetterl-Glausch wrote:
Add a test that tries a real write to a bad address. The existing CHECK_ONLY test doesn't cover all paths.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Reviewed-by: Nico Boehr nrb@linux.ibm.com
Reviewed-by: Janosch Frank frankja@linux.ibm.com
tools/testing/selftests/kvm/s390x/memop.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/s390x/memop.c b/tools/testing/selftests/kvm/s390x/memop.c index bbc191a13760..00737cceacda 100644 --- a/tools/testing/selftests/kvm/s390x/memop.c +++ b/tools/testing/selftests/kvm/s390x/memop.c @@ -641,7 +641,9 @@ static void _test_errors_common(struct test_info info, enum mop_target target, i /* Bad guest address: */ rv = ERR_MOP(info, target, WRITE, mem1, size, GADDR((void *)~0xfffUL), CHECK_ONLY);
- TEST_ASSERT(rv > 0, "ioctl does not report bad guest memory access");
- TEST_ASSERT(rv > 0, "ioctl does not report bad guest memory address with CHECK_ONLY");
- rv = ERR_MOP(info, target, WRITE, mem1, size, GADDR((void *)~0xfffUL));
- TEST_ASSERT(rv > 0, "ioctl does not report bad guest memory address on write");
/* Bad host address: */ rv = ERR_MOP(info, target, WRITE, 0, size, GADDR_V(mem1));
"acceeded" isn't a word, should be "exceeded".
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Reviewed-by: Thomas Huth thuth@redhat.com Reviewed-by: Nico Boehr nrb@linux.ibm.com --- tools/testing/selftests/kvm/s390x/memop.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/s390x/memop.c b/tools/testing/selftests/kvm/s390x/memop.c index 00737cceacda..033a8603a096 100644 --- a/tools/testing/selftests/kvm/s390x/memop.c +++ b/tools/testing/selftests/kvm/s390x/memop.c @@ -602,7 +602,7 @@ static void test_errors_key_fetch_prot_override_enabled(void)
/* * vcpu, mismatching keys on fetch, - * fetch protection override does not apply because memory range acceeded + * fetch protection override does not apply because memory range exceeded */ CHECK_N_DO(ERR_PROT_MOP, t.vcpu, LOGICAL, READ, mem2, 2048 + 1, GADDR_V(0), KEY(2)); CHECK_N_DO(ERR_PROT_MOP, t.vcpu, LOGICAL, READ, mem2, PAGE_SIZE + 2048 + 1,
The guest code sets the key for mem1 only. In order to provoke a protection exception the test codes needs to address mem1.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Reviewed-by: Nico Boehr nrb@linux.ibm.com --- tools/testing/selftests/kvm/s390x/memop.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/kvm/s390x/memop.c b/tools/testing/selftests/kvm/s390x/memop.c index 033a8603a096..1ae5c01f9904 100644 --- a/tools/testing/selftests/kvm/s390x/memop.c +++ b/tools/testing/selftests/kvm/s390x/memop.c @@ -450,9 +450,9 @@ static void test_errors_key(void)
/* vm/vcpu, mismatching keys, fetch protection in effect */ CHECK_N_DO(ERR_PROT_MOP, t.vcpu, LOGICAL, WRITE, mem1, t.size, GADDR_V(mem1), KEY(2)); - CHECK_N_DO(ERR_PROT_MOP, t.vcpu, LOGICAL, READ, mem2, t.size, GADDR_V(mem2), KEY(2)); + CHECK_N_DO(ERR_PROT_MOP, t.vcpu, LOGICAL, READ, mem2, t.size, GADDR_V(mem1), KEY(2)); CHECK_N_DO(ERR_PROT_MOP, t.vm, ABSOLUTE, WRITE, mem1, t.size, GADDR_V(mem1), KEY(2)); - CHECK_N_DO(ERR_PROT_MOP, t.vm, ABSOLUTE, READ, mem2, t.size, GADDR_V(mem2), KEY(2)); + CHECK_N_DO(ERR_PROT_MOP, t.vm, ABSOLUTE, READ, mem2, t.size, GADDR_V(mem1), KEY(2));
kvm_vm_free(t.kvm_vm); }
The address is a 64 bit value, specifying a 32 bit value can crash the guest. In this case things worked out with -O2 but not -O0.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Fixes: 1bb873495a9e ("KVM: s390: selftests: Add more copy memop tests") Reviewed-by: Thomas Huth thuth@redhat.com --- tools/testing/selftests/kvm/s390x/memop.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/s390x/memop.c b/tools/testing/selftests/kvm/s390x/memop.c index 1ae5c01f9904..c5fec84ef3c2 100644 --- a/tools/testing/selftests/kvm/s390x/memop.c +++ b/tools/testing/selftests/kvm/s390x/memop.c @@ -514,7 +514,7 @@ static void guest_copy_key_fetch_prot_override(void) GUEST_SYNC(STAGE_INITED); set_storage_key_range(0, PAGE_SIZE, 0x18); set_storage_key_range((void *)last_page_addr, PAGE_SIZE, 0x0); - asm volatile ("sske %[key],%[addr]\n" :: [addr] "r"(0), [key] "r"(0x18) : "cc"); + asm volatile ("sske %[key],%[addr]\n" :: [addr] "r"(0L), [key] "r"(0x18) : "cc"); GUEST_SYNC(STAGE_SKEYS_SET);
for (;;) {
The vcpu and vm mem_op ioctl implementations share some functionality. Move argument checking into a function and call it from both implementations. This allows code reuse in case of additional future mem_op operations.
Suggested-by: Janosch Frank frankja@linux.ibm.com Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com --- arch/s390/kvm/kvm-s390.c | 52 +++++++++++++++++++--------------------- 1 file changed, 24 insertions(+), 28 deletions(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index e4890e04b210..0367c1a7e69a 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -2764,24 +2764,32 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd) return r; }
-static bool access_key_invalid(u8 access_key) +static int mem_op_validate_common(struct kvm_s390_mem_op *mop, u64 supported_flags) { - return access_key > 0xf; + if (mop->flags & ~supported_flags || !mop->size) + return -EINVAL; + if (mop->size > MEM_OP_MAX_SIZE) + return -E2BIG; + if (mop->flags & KVM_S390_MEMOP_F_SKEY_PROTECTION) { + if (mop->key > 0xf) + return -EINVAL; + } else { + mop->key = 0; + } + return 0; }
static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) { void __user *uaddr = (void __user *)mop->buf; - u64 supported_flags; void *tmpbuf = NULL; int r, srcu_idx;
- supported_flags = KVM_S390_MEMOP_F_SKEY_PROTECTION - | KVM_S390_MEMOP_F_CHECK_ONLY; - if (mop->flags & ~supported_flags || !mop->size) - return -EINVAL; - if (mop->size > MEM_OP_MAX_SIZE) - return -E2BIG; + r = mem_op_validate_common(mop, KVM_S390_MEMOP_F_SKEY_PROTECTION | + KVM_S390_MEMOP_F_CHECK_ONLY); + if (r) + return r; + /* * This is technically a heuristic only, if the kvm->lock is not * taken, it is not guaranteed that the vm is/remains non-protected. @@ -2793,12 +2801,6 @@ static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) */ if (kvm_s390_pv_get_handle(kvm)) return -EINVAL; - if (mop->flags & KVM_S390_MEMOP_F_SKEY_PROTECTION) { - if (access_key_invalid(mop->key)) - return -EINVAL; - } else { - mop->key = 0; - } if (!(mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY)) { tmpbuf = vmalloc(mop->size); if (!tmpbuf) @@ -5250,23 +5252,17 @@ static long kvm_s390_vcpu_mem_op(struct kvm_vcpu *vcpu, { void __user *uaddr = (void __user *)mop->buf; void *tmpbuf = NULL; - int r = 0; - const u64 supported_flags = KVM_S390_MEMOP_F_INJECT_EXCEPTION - | KVM_S390_MEMOP_F_CHECK_ONLY - | KVM_S390_MEMOP_F_SKEY_PROTECTION; + int r;
- if (mop->flags & ~supported_flags || mop->ar >= NUM_ACRS || !mop->size) + r = mem_op_validate_common(mop, KVM_S390_MEMOP_F_INJECT_EXCEPTION | + KVM_S390_MEMOP_F_CHECK_ONLY | + KVM_S390_MEMOP_F_SKEY_PROTECTION); + if (r) + return r; + if (mop->ar >= NUM_ACRS) return -EINVAL; - if (mop->size > MEM_OP_MAX_SIZE) - return -E2BIG; if (kvm_s390_pv_cpu_is_protected(vcpu)) return -EINVAL; - if (mop->flags & KVM_S390_MEMOP_F_SKEY_PROTECTION) { - if (access_key_invalid(mop->key)) - return -EINVAL; - } else { - mop->key = 0; - } if (!(mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY)) { tmpbuf = vmalloc(mop->size); if (!tmpbuf)
On 2/6/23 17:45, Janis Schoetterl-Glausch wrote:
The vcpu and vm mem_op ioctl implementations share some functionality. Move argument checking into a function and call it from both implementations. This allows code reuse in case of additional future mem_op operations.
Suggested-by: Janosch Frank frankja@linux.ibm.com Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com
Reviewed-by: Janosch Frank frankja@linux.ibm.com
arch/s390/kvm/kvm-s390.c | 52 +++++++++++++++++++--------------------- 1 file changed, 24 insertions(+), 28 deletions(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index e4890e04b210..0367c1a7e69a 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -2764,24 +2764,32 @@ static int kvm_s390_handle_pv(struct kvm *kvm, struct kvm_pv_cmd *cmd) return r; } -static bool access_key_invalid(u8 access_key) +static int mem_op_validate_common(struct kvm_s390_mem_op *mop, u64 supported_flags) {
- return access_key > 0xf;
- if (mop->flags & ~supported_flags || !mop->size)
return -EINVAL;
- if (mop->size > MEM_OP_MAX_SIZE)
return -E2BIG;
- if (mop->flags & KVM_S390_MEMOP_F_SKEY_PROTECTION) {
if (mop->key > 0xf)
return -EINVAL;
- } else {
mop->key = 0;
- }
- return 0; }
static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) { void __user *uaddr = (void __user *)mop->buf;
- u64 supported_flags; void *tmpbuf = NULL; int r, srcu_idx;
- supported_flags = KVM_S390_MEMOP_F_SKEY_PROTECTION
| KVM_S390_MEMOP_F_CHECK_ONLY;
- if (mop->flags & ~supported_flags || !mop->size)
return -EINVAL;
- if (mop->size > MEM_OP_MAX_SIZE)
return -E2BIG;
- r = mem_op_validate_common(mop, KVM_S390_MEMOP_F_SKEY_PROTECTION |
KVM_S390_MEMOP_F_CHECK_ONLY);
- if (r)
return r;
- /*
- This is technically a heuristic only, if the kvm->lock is not
- taken, it is not guaranteed that the vm is/remains non-protected.
@@ -2793,12 +2801,6 @@ static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) */ if (kvm_s390_pv_get_handle(kvm)) return -EINVAL;
- if (mop->flags & KVM_S390_MEMOP_F_SKEY_PROTECTION) {
if (access_key_invalid(mop->key))
return -EINVAL;
- } else {
mop->key = 0;
- } if (!(mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY)) { tmpbuf = vmalloc(mop->size); if (!tmpbuf)
@@ -5250,23 +5252,17 @@ static long kvm_s390_vcpu_mem_op(struct kvm_vcpu *vcpu, { void __user *uaddr = (void __user *)mop->buf; void *tmpbuf = NULL;
- int r = 0;
- const u64 supported_flags = KVM_S390_MEMOP_F_INJECT_EXCEPTION
| KVM_S390_MEMOP_F_CHECK_ONLY
| KVM_S390_MEMOP_F_SKEY_PROTECTION;
- int r;
- if (mop->flags & ~supported_flags || mop->ar >= NUM_ACRS || !mop->size)
- r = mem_op_validate_common(mop, KVM_S390_MEMOP_F_INJECT_EXCEPTION |
KVM_S390_MEMOP_F_CHECK_ONLY |
KVM_S390_MEMOP_F_SKEY_PROTECTION);
- if (r)
return r;
- if (mop->ar >= NUM_ACRS) return -EINVAL;
- if (mop->size > MEM_OP_MAX_SIZE)
if (kvm_s390_pv_cpu_is_protected(vcpu)) return -EINVAL;return -E2BIG;
- if (mop->flags & KVM_S390_MEMOP_F_SKEY_PROTECTION) {
if (access_key_invalid(mop->key))
return -EINVAL;
- } else {
mop->key = 0;
- } if (!(mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY)) { tmpbuf = vmalloc(mop->size); if (!tmpbuf)
Instead of having one function covering all mem_op operations, have a function implementing absolute access and dispatch to that function in its caller, based on the operation code. This way additional future operations can be implemented by adding an implementing function without changing existing operations.
Suggested-by: Janosch Frank frankja@linux.ibm.com Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Reviewed-by: Thomas Huth thuth@redhat.com --- arch/s390/kvm/kvm-s390.c | 38 ++++++++++++++++++++++++-------------- 1 file changed, 24 insertions(+), 14 deletions(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 0367c1a7e69a..707967a296f1 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -2779,7 +2779,7 @@ static int mem_op_validate_common(struct kvm_s390_mem_op *mop, u64 supported_fla return 0; }
-static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) +static int kvm_s390_vm_mem_op_abs(struct kvm *kvm, struct kvm_s390_mem_op *mop) { void __user *uaddr = (void __user *)mop->buf; void *tmpbuf = NULL; @@ -2790,17 +2790,6 @@ static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) if (r) return r;
- /* - * This is technically a heuristic only, if the kvm->lock is not - * taken, it is not guaranteed that the vm is/remains non-protected. - * This is ok from a kernel perspective, wrongdoing is detected - * on the access, -EFAULT is returned and the vm may crash the - * next time it accesses the memory in question. - * There is no sane usecase to do switching and a memop on two - * different CPUs at the same time. - */ - if (kvm_s390_pv_get_handle(kvm)) - return -EINVAL; if (!(mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY)) { tmpbuf = vmalloc(mop->size); if (!tmpbuf) @@ -2841,8 +2830,6 @@ static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) } break; } - default: - r = -EINVAL; }
out_unlock: @@ -2852,6 +2839,29 @@ static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) return r; }
+static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) +{ + /* + * This is technically a heuristic only, if the kvm->lock is not + * taken, it is not guaranteed that the vm is/remains non-protected. + * This is ok from a kernel perspective, wrongdoing is detected + * on the access, -EFAULT is returned and the vm may crash the + * next time it accesses the memory in question. + * There is no sane usecase to do switching and a memop on two + * different CPUs at the same time. + */ + if (kvm_s390_pv_get_handle(kvm)) + return -EINVAL; + + switch (mop->op) { + case KVM_S390_MEMOP_ABSOLUTE_READ: + case KVM_S390_MEMOP_ABSOLUTE_WRITE: + return kvm_s390_vm_mem_op_abs(kvm, mop); + default: + return -EINVAL; + } +} + long kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) {
On 2/6/23 17:45, Janis Schoetterl-Glausch wrote:
Instead of having one function covering all mem_op operations, have a function implementing absolute access and dispatch to that function in its caller, based on the operation code. This way additional future operations can be implemented by adding an implementing function without changing existing operations.
Suggested-by: Janosch Frank frankja@linux.ibm.com Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Reviewed-by: Thomas Huth thuth@redhat.com
Reviewed-by: Janosch Frank frankja@linux.ibm.com
arch/s390/kvm/kvm-s390.c | 38 ++++++++++++++++++++++++-------------- 1 file changed, 24 insertions(+), 14 deletions(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 0367c1a7e69a..707967a296f1 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -2779,7 +2779,7 @@ static int mem_op_validate_common(struct kvm_s390_mem_op *mop, u64 supported_fla return 0; } -static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) +static int kvm_s390_vm_mem_op_abs(struct kvm *kvm, struct kvm_s390_mem_op *mop) { void __user *uaddr = (void __user *)mop->buf; void *tmpbuf = NULL; @@ -2790,17 +2790,6 @@ static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) if (r) return r;
- /*
* This is technically a heuristic only, if the kvm->lock is not
* taken, it is not guaranteed that the vm is/remains non-protected.
* This is ok from a kernel perspective, wrongdoing is detected
* on the access, -EFAULT is returned and the vm may crash the
* next time it accesses the memory in question.
* There is no sane usecase to do switching and a memop on two
* different CPUs at the same time.
*/
- if (kvm_s390_pv_get_handle(kvm))
if (!(mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY)) { tmpbuf = vmalloc(mop->size); if (!tmpbuf)return -EINVAL;
@@ -2841,8 +2830,6 @@ static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) } break; }
- default:
}r = -EINVAL;
out_unlock: @@ -2852,6 +2839,29 @@ static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) return r; } +static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) +{
- /*
* This is technically a heuristic only, if the kvm->lock is not
* taken, it is not guaranteed that the vm is/remains non-protected.
* This is ok from a kernel perspective, wrongdoing is detected
* on the access, -EFAULT is returned and the vm may crash the
* next time it accesses the memory in question.
* There is no sane usecase to do switching and a memop on two
* different CPUs at the same time.
*/
- if (kvm_s390_pv_get_handle(kvm))
return -EINVAL;
- switch (mop->op) {
- case KVM_S390_MEMOP_ABSOLUTE_READ:
- case KVM_S390_MEMOP_ABSOLUTE_WRITE:
return kvm_s390_vm_mem_op_abs(kvm, mop);
- default:
return -EINVAL;
- }
+}
- long kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) {
Remove code duplication with regards to the CHECK_ONLY flag. Decrease the number of indents. No functional change indented.
Suggested-by: Janosch Frank frankja@linux.ibm.com Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Reviewed-by: Thomas Huth thuth@redhat.com ---
Cosmetic only, can be dropped
arch/s390/kvm/kvm-s390.c | 43 +++++++++++++++++----------------------- 1 file changed, 18 insertions(+), 25 deletions(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 707967a296f1..1f94b18f1cb5 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -2782,6 +2782,7 @@ static int mem_op_validate_common(struct kvm_s390_mem_op *mop, u64 supported_fla static int kvm_s390_vm_mem_op_abs(struct kvm *kvm, struct kvm_s390_mem_op *mop) { void __user *uaddr = (void __user *)mop->buf; + enum gacc_mode acc_mode; void *tmpbuf = NULL; int r, srcu_idx;
@@ -2803,33 +2804,25 @@ static int kvm_s390_vm_mem_op_abs(struct kvm *kvm, struct kvm_s390_mem_op *mop) goto out_unlock; }
- switch (mop->op) { - case KVM_S390_MEMOP_ABSOLUTE_READ: { - if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) { - r = check_gpa_range(kvm, mop->gaddr, mop->size, GACC_FETCH, mop->key); - } else { - r = access_guest_abs_with_key(kvm, mop->gaddr, tmpbuf, - mop->size, GACC_FETCH, mop->key); - if (r == 0) { - if (copy_to_user(uaddr, tmpbuf, mop->size)) - r = -EFAULT; - } - } - break; + acc_mode = mop->op == KVM_S390_MEMOP_ABSOLUTE_READ ? GACC_FETCH : GACC_STORE; + if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) { + r = check_gpa_range(kvm, mop->gaddr, mop->size, acc_mode, mop->key); + goto out_unlock; } - case KVM_S390_MEMOP_ABSOLUTE_WRITE: { - if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) { - r = check_gpa_range(kvm, mop->gaddr, mop->size, GACC_STORE, mop->key); - } else { - if (copy_from_user(tmpbuf, uaddr, mop->size)) { - r = -EFAULT; - break; - } - r = access_guest_abs_with_key(kvm, mop->gaddr, tmpbuf, - mop->size, GACC_STORE, mop->key); + if (acc_mode == GACC_FETCH) { + r = access_guest_abs_with_key(kvm, mop->gaddr, tmpbuf, + mop->size, GACC_FETCH, mop->key); + if (r) + goto out_unlock; + if (copy_to_user(uaddr, tmpbuf, mop->size)) + r = -EFAULT; + } else { + if (copy_from_user(tmpbuf, uaddr, mop->size)) { + r = -EFAULT; + goto out_unlock; } - break; - } + r = access_guest_abs_with_key(kvm, mop->gaddr, tmpbuf, + mop->size, GACC_STORE, mop->key); }
out_unlock:
On 2/6/23 17:45, Janis Schoetterl-Glausch wrote:
Remove code duplication with regards to the CHECK_ONLY flag. Decrease the number of indents. No functional change indented.
Suggested-by: Janosch Frank frankja@linux.ibm.com Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Reviewed-by: Thomas Huth thuth@redhat.com
Reviewed-by: Janosch Frank frankja@linux.ibm.com
Cosmetic only, can be dropped
arch/s390/kvm/kvm-s390.c | 43 +++++++++++++++++----------------------- 1 file changed, 18 insertions(+), 25 deletions(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 707967a296f1..1f94b18f1cb5 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -2782,6 +2782,7 @@ static int mem_op_validate_common(struct kvm_s390_mem_op *mop, u64 supported_fla static int kvm_s390_vm_mem_op_abs(struct kvm *kvm, struct kvm_s390_mem_op *mop) { void __user *uaddr = (void __user *)mop->buf;
- enum gacc_mode acc_mode; void *tmpbuf = NULL; int r, srcu_idx;
@@ -2803,33 +2804,25 @@ static int kvm_s390_vm_mem_op_abs(struct kvm *kvm, struct kvm_s390_mem_op *mop) goto out_unlock; }
- switch (mop->op) {
- case KVM_S390_MEMOP_ABSOLUTE_READ: {
if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
r = check_gpa_range(kvm, mop->gaddr, mop->size, GACC_FETCH, mop->key);
} else {
r = access_guest_abs_with_key(kvm, mop->gaddr, tmpbuf,
mop->size, GACC_FETCH, mop->key);
if (r == 0) {
if (copy_to_user(uaddr, tmpbuf, mop->size))
r = -EFAULT;
}
}
break;
- acc_mode = mop->op == KVM_S390_MEMOP_ABSOLUTE_READ ? GACC_FETCH : GACC_STORE;
- if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
r = check_gpa_range(kvm, mop->gaddr, mop->size, acc_mode, mop->key);
}goto out_unlock;
- case KVM_S390_MEMOP_ABSOLUTE_WRITE: {
if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
r = check_gpa_range(kvm, mop->gaddr, mop->size, GACC_STORE, mop->key);
} else {
if (copy_from_user(tmpbuf, uaddr, mop->size)) {
r = -EFAULT;
break;
}
r = access_guest_abs_with_key(kvm, mop->gaddr, tmpbuf,
mop->size, GACC_STORE, mop->key);
- if (acc_mode == GACC_FETCH) {
r = access_guest_abs_with_key(kvm, mop->gaddr, tmpbuf,
mop->size, GACC_FETCH, mop->key);
if (r)
goto out_unlock;
if (copy_to_user(uaddr, tmpbuf, mop->size))
r = -EFAULT;
- } else {
if (copy_from_user(tmpbuf, uaddr, mop->size)) {
r = -EFAULT;
}goto out_unlock;
break;
- }
r = access_guest_abs_with_key(kvm, mop->gaddr, tmpbuf,
}mop->size, GACC_STORE, mop->key);
out_unlock:
On 2/6/23 17:45, Janis Schoetterl-Glausch wrote:
Remove code duplication with regards to the CHECK_ONLY flag. Decrease the number of indents. No functional change indented.
Suggested-by: Janosch Frank frankja@linux.ibm.com Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Reviewed-by: Thomas Huth thuth@redhat.com
Reviewed-by: Janosch Frank frankja@linux.ibm.com
Cosmetic only, can be dropped
arch/s390/kvm/kvm-s390.c | 43 +++++++++++++++++----------------------- 1 file changed, 18 insertions(+), 25 deletions(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 707967a296f1..1f94b18f1cb5 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -2782,6 +2782,7 @@ static int mem_op_validate_common(struct kvm_s390_mem_op *mop, u64 supported_fla static int kvm_s390_vm_mem_op_abs(struct kvm *kvm, struct kvm_s390_mem_op *mop) { void __user *uaddr = (void __user *)mop->buf;
- enum gacc_mode acc_mode; void *tmpbuf = NULL; int r, srcu_idx;
@@ -2803,33 +2804,25 @@ static int kvm_s390_vm_mem_op_abs(struct kvm *kvm, struct kvm_s390_mem_op *mop) goto out_unlock; }
- switch (mop->op) {
- case KVM_S390_MEMOP_ABSOLUTE_READ: {
if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
r = check_gpa_range(kvm, mop->gaddr, mop->size, GACC_FETCH, mop->key);
} else {
r = access_guest_abs_with_key(kvm, mop->gaddr, tmpbuf,
mop->size, GACC_FETCH, mop->key);
if (r == 0) {
if (copy_to_user(uaddr, tmpbuf, mop->size))
r = -EFAULT;
}
}
break;
- acc_mode = mop->op == KVM_S390_MEMOP_ABSOLUTE_READ ? GACC_FETCH : GACC_STORE;
- if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
r = check_gpa_range(kvm, mop->gaddr, mop->size, acc_mode, mop->key);
}goto out_unlock;
- case KVM_S390_MEMOP_ABSOLUTE_WRITE: {
if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) {
r = check_gpa_range(kvm, mop->gaddr, mop->size, GACC_STORE, mop->key);
} else {
if (copy_from_user(tmpbuf, uaddr, mop->size)) {
r = -EFAULT;
break;
}
r = access_guest_abs_with_key(kvm, mop->gaddr, tmpbuf,
mop->size, GACC_STORE, mop->key);
- if (acc_mode == GACC_FETCH) {
r = access_guest_abs_with_key(kvm, mop->gaddr, tmpbuf,
mop->size, GACC_FETCH, mop->key);
if (r)
goto out_unlock;
if (copy_to_user(uaddr, tmpbuf, mop->size))
r = -EFAULT;
- } else {
if (copy_from_user(tmpbuf, uaddr, mop->size)) {
r = -EFAULT;
}goto out_unlock;
break;
- }
r = access_guest_abs_with_key(kvm, mop->gaddr, tmpbuf,
}mop->size, GACC_STORE, mop->key);
out_unlock:
Remove code duplication with regards to the CHECK_ONLY flag. Decrease the number of indents. No functional change indented.
Suggested-by: Janosch Frank frankja@linux.ibm.com Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com ---
Cosmetic only, can be dropped
arch/s390/kvm/kvm-s390.c | 36 +++++++++++++++++------------------- 1 file changed, 17 insertions(+), 19 deletions(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 1f94b18f1cb5..8a74b7b62ecf 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -5254,6 +5254,7 @@ static long kvm_s390_vcpu_mem_op(struct kvm_vcpu *vcpu, struct kvm_s390_mem_op *mop) { void __user *uaddr = (void __user *)mop->buf; + enum gacc_mode acc_mode; void *tmpbuf = NULL; int r;
@@ -5272,38 +5273,35 @@ static long kvm_s390_vcpu_mem_op(struct kvm_vcpu *vcpu, return -ENOMEM; }
- switch (mop->op) { - case KVM_S390_MEMOP_LOGICAL_READ: - if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) { - r = check_gva_range(vcpu, mop->gaddr, mop->ar, mop->size, - GACC_FETCH, mop->key); - break; - } + acc_mode = mop->op == KVM_S390_MEMOP_LOGICAL_READ ? GACC_FETCH : GACC_STORE; + if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) { + r = check_gva_range(vcpu, mop->gaddr, mop->ar, mop->size, + acc_mode, mop->key); + goto out_inject; + } + if (acc_mode == GACC_FETCH) { r = read_guest_with_key(vcpu, mop->gaddr, mop->ar, tmpbuf, mop->size, mop->key); - if (r == 0) { - if (copy_to_user(uaddr, tmpbuf, mop->size)) - r = -EFAULT; - } - break; - case KVM_S390_MEMOP_LOGICAL_WRITE: - if (mop->flags & KVM_S390_MEMOP_F_CHECK_ONLY) { - r = check_gva_range(vcpu, mop->gaddr, mop->ar, mop->size, - GACC_STORE, mop->key); - break; + if (r) + goto out_inject; + if (copy_to_user(uaddr, tmpbuf, mop->size)) { + r = -EFAULT; + goto out_free; } + } else { if (copy_from_user(tmpbuf, uaddr, mop->size)) { r = -EFAULT; - break; + goto out_free; } r = write_guest_with_key(vcpu, mop->gaddr, mop->ar, tmpbuf, mop->size, mop->key); - break; }
+out_inject: if (r > 0 && (mop->flags & KVM_S390_MEMOP_F_INJECT_EXCEPTION) != 0) kvm_s390_inject_prog_irq(vcpu, &vcpu->arch.pgm);
+out_free: vfree(tmpbuf); return r; }
User space can use the MEM_OP ioctl to make storage key checked reads and writes to the guest, however, it has no way of performing atomic, key checked, accesses to the guest. Extend the MEM_OP ioctl in order to allow for this, by adding a cmpxchg op. For now, support this op for absolute accesses only.
This op can be used, for example, to set the device-state-change indicator and the adapter-local-summary indicator atomically.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com --- include/uapi/linux/kvm.h | 8 +++ arch/s390/kvm/gaccess.h | 3 ++ arch/s390/kvm/gaccess.c | 109 +++++++++++++++++++++++++++++++++++++++ arch/s390/kvm/kvm-s390.c | 56 +++++++++++++++++++- 4 files changed, 175 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 55155e262646..d2f30463c133 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -583,6 +583,8 @@ struct kvm_s390_mem_op { struct { __u8 ar; /* the access register number */ __u8 key; /* access key, ignored if flag unset */ + __u8 pad1[6]; /* ignored */ + __u64 old_addr; /* ignored if cmpxchg flag unset */ }; __u32 sida_offset; /* offset into the sida */ __u8 reserved[32]; /* ignored */ @@ -595,11 +597,17 @@ struct kvm_s390_mem_op { #define KVM_S390_MEMOP_SIDA_WRITE 3 #define KVM_S390_MEMOP_ABSOLUTE_READ 4 #define KVM_S390_MEMOP_ABSOLUTE_WRITE 5 +#define KVM_S390_MEMOP_ABSOLUTE_CMPXCHG 6 + /* flags for kvm_s390_mem_op->flags */ #define KVM_S390_MEMOP_F_CHECK_ONLY (1ULL << 0) #define KVM_S390_MEMOP_F_INJECT_EXCEPTION (1ULL << 1) #define KVM_S390_MEMOP_F_SKEY_PROTECTION (1ULL << 2)
+/* flags specifying extension support via KVM_CAP_S390_MEM_OP_EXTENSION */ +#define KVM_S390_MEMOP_EXTENSION_CAP_BASE (1 << 0) +#define KVM_S390_MEMOP_EXTENSION_CAP_CMPXCHG (1 << 1) + /* for KVM_INTERRUPT */ struct kvm_interrupt { /* in */ diff --git a/arch/s390/kvm/gaccess.h b/arch/s390/kvm/gaccess.h index 9408d6cc8e2c..b320d12aa049 100644 --- a/arch/s390/kvm/gaccess.h +++ b/arch/s390/kvm/gaccess.h @@ -206,6 +206,9 @@ int access_guest_with_key(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar, int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra, void *data, unsigned long len, enum gacc_mode mode);
+int cmpxchg_guest_abs_with_key(struct kvm *kvm, gpa_t gpa, int len, __uint128_t *old, + __uint128_t new, u8 access_key, bool *success); + /** * write_guest_with_key - copy data from kernel space to guest space * @vcpu: virtual cpu diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c index 0243b6e38d36..3eb85f254881 100644 --- a/arch/s390/kvm/gaccess.c +++ b/arch/s390/kvm/gaccess.c @@ -1161,6 +1161,115 @@ int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra, return rc; }
+/** + * cmpxchg_guest_abs_with_key() - Perform cmpxchg on guest absolute address. + * @kvm: Virtual machine instance. + * @gpa: Absolute guest address of the location to be changed. + * @len: Operand length of the cmpxchg, required: 1 <= len <= 16. Providing a + * non power of two will result in failure. + * @old_addr: Pointer to old value. If the location at @gpa contains this value, + * the exchange will succeed. After calling cmpxchg_guest_abs_with_key() + * *@old_addr contains the value at @gpa before the attempt to + * exchange the value. + * @new: The value to place at @gpa. + * @access_key: The access key to use for the guest access. + * @success: output value indicating if an exchange occurred. + * + * Atomically exchange the value at @gpa by @new, if it contains *@old. + * Honors storage keys. + * + * Return: * 0: successful exchange + * * >0: a program interruption code indicating the reason cmpxchg could + * not be attempted + * * -EINVAL: address misaligned or len not power of two + * * -EAGAIN: transient failure (len 1 or 2) + * * -EOPNOTSUPP: read-only memslot (should never occur) + */ +int cmpxchg_guest_abs_with_key(struct kvm *kvm, gpa_t gpa, int len, + __uint128_t *old_addr, __uint128_t new, + u8 access_key, bool *success) +{ + gfn_t gfn = gpa_to_gfn(gpa); + struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn); + bool writable; + hva_t hva; + int ret; + + if (!IS_ALIGNED(gpa, len)) + return -EINVAL; + + hva = gfn_to_hva_memslot_prot(slot, gfn, &writable); + if (kvm_is_error_hva(hva)) + return PGM_ADDRESSING; + /* + * Check if it's a read-only memslot, even though that cannot occur + * since those are unsupported. + * Don't try to actually handle that case. + */ + if (!writable) + return -EOPNOTSUPP; + + hva += offset_in_page(gpa); + /* + * The cmpxchg_user_key macro depends on the type of "old", so we need + * a case for each valid length and get some code duplication as long + * as we don't introduce a new macro. + */ + switch (len) { + case 1: { + u8 old; + + ret = cmpxchg_user_key((u8 __user *)hva, &old, *old_addr, new, access_key); + *success = !ret && old == *old_addr; + *old_addr = old; + break; + } + case 2: { + u16 old; + + ret = cmpxchg_user_key((u16 __user *)hva, &old, *old_addr, new, access_key); + *success = !ret && old == *old_addr; + *old_addr = old; + break; + } + case 4: { + u32 old; + + ret = cmpxchg_user_key((u32 __user *)hva, &old, *old_addr, new, access_key); + *success = !ret && old == *old_addr; + *old_addr = old; + break; + } + case 8: { + u64 old; + + ret = cmpxchg_user_key((u64 __user *)hva, &old, *old_addr, new, access_key); + *success = !ret && old == *old_addr; + *old_addr = old; + break; + } + case 16: { + __uint128_t old; + + ret = cmpxchg_user_key((__uint128_t __user *)hva, &old, *old_addr, new, access_key); + *success = !ret && old == *old_addr; + *old_addr = old; + break; + } + default: + return -EINVAL; + } + if (*success) + mark_page_dirty_in_slot(kvm, slot, gfn); + /* + * Assume that the fault is caused by protection, either key protection + * or user page write protection. + */ + if (ret == -EFAULT) + ret = PGM_PROTECTION; + return ret; +} + /** * guest_translate_address_with_key - translate guest logical into guest absolute address * @vcpu: virtual cpu diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 8a74b7b62ecf..5b77269e6536 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -584,7 +584,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_S390_VCPU_RESETS: case KVM_CAP_SET_GUEST_DEBUG: case KVM_CAP_S390_DIAG318: - case KVM_CAP_S390_MEM_OP_EXTENSION: r = 1; break; case KVM_CAP_SET_GUEST_DEBUG2: @@ -598,6 +597,15 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_S390_MEM_OP: r = MEM_OP_MAX_SIZE; break; + case KVM_CAP_S390_MEM_OP_EXTENSION: + /* + * Flag bits indicating which extensions are supported. + * If r > 0, the base extension must also be supported/indicated, + * in order to maintain backwards compatibility. + */ + r = KVM_S390_MEMOP_EXTENSION_CAP_BASE | + KVM_S390_MEMOP_EXTENSION_CAP_CMPXCHG; + break; case KVM_CAP_NR_VCPUS: case KVM_CAP_MAX_VCPUS: case KVM_CAP_MAX_VCPU_ID: @@ -2832,6 +2840,50 @@ static int kvm_s390_vm_mem_op_abs(struct kvm *kvm, struct kvm_s390_mem_op *mop) return r; }
+static int kvm_s390_vm_mem_op_cmpxchg(struct kvm *kvm, struct kvm_s390_mem_op *mop) +{ + void __user *uaddr = (void __user *)mop->buf; + void __user *old_addr = (void __user *)mop->old_addr; + union { + __uint128_t quad; + char raw[sizeof(__uint128_t)]; + } old = { .quad = 0}, new = { .quad = 0 }; + unsigned int off_in_quad = sizeof(new) - mop->size; + int r, srcu_idx; + bool success; + + r = mem_op_validate_common(mop, KVM_S390_MEMOP_F_SKEY_PROTECTION); + if (r) + return r; + /* + * This validates off_in_quad. Checking that size is a power + * of two is not necessary, as cmpxchg_guest_abs_with_key + * takes care of that + */ + if (mop->size > sizeof(new)) + return -EINVAL; + if (copy_from_user(&new.raw[off_in_quad], uaddr, mop->size)) + return -EFAULT; + if (copy_from_user(&old.raw[off_in_quad], old_addr, mop->size)) + return -EFAULT; + + srcu_idx = srcu_read_lock(&kvm->srcu); + + if (kvm_is_error_gpa(kvm, mop->gaddr)) { + r = PGM_ADDRESSING; + goto out_unlock; + } + + r = cmpxchg_guest_abs_with_key(kvm, mop->gaddr, mop->size, &old.quad, + new.quad, mop->key, &success); + if (!success && copy_to_user(old_addr, &old.raw[off_in_quad], mop->size)) + r = -EFAULT; + +out_unlock: + srcu_read_unlock(&kvm->srcu, srcu_idx); + return r; +} + static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) { /* @@ -2850,6 +2902,8 @@ static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) case KVM_S390_MEMOP_ABSOLUTE_READ: case KVM_S390_MEMOP_ABSOLUTE_WRITE: return kvm_s390_vm_mem_op_abs(kvm, mop); + case KVM_S390_MEMOP_ABSOLUTE_CMPXCHG: + return kvm_s390_vm_mem_op_cmpxchg(kvm, mop); default: return -EINVAL; }
On 2/6/23 17:46, Janis Schoetterl-Glausch wrote:
User space can use the MEM_OP ioctl to make storage key checked reads and writes to the guest, however, it has no way of performing atomic, key checked, accesses to the guest. Extend the MEM_OP ioctl in order to allow for this, by adding a cmpxchg op. For now, support this op for absolute accesses only.
This op can be used, for example, to set the device-state-change indicator and the adapter-local-summary indicator atomically.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com
Reviewed-by: Janosch Frank frankja@linux.ibm.com
include/uapi/linux/kvm.h | 8 +++ arch/s390/kvm/gaccess.h | 3 ++ arch/s390/kvm/gaccess.c | 109 +++++++++++++++++++++++++++++++++++++++ arch/s390/kvm/kvm-s390.c | 56 +++++++++++++++++++- 4 files changed, 175 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 55155e262646..d2f30463c133 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -583,6 +583,8 @@ struct kvm_s390_mem_op { struct { __u8 ar; /* the access register number */ __u8 key; /* access key, ignored if flag unset */
__u8 pad1[6]; /* ignored */
}; __u32 sida_offset; /* offset into the sida */ __u8 reserved[32]; /* ignored */__u64 old_addr; /* ignored if cmpxchg flag unset */
@@ -595,11 +597,17 @@ struct kvm_s390_mem_op { #define KVM_S390_MEMOP_SIDA_WRITE 3 #define KVM_S390_MEMOP_ABSOLUTE_READ 4 #define KVM_S390_MEMOP_ABSOLUTE_WRITE 5 +#define KVM_S390_MEMOP_ABSOLUTE_CMPXCHG 6
- /* flags for kvm_s390_mem_op->flags */ #define KVM_S390_MEMOP_F_CHECK_ONLY (1ULL << 0) #define KVM_S390_MEMOP_F_INJECT_EXCEPTION (1ULL << 1) #define KVM_S390_MEMOP_F_SKEY_PROTECTION (1ULL << 2)
+/* flags specifying extension support via KVM_CAP_S390_MEM_OP_EXTENSION */ +#define KVM_S390_MEMOP_EXTENSION_CAP_BASE (1 << 0) +#define KVM_S390_MEMOP_EXTENSION_CAP_CMPXCHG (1 << 1)
- /* for KVM_INTERRUPT */ struct kvm_interrupt { /* in */
diff --git a/arch/s390/kvm/gaccess.h b/arch/s390/kvm/gaccess.h index 9408d6cc8e2c..b320d12aa049 100644 --- a/arch/s390/kvm/gaccess.h +++ b/arch/s390/kvm/gaccess.h @@ -206,6 +206,9 @@ int access_guest_with_key(struct kvm_vcpu *vcpu, unsigned long ga, u8 ar, int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra, void *data, unsigned long len, enum gacc_mode mode); +int cmpxchg_guest_abs_with_key(struct kvm *kvm, gpa_t gpa, int len, __uint128_t *old,
__uint128_t new, u8 access_key, bool *success);
- /**
- write_guest_with_key - copy data from kernel space to guest space
- @vcpu: virtual cpu
diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c index 0243b6e38d36..3eb85f254881 100644 --- a/arch/s390/kvm/gaccess.c +++ b/arch/s390/kvm/gaccess.c @@ -1161,6 +1161,115 @@ int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra, return rc; } +/**
- cmpxchg_guest_abs_with_key() - Perform cmpxchg on guest absolute address.
- @kvm: Virtual machine instance.
- @gpa: Absolute guest address of the location to be changed.
- @len: Operand length of the cmpxchg, required: 1 <= len <= 16. Providing a
non power of two will result in failure.
- @old_addr: Pointer to old value. If the location at @gpa contains this value,
the exchange will succeed. After calling cmpxchg_guest_abs_with_key()
*@old_addr contains the value at @gpa before the attempt to
exchange the value.
- @new: The value to place at @gpa.
- @access_key: The access key to use for the guest access.
- @success: output value indicating if an exchange occurred.
- Atomically exchange the value at @gpa by @new, if it contains *@old.
- Honors storage keys.
- Return: * 0: successful exchange
* >0: a program interruption code indicating the reason cmpxchg could
not be attempted
* -EINVAL: address misaligned or len not power of two
* -EAGAIN: transient failure (len 1 or 2)
* -EOPNOTSUPP: read-only memslot (should never occur)
- */
+int cmpxchg_guest_abs_with_key(struct kvm *kvm, gpa_t gpa, int len,
__uint128_t *old_addr, __uint128_t new,
u8 access_key, bool *success)
+{
- gfn_t gfn = gpa_to_gfn(gpa);
- struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
- bool writable;
- hva_t hva;
- int ret;
- if (!IS_ALIGNED(gpa, len))
return -EINVAL;
- hva = gfn_to_hva_memslot_prot(slot, gfn, &writable);
- if (kvm_is_error_hva(hva))
return PGM_ADDRESSING;
- /*
* Check if it's a read-only memslot, even though that cannot occur
* since those are unsupported.
* Don't try to actually handle that case.
*/
- if (!writable)
return -EOPNOTSUPP;
- hva += offset_in_page(gpa);
- /*
* The cmpxchg_user_key macro depends on the type of "old", so we need
* a case for each valid length and get some code duplication as long
* as we don't introduce a new macro.
*/
- switch (len) {
- case 1: {
u8 old;
ret = cmpxchg_user_key((u8 __user *)hva, &old, *old_addr, new, access_key);
*success = !ret && old == *old_addr;
*old_addr = old;
break;
- }
- case 2: {
u16 old;
ret = cmpxchg_user_key((u16 __user *)hva, &old, *old_addr, new, access_key);
*success = !ret && old == *old_addr;
*old_addr = old;
break;
- }
- case 4: {
u32 old;
ret = cmpxchg_user_key((u32 __user *)hva, &old, *old_addr, new, access_key);
*success = !ret && old == *old_addr;
*old_addr = old;
break;
- }
- case 8: {
u64 old;
ret = cmpxchg_user_key((u64 __user *)hva, &old, *old_addr, new, access_key);
*success = !ret && old == *old_addr;
*old_addr = old;
break;
- }
- case 16: {
__uint128_t old;
ret = cmpxchg_user_key((__uint128_t __user *)hva, &old, *old_addr, new, access_key);
*success = !ret && old == *old_addr;
*old_addr = old;
break;
- }
- default:
return -EINVAL;
- }
- if (*success)
mark_page_dirty_in_slot(kvm, slot, gfn);
- /*
* Assume that the fault is caused by protection, either key protection
* or user page write protection.
*/
- if (ret == -EFAULT)
ret = PGM_PROTECTION;
- return ret;
+}
- /**
- guest_translate_address_with_key - translate guest logical into guest absolute address
- @vcpu: virtual cpu
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 8a74b7b62ecf..5b77269e6536 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -584,7 +584,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_S390_VCPU_RESETS: case KVM_CAP_SET_GUEST_DEBUG: case KVM_CAP_S390_DIAG318:
- case KVM_CAP_S390_MEM_OP_EXTENSION: r = 1; break; case KVM_CAP_SET_GUEST_DEBUG2:
@@ -598,6 +597,15 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) case KVM_CAP_S390_MEM_OP: r = MEM_OP_MAX_SIZE; break;
- case KVM_CAP_S390_MEM_OP_EXTENSION:
/*
* Flag bits indicating which extensions are supported.
* If r > 0, the base extension must also be supported/indicated,
* in order to maintain backwards compatibility.
*/
r = KVM_S390_MEMOP_EXTENSION_CAP_BASE |
KVM_S390_MEMOP_EXTENSION_CAP_CMPXCHG;
case KVM_CAP_NR_VCPUS: case KVM_CAP_MAX_VCPUS: case KVM_CAP_MAX_VCPU_ID:break;
@@ -2832,6 +2840,50 @@ static int kvm_s390_vm_mem_op_abs(struct kvm *kvm, struct kvm_s390_mem_op *mop) return r; } +static int kvm_s390_vm_mem_op_cmpxchg(struct kvm *kvm, struct kvm_s390_mem_op *mop) +{
- void __user *uaddr = (void __user *)mop->buf;
- void __user *old_addr = (void __user *)mop->old_addr;
- union {
__uint128_t quad;
char raw[sizeof(__uint128_t)];
- } old = { .quad = 0}, new = { .quad = 0 };
- unsigned int off_in_quad = sizeof(new) - mop->size;
- int r, srcu_idx;
- bool success;
- r = mem_op_validate_common(mop, KVM_S390_MEMOP_F_SKEY_PROTECTION);
- if (r)
return r;
- /*
* This validates off_in_quad. Checking that size is a power
* of two is not necessary, as cmpxchg_guest_abs_with_key
* takes care of that
*/
- if (mop->size > sizeof(new))
return -EINVAL;
- if (copy_from_user(&new.raw[off_in_quad], uaddr, mop->size))
return -EFAULT;
- if (copy_from_user(&old.raw[off_in_quad], old_addr, mop->size))
return -EFAULT;
- srcu_idx = srcu_read_lock(&kvm->srcu);
- if (kvm_is_error_gpa(kvm, mop->gaddr)) {
r = PGM_ADDRESSING;
goto out_unlock;
- }
- r = cmpxchg_guest_abs_with_key(kvm, mop->gaddr, mop->size, &old.quad,
new.quad, mop->key, &success);
- if (!success && copy_to_user(old_addr, &old.raw[off_in_quad], mop->size))
r = -EFAULT;
+out_unlock:
- srcu_read_unlock(&kvm->srcu, srcu_idx);
- return r;
+}
- static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) { /*
@@ -2850,6 +2902,8 @@ static int kvm_s390_vm_mem_op(struct kvm *kvm, struct kvm_s390_mem_op *mop) case KVM_S390_MEMOP_ABSOLUTE_READ: case KVM_S390_MEMOP_ABSOLUTE_WRITE: return kvm_s390_vm_mem_op_abs(kvm, mop);
- case KVM_S390_MEMOP_ABSOLUTE_CMPXCHG:
default: return -EINVAL; }return kvm_s390_vm_mem_op_cmpxchg(kvm, mop);
Describe the semantics of the new KVM_S390_MEMOP_F_CMPXCHG flag for absolute vm write memops which allows user space to perform (storage key checked) cmpxchg operations on guest memory.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com --- Documentation/virt/kvm/api.rst | 29 ++++++++++++++++++++++++++--- 1 file changed, 26 insertions(+), 3 deletions(-)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 0a67cb738013..d09d7223c2a6 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -3736,7 +3736,8 @@ The fields in each entry are defined as follows: :Parameters: struct kvm_s390_mem_op (in) :Returns: = 0 on success, < 0 on generic error (e.g. -EFAULT or -ENOMEM), - > 0 if an exception occurred while walking the page tables + 16 bit program exception code if the access causes such an exception, + other code > 0xffff with special meaning.
Read or write data from/to the VM's memory. The KVM_CAP_S390_MEM_OP_EXTENSION capability specifies what functionality is @@ -3754,6 +3755,8 @@ Parameters are specified via the following structure:: struct { __u8 ar; /* the access register number */ __u8 key; /* access key, ignored if flag unset */ + __u8 pad1[6]; /* ignored */ + __u64 old_addr; /* ignored if flag unset */ }; __u32 sida_offset; /* offset into the sida */ __u8 reserved[32]; /* ignored */ @@ -3781,6 +3784,7 @@ Possible operations are: * ``KVM_S390_MEMOP_ABSOLUTE_WRITE`` * ``KVM_S390_MEMOP_SIDA_READ`` * ``KVM_S390_MEMOP_SIDA_WRITE`` + * ``KVM_S390_MEMOP_ABSOLUTE_CMPXCHG``
Logical read/write: ^^^^^^^^^^^^^^^^^^^ @@ -3829,7 +3833,7 @@ the checks required for storage key protection as one operation (as opposed to user space getting the storage keys, performing the checks, and accessing memory thereafter, which could lead to a delay between check and access). Absolute accesses are permitted for the VM ioctl if KVM_CAP_S390_MEM_OP_EXTENSION -is > 0. +has the KVM_S390_MEMOP_EXTENSION_CAP_BASE bit set. Currently absolute accesses are not permitted for VCPU ioctls. Absolute accesses are permitted for non-protected guests only.
@@ -3837,7 +3841,26 @@ Supported flags: * ``KVM_S390_MEMOP_F_CHECK_ONLY`` * ``KVM_S390_MEMOP_F_SKEY_PROTECTION``
-The semantics of the flags are as for logical accesses. +The semantics of the flags common with logical accesses are as for logical +accesses. + +Absolute cmpxchg: +^^^^^^^^^^^^^^^^^ + +Perform cmpxchg on absolute guest memory. Intended for use with the +KVM_S390_MEMOP_F_SKEY_PROTECTION flag. +Instead of doing an unconditional write, the access occurs only if the target +location contains the value pointed to by "old_addr". +This is performed as an atomic cmpxchg with the length specified by the "size" +parameter. "size" must be a power of two up to and including 16. +If the exchange did not take place because the target value doesn't match the +old value, the value "old_addr" points to is replaced by the target value. +User space can tell if an exchange took place by checking if this replacement +occurred. The cmpxchg op is permitted for the VM ioctl if +KVM_CAP_S390_MEM_OP_EXTENSION has flag KVM_S390_MEMOP_EXTENSION_CAP_CMPXCHG set. + +Supported flags: + * ``KVM_S390_MEMOP_F_SKEY_PROTECTION``
SIDA read/write: ^^^^^^^^^^^^^^^^
On 2/6/23 17:46, Janis Schoetterl-Glausch wrote:
Describe the semantics of the new KVM_S390_MEMOP_F_CMPXCHG flag for absolute vm write memops which allows user space to perform (storage key checked) cmpxchg operations on guest memory.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com
Documentation/virt/kvm/api.rst | 29 ++++++++++++++++++++++++++--- 1 file changed, 26 insertions(+), 3 deletions(-)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 0a67cb738013..d09d7223c2a6 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -3736,7 +3736,8 @@ The fields in each entry are defined as follows: :Parameters: struct kvm_s390_mem_op (in) :Returns: = 0 on success, < 0 on generic error (e.g. -EFAULT or -ENOMEM),
> 0 if an exception occurred while walking the page tables
16 bit program exception code if the access causes such an exception,
other code > 0xffff with special meaning.
The 0xffff part is rebase damage from an old patch, I've discussed it off list and will remove that line when picking since I see no other reason for a re-spin.
Reviewed-by: Janosch Frank frankja@linux.ibm.com
Test successful exchange, unsuccessful exchange, storage key protection and invalid arguments.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com --- tools/testing/selftests/kvm/s390x/memop.c | 410 +++++++++++++++++++++- 1 file changed, 395 insertions(+), 15 deletions(-)
diff --git a/tools/testing/selftests/kvm/s390x/memop.c b/tools/testing/selftests/kvm/s390x/memop.c index c5fec84ef3c2..06e0dfb4006f 100644 --- a/tools/testing/selftests/kvm/s390x/memop.c +++ b/tools/testing/selftests/kvm/s390x/memop.c @@ -9,6 +9,7 @@ #include <stdlib.h> #include <string.h> #include <sys/ioctl.h> +#include <pthread.h>
#include <linux/bits.h>
@@ -26,6 +27,7 @@ enum mop_target { enum mop_access_mode { READ, WRITE, + CMPXCHG, };
struct mop_desc { @@ -44,13 +46,16 @@ struct mop_desc { enum mop_access_mode mode; void *buf; uint32_t sida_offset; + void *old; + uint8_t old_value[16]; + bool *cmpxchg_success; uint8_t ar; uint8_t key; };
const uint8_t NO_KEY = 0xff;
-static struct kvm_s390_mem_op ksmo_from_desc(const struct mop_desc *desc) +static struct kvm_s390_mem_op ksmo_from_desc(struct mop_desc *desc) { struct kvm_s390_mem_op ksmo = { .gaddr = (uintptr_t)desc->gaddr, @@ -77,6 +82,11 @@ static struct kvm_s390_mem_op ksmo_from_desc(const struct mop_desc *desc) ksmo.op = KVM_S390_MEMOP_ABSOLUTE_READ; if (desc->mode == WRITE) ksmo.op = KVM_S390_MEMOP_ABSOLUTE_WRITE; + if (desc->mode == CMPXCHG) { + ksmo.op = KVM_S390_MEMOP_ABSOLUTE_CMPXCHG; + ksmo.old_addr = (uint64_t)desc->old; + memcpy(desc->old_value, desc->old, desc->size); + } break; case INVALID: ksmo.op = -1; @@ -135,9 +145,13 @@ static void print_memop(struct kvm_vcpu *vcpu, const struct kvm_s390_mem_op *ksm case KVM_S390_MEMOP_ABSOLUTE_WRITE: printf("ABSOLUTE, WRITE, "); break; + case KVM_S390_MEMOP_ABSOLUTE_CMPXCHG: + printf("ABSOLUTE, CMPXCHG, "); + break; } - printf("gaddr=%llu, size=%u, buf=%llu, ar=%u, key=%u", - ksmo->gaddr, ksmo->size, ksmo->buf, ksmo->ar, ksmo->key); + printf("gaddr=%llu, size=%u, buf=%llu, ar=%u, key=%u, old_addr=%llx", + ksmo->gaddr, ksmo->size, ksmo->buf, ksmo->ar, ksmo->key, + ksmo->old_addr); if (ksmo->flags & KVM_S390_MEMOP_F_CHECK_ONLY) printf(", CHECK_ONLY"); if (ksmo->flags & KVM_S390_MEMOP_F_INJECT_EXCEPTION) @@ -147,24 +161,31 @@ static void print_memop(struct kvm_vcpu *vcpu, const struct kvm_s390_mem_op *ksm puts(")"); }
-static void memop_ioctl(struct test_info info, struct kvm_s390_mem_op *ksmo) +static int err_memop_ioctl(struct test_info info, struct kvm_s390_mem_op *ksmo, + struct mop_desc *desc) { struct kvm_vcpu *vcpu = info.vcpu;
if (!vcpu) - vm_ioctl(info.vm, KVM_S390_MEM_OP, ksmo); + return __vm_ioctl(info.vm, KVM_S390_MEM_OP, ksmo); else - vcpu_ioctl(vcpu, KVM_S390_MEM_OP, ksmo); + return __vcpu_ioctl(vcpu, KVM_S390_MEM_OP, ksmo); }
-static int err_memop_ioctl(struct test_info info, struct kvm_s390_mem_op *ksmo) +static void memop_ioctl(struct test_info info, struct kvm_s390_mem_op *ksmo, + struct mop_desc *desc) { - struct kvm_vcpu *vcpu = info.vcpu; + int r; + + r = err_memop_ioctl(info, ksmo, desc); + if (ksmo->op == KVM_S390_MEMOP_ABSOLUTE_CMPXCHG) { + if (desc->cmpxchg_success) { + int diff = memcmp(desc->old_value, desc->old, desc->size); + *desc->cmpxchg_success = !diff; + } + } + TEST_ASSERT(!r, __KVM_IOCTL_ERROR("KVM_S390_MEM_OP", r));
- if (!vcpu) - return __vm_ioctl(info.vm, KVM_S390_MEM_OP, ksmo); - else - return __vcpu_ioctl(vcpu, KVM_S390_MEM_OP, ksmo); }
#define MEMOP(err, info_p, mop_target_p, access_mode_p, buf_p, size_p, ...) \ @@ -187,7 +208,7 @@ static int err_memop_ioctl(struct test_info info, struct kvm_s390_mem_op *ksmo) } \ __ksmo = ksmo_from_desc(&__desc); \ print_memop(__info.vcpu, &__ksmo); \ - err##memop_ioctl(__info, &__ksmo); \ + err##memop_ioctl(__info, &__ksmo, &__desc); \ })
#define MOP(...) MEMOP(, __VA_ARGS__) @@ -201,6 +222,8 @@ static int err_memop_ioctl(struct test_info info, struct kvm_s390_mem_op *ksmo) #define AR(a) ._ar = 1, .ar = (a) #define KEY(a) .f_key = 1, .key = (a) #define INJECT .f_inject = 1 +#define CMPXCHG_OLD(o) .old = (o) +#define CMPXCHG_SUCCESS(s) .cmpxchg_success = (s)
#define CHECK_N_DO(f, ...) ({ f(__VA_ARGS__, CHECK_ONLY); f(__VA_ARGS__); })
@@ -210,8 +233,8 @@ static int err_memop_ioctl(struct test_info info, struct kvm_s390_mem_op *ksmo) #define CR0_FETCH_PROTECTION_OVERRIDE (1UL << (63 - 38)) #define CR0_STORAGE_PROTECTION_OVERRIDE (1UL << (63 - 39))
-static uint8_t mem1[65536]; -static uint8_t mem2[65536]; +static uint8_t __aligned(PAGE_SIZE) mem1[65536]; +static uint8_t __aligned(PAGE_SIZE) mem2[65536];
struct test_default { struct kvm_vm *kvm_vm; @@ -243,6 +266,8 @@ enum stage { STAGE_SKEYS_SET, /* Guest copied memory (locations up to test case) */ STAGE_COPIED, + /* End of guest code reached */ + STAGE_DONE, };
#define HOST_SYNC(info_p, stage) \ @@ -254,6 +279,9 @@ enum stage { \ vcpu_run(__vcpu); \ get_ucall(__vcpu, &uc); \ + if (uc.cmd == UCALL_ABORT) { \ + REPORT_GUEST_ASSERT_2(uc, "hints: %lu, %lu"); \ + } \ ASSERT_EQ(uc.cmd, UCALL_SYNC); \ ASSERT_EQ(uc.args[1], __stage); \ }) \ @@ -293,6 +321,44 @@ static void default_read(struct test_info copy_cpu, struct test_info mop_cpu, ASSERT_MEM_EQ(mem1, mem2, size); }
+static void default_cmpxchg(struct test_default *test, uint8_t key) +{ + for (int size = 1; size <= 16; size *= 2) { + for (int offset = 0; offset < 16; offset += size) { + uint8_t __aligned(16) new[16] = {}; + uint8_t __aligned(16) old[16]; + bool succ; + + prepare_mem12(); + default_write_read(test->vcpu, test->vcpu, LOGICAL, 16, NO_KEY); + + memcpy(&old, mem1, 16); + MOP(test->vm, ABSOLUTE, CMPXCHG, new + offset, + size, GADDR_V(mem1 + offset), + CMPXCHG_OLD(old + offset), + CMPXCHG_SUCCESS(&succ), KEY(key)); + HOST_SYNC(test->vcpu, STAGE_COPIED); + MOP(test->vm, ABSOLUTE, READ, mem2, 16, GADDR_V(mem2)); + TEST_ASSERT(succ, "exchange of values should succeed"); + memcpy(mem1 + offset, new + offset, size); + ASSERT_MEM_EQ(mem1, mem2, 16); + + memcpy(&old, mem1, 16); + new[offset]++; + old[offset]++; + MOP(test->vm, ABSOLUTE, CMPXCHG, new + offset, + size, GADDR_V(mem1 + offset), + CMPXCHG_OLD(old + offset), + CMPXCHG_SUCCESS(&succ), KEY(key)); + HOST_SYNC(test->vcpu, STAGE_COPIED); + MOP(test->vm, ABSOLUTE, READ, mem2, 16, GADDR_V(mem2)); + TEST_ASSERT(!succ, "exchange of values should not succeed"); + ASSERT_MEM_EQ(mem1, mem2, 16); + ASSERT_MEM_EQ(&old, mem1, 16); + } + } +} + static void guest_copy(void) { GUEST_SYNC(STAGE_INITED); @@ -377,6 +443,250 @@ static void test_copy_key(void) kvm_vm_free(t.kvm_vm); }
+static void test_cmpxchg_key(void) +{ + struct test_default t = test_default_init(guest_copy_key); + + HOST_SYNC(t.vcpu, STAGE_SKEYS_SET); + + default_cmpxchg(&t, NO_KEY); + default_cmpxchg(&t, 0); + default_cmpxchg(&t, 9); + + kvm_vm_free(t.kvm_vm); +} + +static __uint128_t cut_to_size(int size, __uint128_t val) +{ + switch (size) { + case 1: + return (uint8_t)val; + case 2: + return (uint16_t)val; + case 4: + return (uint32_t)val; + case 8: + return (uint64_t)val; + case 16: + return val; + } + GUEST_ASSERT_1(false, "Invalid size"); + return 0; +} + +static bool popcount_eq(__uint128_t a, __uint128_t b) +{ + unsigned int count_a, count_b; + + count_a = __builtin_popcountl((uint64_t)(a >> 64)) + + __builtin_popcountl((uint64_t)a); + count_b = __builtin_popcountl((uint64_t)(b >> 64)) + + __builtin_popcountl((uint64_t)b); + return count_a == count_b; +} + +static __uint128_t rotate(int size, __uint128_t val, int amount) +{ + unsigned int bits = size * 8; + + amount = (amount + bits) % bits; + val = cut_to_size(size, val); + return (val << (bits - amount)) | (val >> amount); +} + +const unsigned int max_block = 16; + +static void choose_block(bool guest, int i, int *size, int *offset) +{ + unsigned int rand; + + rand = i; + if (guest) { + rand = rand * 19 + 11; + *size = 1 << ((rand % 3) + 2); + rand = rand * 19 + 11; + *offset = (rand % max_block) & ~(*size - 1); + } else { + rand = rand * 17 + 5; + *size = 1 << (rand % 5); + rand = rand * 17 + 5; + *offset = (rand % max_block) & ~(*size - 1); + } +} + +static __uint128_t permutate_bits(bool guest, int i, int size, __uint128_t old) +{ + unsigned int rand; + bool swap; + + rand = i; + rand = rand * 3 + 1; + if (guest) + rand = rand * 3 + 1; + swap = rand % 2 == 0; + if (swap) { + int i, j; + __uint128_t new; + uint8_t byte0, byte1; + + rand = rand * 3 + 1; + i = rand % size; + rand = rand * 3 + 1; + j = rand % size; + if (i == j) + return old; + new = rotate(16, old, i * 8); + byte0 = new & 0xff; + new &= ~0xff; + new = rotate(16, new, -i * 8); + new = rotate(16, new, j * 8); + byte1 = new & 0xff; + new = (new & ~0xff) | byte0; + new = rotate(16, new, -j * 8); + new = rotate(16, new, i * 8); + new = new | byte1; + new = rotate(16, new, -i * 8); + return new; + } else { + int amount; + + rand = rand * 3 + 1; + amount = rand % (size * 8); + return rotate(size, old, amount); + } +} + +static bool _cmpxchg(int size, void *target, __uint128_t *old_addr, __uint128_t new) +{ + bool ret; + + switch (size) { + case 4: { + uint32_t old = *old_addr; + + asm volatile ("cs %[old],%[new],%[address]" + : [old] "+d" (old), + [address] "+Q" (*(uint32_t *)(target)) + : [new] "d" ((uint32_t)new) + : "cc" + ); + ret = old == (uint32_t)*old_addr; + *old_addr = old; + return ret; + } + case 8: { + uint64_t old = *old_addr; + + asm volatile ("csg %[old],%[new],%[address]" + : [old] "+d" (old), + [address] "+Q" (*(uint64_t *)(target)) + : [new] "d" ((uint64_t)new) + : "cc" + ); + ret = old == (uint64_t)*old_addr; + *old_addr = old; + return ret; + } + case 16: { + __uint128_t old = *old_addr; + + asm volatile ("cdsg %[old],%[new],%[address]" + : [old] "+d" (old), + [address] "+Q" (*(__uint128_t *)(target)) + : [new] "d" (new) + : "cc" + ); + ret = old == *old_addr; + *old_addr = old; + return ret; + } + } + GUEST_ASSERT_1(false, "Invalid size"); + return 0; +} + +const unsigned int cmpxchg_iter_outer = 100, cmpxchg_iter_inner = 10000; + +static void guest_cmpxchg_key(void) +{ + int size, offset; + __uint128_t old, new; + + set_storage_key_range(mem1, max_block, 0x10); + set_storage_key_range(mem2, max_block, 0x10); + GUEST_SYNC(STAGE_SKEYS_SET); + + for (int i = 0; i < cmpxchg_iter_outer; i++) { + do { + old = 1; + } while (!_cmpxchg(16, mem1, &old, 0)); + for (int j = 0; j < cmpxchg_iter_inner; j++) { + choose_block(true, i + j, &size, &offset); + do { + new = permutate_bits(true, i + j, size, old); + } while (!_cmpxchg(size, mem2 + offset, &old, new)); + } + } + + GUEST_SYNC(STAGE_DONE); +} + +static void *run_guest(void *data) +{ + struct test_info *info = data; + + HOST_SYNC(*info, STAGE_DONE); + return NULL; +} + +static char *quad_to_char(__uint128_t *quad, int size) +{ + return ((char *)quad) + (sizeof(*quad) - size); +} + +static void test_cmpxchg_key_concurrent(void) +{ + struct test_default t = test_default_init(guest_cmpxchg_key); + int size, offset; + __uint128_t old, new; + bool success; + pthread_t thread; + + HOST_SYNC(t.vcpu, STAGE_SKEYS_SET); + prepare_mem12(); + MOP(t.vcpu, LOGICAL, WRITE, mem1, max_block, GADDR_V(mem2)); + pthread_create(&thread, NULL, run_guest, &t.vcpu); + + for (int i = 0; i < cmpxchg_iter_outer; i++) { + do { + old = 0; + new = 1; + MOP(t.vm, ABSOLUTE, CMPXCHG, &new, + sizeof(new), GADDR_V(mem1), + CMPXCHG_OLD(&old), + CMPXCHG_SUCCESS(&success), KEY(1)); + } while (!success); + for (int j = 0; j < cmpxchg_iter_inner; j++) { + choose_block(false, i + j, &size, &offset); + do { + new = permutate_bits(false, i + j, size, old); + MOP(t.vm, ABSOLUTE, CMPXCHG, quad_to_char(&new, size), + size, GADDR_V(mem2 + offset), + CMPXCHG_OLD(quad_to_char(&old, size)), + CMPXCHG_SUCCESS(&success), KEY(1)); + } while (!success); + } + } + + pthread_join(thread, NULL); + + MOP(t.vcpu, LOGICAL, READ, mem2, max_block, GADDR_V(mem2)); + TEST_ASSERT(popcount_eq(*(__uint128_t *)mem1, *(__uint128_t *)mem2), + "Must retain number of set bits"); + + kvm_vm_free(t.kvm_vm); +} + static void guest_copy_key_fetch_prot(void) { /* @@ -457,6 +767,24 @@ static void test_errors_key(void) kvm_vm_free(t.kvm_vm); }
+static void test_errors_cmpxchg_key(void) +{ + struct test_default t = test_default_init(guest_copy_key_fetch_prot); + int i; + + HOST_SYNC(t.vcpu, STAGE_INITED); + HOST_SYNC(t.vcpu, STAGE_SKEYS_SET); + + for (i = 1; i <= 16; i *= 2) { + __uint128_t old = 0; + + ERR_PROT_MOP(t.vm, ABSOLUTE, CMPXCHG, mem2, i, GADDR_V(mem2), + CMPXCHG_OLD(&old), KEY(2)); + } + + kvm_vm_free(t.kvm_vm); +} + static void test_termination(void) { struct test_default t = test_default_init(guest_error_key); @@ -692,6 +1020,38 @@ static void test_errors(void) kvm_vm_free(t.kvm_vm); }
+static void test_errors_cmpxchg(void) +{ + struct test_default t = test_default_init(guest_idle); + __uint128_t old; + int rv, i, power = 1; + + HOST_SYNC(t.vcpu, STAGE_INITED); + + for (i = 0; i < 32; i++) { + if (i == power) { + power *= 2; + continue; + } + rv = ERR_MOP(t.vm, ABSOLUTE, CMPXCHG, mem1, i, GADDR_V(mem1), + CMPXCHG_OLD(&old)); + TEST_ASSERT(rv == -1 && errno == EINVAL, + "ioctl allows bad size for cmpxchg"); + } + for (i = 1; i <= 16; i *= 2) { + rv = ERR_MOP(t.vm, ABSOLUTE, CMPXCHG, mem1, i, GADDR((void *)~0xfffUL), + CMPXCHG_OLD(&old)); + TEST_ASSERT(rv > 0, "ioctl allows bad guest address for cmpxchg"); + } + for (i = 2; i <= 16; i *= 2) { + rv = ERR_MOP(t.vm, ABSOLUTE, CMPXCHG, mem1, i, GADDR_V(mem1 + 1), + CMPXCHG_OLD(&old)); + TEST_ASSERT(rv == -1 && errno == EINVAL, + "ioctl allows bad alignment for cmpxchg"); + } + + kvm_vm_free(t.kvm_vm); +}
int main(int argc, char *argv[]) { @@ -720,6 +1080,16 @@ int main(int argc, char *argv[]) .test = test_copy_key, .requirements_met = extension_cap > 0, }, + { + .name = "cmpxchg with storage keys", + .test = test_cmpxchg_key, + .requirements_met = extension_cap & 0x2, + }, + { + .name = "concurrently cmpxchg with storage keys", + .test = test_cmpxchg_key_concurrent, + .requirements_met = extension_cap & 0x2, + }, { .name = "copy with key storage protection override", .test = test_copy_key_storage_prot_override, @@ -740,6 +1110,16 @@ int main(int argc, char *argv[]) .test = test_errors_key, .requirements_met = extension_cap > 0, }, + { + .name = "error checks for cmpxchg with key", + .test = test_errors_cmpxchg_key, + .requirements_met = extension_cap & 0x2, + }, + { + .name = "error checks for cmpxchg", + .test = test_errors_cmpxchg, + .requirements_met = extension_cap & 0x2, + }, { .name = "termination", .test = test_termination,
On 2/6/23 17:46, Janis Schoetterl-Glausch wrote:
Test successful exchange, unsuccessful exchange, storage key protection and invalid arguments.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com
Acked-by: Janosch Frank frankja@linux.ibm.com
It's getting progressively harder to understand this test which isn't your fault. If this continues I'll need to have a closer look at the general kvm kselftest harness or try to find / write a new harness.
On 2/6/23 17:46, Janis Schoetterl-Glausch wrote:
Test successful exchange, unsuccessful exchange, storage key protection and invalid arguments.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com
I've had checkpatch have a look at this and there's an extra \n and a else after a return as well as a false-positive.
Could you provide me with a fixed patch as a reply to this mail?
Test successful exchange, unsuccessful exchange, storage key protection and invalid arguments.
Signed-off-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Acked-by: Janosch Frank frankja@linux.ibm.com --- tools/testing/selftests/kvm/s390x/memop.c | 407 +++++++++++++++++++++- 1 file changed, 392 insertions(+), 15 deletions(-)
diff --git a/tools/testing/selftests/kvm/s390x/memop.c b/tools/testing/selftests/kvm/s390x/memop.c index c5fec84ef3c2..8e4b94d7b8dd 100644 --- a/tools/testing/selftests/kvm/s390x/memop.c +++ b/tools/testing/selftests/kvm/s390x/memop.c @@ -9,6 +9,7 @@ #include <stdlib.h> #include <string.h> #include <sys/ioctl.h> +#include <pthread.h>
#include <linux/bits.h>
@@ -26,6 +27,7 @@ enum mop_target { enum mop_access_mode { READ, WRITE, + CMPXCHG, };
struct mop_desc { @@ -44,13 +46,16 @@ struct mop_desc { enum mop_access_mode mode; void *buf; uint32_t sida_offset; + void *old; + uint8_t old_value[16]; + bool *cmpxchg_success; uint8_t ar; uint8_t key; };
const uint8_t NO_KEY = 0xff;
-static struct kvm_s390_mem_op ksmo_from_desc(const struct mop_desc *desc) +static struct kvm_s390_mem_op ksmo_from_desc(struct mop_desc *desc) { struct kvm_s390_mem_op ksmo = { .gaddr = (uintptr_t)desc->gaddr, @@ -77,6 +82,11 @@ static struct kvm_s390_mem_op ksmo_from_desc(const struct mop_desc *desc) ksmo.op = KVM_S390_MEMOP_ABSOLUTE_READ; if (desc->mode == WRITE) ksmo.op = KVM_S390_MEMOP_ABSOLUTE_WRITE; + if (desc->mode == CMPXCHG) { + ksmo.op = KVM_S390_MEMOP_ABSOLUTE_CMPXCHG; + ksmo.old_addr = (uint64_t)desc->old; + memcpy(desc->old_value, desc->old, desc->size); + } break; case INVALID: ksmo.op = -1; @@ -135,9 +145,13 @@ static void print_memop(struct kvm_vcpu *vcpu, const struct kvm_s390_mem_op *ksm case KVM_S390_MEMOP_ABSOLUTE_WRITE: printf("ABSOLUTE, WRITE, "); break; + case KVM_S390_MEMOP_ABSOLUTE_CMPXCHG: + printf("ABSOLUTE, CMPXCHG, "); + break; } - printf("gaddr=%llu, size=%u, buf=%llu, ar=%u, key=%u", - ksmo->gaddr, ksmo->size, ksmo->buf, ksmo->ar, ksmo->key); + printf("gaddr=%llu, size=%u, buf=%llu, ar=%u, key=%u, old_addr=%llx", + ksmo->gaddr, ksmo->size, ksmo->buf, ksmo->ar, ksmo->key, + ksmo->old_addr); if (ksmo->flags & KVM_S390_MEMOP_F_CHECK_ONLY) printf(", CHECK_ONLY"); if (ksmo->flags & KVM_S390_MEMOP_F_INJECT_EXCEPTION) @@ -147,24 +161,30 @@ static void print_memop(struct kvm_vcpu *vcpu, const struct kvm_s390_mem_op *ksm puts(")"); }
-static void memop_ioctl(struct test_info info, struct kvm_s390_mem_op *ksmo) +static int err_memop_ioctl(struct test_info info, struct kvm_s390_mem_op *ksmo, + struct mop_desc *desc) { struct kvm_vcpu *vcpu = info.vcpu;
if (!vcpu) - vm_ioctl(info.vm, KVM_S390_MEM_OP, ksmo); + return __vm_ioctl(info.vm, KVM_S390_MEM_OP, ksmo); else - vcpu_ioctl(vcpu, KVM_S390_MEM_OP, ksmo); + return __vcpu_ioctl(vcpu, KVM_S390_MEM_OP, ksmo); }
-static int err_memop_ioctl(struct test_info info, struct kvm_s390_mem_op *ksmo) +static void memop_ioctl(struct test_info info, struct kvm_s390_mem_op *ksmo, + struct mop_desc *desc) { - struct kvm_vcpu *vcpu = info.vcpu; + int r;
- if (!vcpu) - return __vm_ioctl(info.vm, KVM_S390_MEM_OP, ksmo); - else - return __vcpu_ioctl(vcpu, KVM_S390_MEM_OP, ksmo); + r = err_memop_ioctl(info, ksmo, desc); + if (ksmo->op == KVM_S390_MEMOP_ABSOLUTE_CMPXCHG) { + if (desc->cmpxchg_success) { + int diff = memcmp(desc->old_value, desc->old, desc->size); + *desc->cmpxchg_success = !diff; + } + } + TEST_ASSERT(!r, __KVM_IOCTL_ERROR("KVM_S390_MEM_OP", r)); }
#define MEMOP(err, info_p, mop_target_p, access_mode_p, buf_p, size_p, ...) \ @@ -187,7 +207,7 @@ static int err_memop_ioctl(struct test_info info, struct kvm_s390_mem_op *ksmo) } \ __ksmo = ksmo_from_desc(&__desc); \ print_memop(__info.vcpu, &__ksmo); \ - err##memop_ioctl(__info, &__ksmo); \ + err##memop_ioctl(__info, &__ksmo, &__desc); \ })
#define MOP(...) MEMOP(, __VA_ARGS__) @@ -201,6 +221,8 @@ static int err_memop_ioctl(struct test_info info, struct kvm_s390_mem_op *ksmo) #define AR(a) ._ar = 1, .ar = (a) #define KEY(a) .f_key = 1, .key = (a) #define INJECT .f_inject = 1 +#define CMPXCHG_OLD(o) .old = (o) +#define CMPXCHG_SUCCESS(s) .cmpxchg_success = (s)
#define CHECK_N_DO(f, ...) ({ f(__VA_ARGS__, CHECK_ONLY); f(__VA_ARGS__); })
@@ -210,8 +232,8 @@ static int err_memop_ioctl(struct test_info info, struct kvm_s390_mem_op *ksmo) #define CR0_FETCH_PROTECTION_OVERRIDE (1UL << (63 - 38)) #define CR0_STORAGE_PROTECTION_OVERRIDE (1UL << (63 - 39))
-static uint8_t mem1[65536]; -static uint8_t mem2[65536]; +static uint8_t __aligned(PAGE_SIZE) mem1[65536]; +static uint8_t __aligned(PAGE_SIZE) mem2[65536];
struct test_default { struct kvm_vm *kvm_vm; @@ -243,6 +265,8 @@ enum stage { STAGE_SKEYS_SET, /* Guest copied memory (locations up to test case) */ STAGE_COPIED, + /* End of guest code reached */ + STAGE_DONE, };
#define HOST_SYNC(info_p, stage) \ @@ -254,6 +278,9 @@ enum stage { \ vcpu_run(__vcpu); \ get_ucall(__vcpu, &uc); \ + if (uc.cmd == UCALL_ABORT) { \ + REPORT_GUEST_ASSERT_2(uc, "hints: %lu, %lu"); \ + } \ ASSERT_EQ(uc.cmd, UCALL_SYNC); \ ASSERT_EQ(uc.args[1], __stage); \ }) \ @@ -293,6 +320,44 @@ static void default_read(struct test_info copy_cpu, struct test_info mop_cpu, ASSERT_MEM_EQ(mem1, mem2, size); }
+static void default_cmpxchg(struct test_default *test, uint8_t key) +{ + for (int size = 1; size <= 16; size *= 2) { + for (int offset = 0; offset < 16; offset += size) { + uint8_t __aligned(16) new[16] = {}; + uint8_t __aligned(16) old[16]; + bool succ; + + prepare_mem12(); + default_write_read(test->vcpu, test->vcpu, LOGICAL, 16, NO_KEY); + + memcpy(&old, mem1, 16); + MOP(test->vm, ABSOLUTE, CMPXCHG, new + offset, + size, GADDR_V(mem1 + offset), + CMPXCHG_OLD(old + offset), + CMPXCHG_SUCCESS(&succ), KEY(key)); + HOST_SYNC(test->vcpu, STAGE_COPIED); + MOP(test->vm, ABSOLUTE, READ, mem2, 16, GADDR_V(mem2)); + TEST_ASSERT(succ, "exchange of values should succeed"); + memcpy(mem1 + offset, new + offset, size); + ASSERT_MEM_EQ(mem1, mem2, 16); + + memcpy(&old, mem1, 16); + new[offset]++; + old[offset]++; + MOP(test->vm, ABSOLUTE, CMPXCHG, new + offset, + size, GADDR_V(mem1 + offset), + CMPXCHG_OLD(old + offset), + CMPXCHG_SUCCESS(&succ), KEY(key)); + HOST_SYNC(test->vcpu, STAGE_COPIED); + MOP(test->vm, ABSOLUTE, READ, mem2, 16, GADDR_V(mem2)); + TEST_ASSERT(!succ, "exchange of values should not succeed"); + ASSERT_MEM_EQ(mem1, mem2, 16); + ASSERT_MEM_EQ(&old, mem1, 16); + } + } +} + static void guest_copy(void) { GUEST_SYNC(STAGE_INITED); @@ -377,6 +442,248 @@ static void test_copy_key(void) kvm_vm_free(t.kvm_vm); }
+static void test_cmpxchg_key(void) +{ + struct test_default t = test_default_init(guest_copy_key); + + HOST_SYNC(t.vcpu, STAGE_SKEYS_SET); + + default_cmpxchg(&t, NO_KEY); + default_cmpxchg(&t, 0); + default_cmpxchg(&t, 9); + + kvm_vm_free(t.kvm_vm); +} + +static __uint128_t cut_to_size(int size, __uint128_t val) +{ + switch (size) { + case 1: + return (uint8_t)val; + case 2: + return (uint16_t)val; + case 4: + return (uint32_t)val; + case 8: + return (uint64_t)val; + case 16: + return val; + } + GUEST_ASSERT_1(false, "Invalid size"); + return 0; +} + +static bool popcount_eq(__uint128_t a, __uint128_t b) +{ + unsigned int count_a, count_b; + + count_a = __builtin_popcountl((uint64_t)(a >> 64)) + + __builtin_popcountl((uint64_t)a); + count_b = __builtin_popcountl((uint64_t)(b >> 64)) + + __builtin_popcountl((uint64_t)b); + return count_a == count_b; +} + +static __uint128_t rotate(int size, __uint128_t val, int amount) +{ + unsigned int bits = size * 8; + + amount = (amount + bits) % bits; + val = cut_to_size(size, val); + return (val << (bits - amount)) | (val >> amount); +} + +const unsigned int max_block = 16; + +static void choose_block(bool guest, int i, int *size, int *offset) +{ + unsigned int rand; + + rand = i; + if (guest) { + rand = rand * 19 + 11; + *size = 1 << ((rand % 3) + 2); + rand = rand * 19 + 11; + *offset = (rand % max_block) & ~(*size - 1); + } else { + rand = rand * 17 + 5; + *size = 1 << (rand % 5); + rand = rand * 17 + 5; + *offset = (rand % max_block) & ~(*size - 1); + } +} + +static __uint128_t permutate_bits(bool guest, int i, int size, __uint128_t old) +{ + unsigned int rand; + int amount; + bool swap; + + rand = i; + rand = rand * 3 + 1; + if (guest) + rand = rand * 3 + 1; + swap = rand % 2 == 0; + if (swap) { + int i, j; + __uint128_t new; + uint8_t byte0, byte1; + + rand = rand * 3 + 1; + i = rand % size; + rand = rand * 3 + 1; + j = rand % size; + if (i == j) + return old; + new = rotate(16, old, i * 8); + byte0 = new & 0xff; + new &= ~0xff; + new = rotate(16, new, -i * 8); + new = rotate(16, new, j * 8); + byte1 = new & 0xff; + new = (new & ~0xff) | byte0; + new = rotate(16, new, -j * 8); + new = rotate(16, new, i * 8); + new = new | byte1; + new = rotate(16, new, -i * 8); + return new; + } + rand = rand * 3 + 1; + amount = rand % (size * 8); + return rotate(size, old, amount); +} + +static bool _cmpxchg(int size, void *target, __uint128_t *old_addr, __uint128_t new) +{ + bool ret; + + switch (size) { + case 4: { + uint32_t old = *old_addr; + + asm volatile ("cs %[old],%[new],%[address]" + : [old] "+d" (old), + [address] "+Q" (*(uint32_t *)(target)) + : [new] "d" ((uint32_t)new) + : "cc" + ); + ret = old == (uint32_t)*old_addr; + *old_addr = old; + return ret; + } + case 8: { + uint64_t old = *old_addr; + + asm volatile ("csg %[old],%[new],%[address]" + : [old] "+d" (old), + [address] "+Q" (*(uint64_t *)(target)) + : [new] "d" ((uint64_t)new) + : "cc" + ); + ret = old == (uint64_t)*old_addr; + *old_addr = old; + return ret; + } + case 16: { + __uint128_t old = *old_addr; + + asm volatile ("cdsg %[old],%[new],%[address]" + : [old] "+d" (old), + [address] "+Q" (*(__uint128_t *)(target)) + : [new] "d" (new) + : "cc" + ); + ret = old == *old_addr; + *old_addr = old; + return ret; + } + } + GUEST_ASSERT_1(false, "Invalid size"); + return 0; +} + +const unsigned int cmpxchg_iter_outer = 100, cmpxchg_iter_inner = 10000; + +static void guest_cmpxchg_key(void) +{ + int size, offset; + __uint128_t old, new; + + set_storage_key_range(mem1, max_block, 0x10); + set_storage_key_range(mem2, max_block, 0x10); + GUEST_SYNC(STAGE_SKEYS_SET); + + for (int i = 0; i < cmpxchg_iter_outer; i++) { + do { + old = 1; + } while (!_cmpxchg(16, mem1, &old, 0)); + for (int j = 0; j < cmpxchg_iter_inner; j++) { + choose_block(true, i + j, &size, &offset); + do { + new = permutate_bits(true, i + j, size, old); + } while (!_cmpxchg(size, mem2 + offset, &old, new)); + } + } + + GUEST_SYNC(STAGE_DONE); +} + +static void *run_guest(void *data) +{ + struct test_info *info = data; + + HOST_SYNC(*info, STAGE_DONE); + return NULL; +} + +static char *quad_to_char(__uint128_t *quad, int size) +{ + return ((char *)quad) + (sizeof(*quad) - size); +} + +static void test_cmpxchg_key_concurrent(void) +{ + struct test_default t = test_default_init(guest_cmpxchg_key); + int size, offset; + __uint128_t old, new; + bool success; + pthread_t thread; + + HOST_SYNC(t.vcpu, STAGE_SKEYS_SET); + prepare_mem12(); + MOP(t.vcpu, LOGICAL, WRITE, mem1, max_block, GADDR_V(mem2)); + pthread_create(&thread, NULL, run_guest, &t.vcpu); + + for (int i = 0; i < cmpxchg_iter_outer; i++) { + do { + old = 0; + new = 1; + MOP(t.vm, ABSOLUTE, CMPXCHG, &new, + sizeof(new), GADDR_V(mem1), + CMPXCHG_OLD(&old), + CMPXCHG_SUCCESS(&success), KEY(1)); + } while (!success); + for (int j = 0; j < cmpxchg_iter_inner; j++) { + choose_block(false, i + j, &size, &offset); + do { + new = permutate_bits(false, i + j, size, old); + MOP(t.vm, ABSOLUTE, CMPXCHG, quad_to_char(&new, size), + size, GADDR_V(mem2 + offset), + CMPXCHG_OLD(quad_to_char(&old, size)), + CMPXCHG_SUCCESS(&success), KEY(1)); + } while (!success); + } + } + + pthread_join(thread, NULL); + + MOP(t.vcpu, LOGICAL, READ, mem2, max_block, GADDR_V(mem2)); + TEST_ASSERT(popcount_eq(*(__uint128_t *)mem1, *(__uint128_t *)mem2), + "Must retain number of set bits"); + + kvm_vm_free(t.kvm_vm); +} + static void guest_copy_key_fetch_prot(void) { /* @@ -457,6 +764,24 @@ static void test_errors_key(void) kvm_vm_free(t.kvm_vm); }
+static void test_errors_cmpxchg_key(void) +{ + struct test_default t = test_default_init(guest_copy_key_fetch_prot); + int i; + + HOST_SYNC(t.vcpu, STAGE_INITED); + HOST_SYNC(t.vcpu, STAGE_SKEYS_SET); + + for (i = 1; i <= 16; i *= 2) { + __uint128_t old = 0; + + ERR_PROT_MOP(t.vm, ABSOLUTE, CMPXCHG, mem2, i, GADDR_V(mem2), + CMPXCHG_OLD(&old), KEY(2)); + } + + kvm_vm_free(t.kvm_vm); +} + static void test_termination(void) { struct test_default t = test_default_init(guest_error_key); @@ -692,6 +1017,38 @@ static void test_errors(void) kvm_vm_free(t.kvm_vm); }
+static void test_errors_cmpxchg(void) +{ + struct test_default t = test_default_init(guest_idle); + __uint128_t old; + int rv, i, power = 1; + + HOST_SYNC(t.vcpu, STAGE_INITED); + + for (i = 0; i < 32; i++) { + if (i == power) { + power *= 2; + continue; + } + rv = ERR_MOP(t.vm, ABSOLUTE, CMPXCHG, mem1, i, GADDR_V(mem1), + CMPXCHG_OLD(&old)); + TEST_ASSERT(rv == -1 && errno == EINVAL, + "ioctl allows bad size for cmpxchg"); + } + for (i = 1; i <= 16; i *= 2) { + rv = ERR_MOP(t.vm, ABSOLUTE, CMPXCHG, mem1, i, GADDR((void *)~0xfffUL), + CMPXCHG_OLD(&old)); + TEST_ASSERT(rv > 0, "ioctl allows bad guest address for cmpxchg"); + } + for (i = 2; i <= 16; i *= 2) { + rv = ERR_MOP(t.vm, ABSOLUTE, CMPXCHG, mem1, i, GADDR_V(mem1 + 1), + CMPXCHG_OLD(&old)); + TEST_ASSERT(rv == -1 && errno == EINVAL, + "ioctl allows bad alignment for cmpxchg"); + } + + kvm_vm_free(t.kvm_vm); +}
int main(int argc, char *argv[]) { @@ -720,6 +1077,16 @@ int main(int argc, char *argv[]) .test = test_copy_key, .requirements_met = extension_cap > 0, }, + { + .name = "cmpxchg with storage keys", + .test = test_cmpxchg_key, + .requirements_met = extension_cap & 0x2, + }, + { + .name = "concurrently cmpxchg with storage keys", + .test = test_cmpxchg_key_concurrent, + .requirements_met = extension_cap & 0x2, + }, { .name = "copy with key storage protection override", .test = test_copy_key_storage_prot_override, @@ -740,6 +1107,16 @@ int main(int argc, char *argv[]) .test = test_errors_key, .requirements_met = extension_cap > 0, }, + { + .name = "error checks for cmpxchg with key", + .test = test_errors_cmpxchg_key, + .requirements_met = extension_cap & 0x2, + }, + { + .name = "error checks for cmpxchg", + .test = test_errors_cmpxchg, + .requirements_met = extension_cap & 0x2, + }, { .name = "termination", .test = test_termination,
linux-kselftest-mirror@lists.linaro.org