On Fri, 12 Feb 2021 11:57:46 -0500 Tony Krowiak akrowiak@linux.ibm.com wrote:
This patch fixes a circular locking dependency in the CI introduced by commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated"). The lockdep only occurs when starting a Secure Execution guest. Crypto virtualization (vfio_ap) is not yet supported for SE guests; however, in order to avoid CI errors, this fix is being provided.
The circular lockdep was introduced when the masks in the guest's APCB were taken under the matrix_dev->lock. While the lock is definitely needed to protect the setting/unsetting of the KVM pointer, it is not necessarily critical for setting the masks, so this will not be done under protection of the matrix_dev->lock.
Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated") Cc: stable@vger.kernel.org Signed-off-by: Tony Krowiak akrowiak@linux.ibm.com
drivers/s390/crypto/vfio_ap_ops.c | 78 +++++++++++++++++++------------ 1 file changed, 48 insertions(+), 30 deletions(-)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 41fc2e4135fe..bba0f64aa1f7 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -1028,7 +1028,10 @@ static const struct attribute_group *vfio_ap_mdev_attr_groups[] = {
- @kvm: reference to KVM instance
- Verifies no other mediated matrix device has @kvm and sets a reference to
- it in @matrix_mdev->kvm.
- it in @matrix_mdev->kvm. The matrix_dev->lock must be taken prior to calling
- this function; however, the lock will be temporarily released while updating
- the guest's APCB to avoid a potential circular lock dependency with other
- asynchronous processes.
- Return 0 if no other mediated matrix device has a reference to @kvm;
- otherwise, returns an -EPERM.
@@ -1043,10 +1046,19 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev, return -EPERM; }
- matrix_mdev->kvm = kvm; kvm_get_kvm(kvm);
matrix_mdev->kvm = kvm; kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
if (matrix_mdev->kvm && matrix_mdev->kvm->arch.crypto.crycbd) {
mutex_unlock(&matrix_dev->lock);
kvm_arch_crypto_set_masks(kvm,
matrix_mdev->matrix.apm,
matrix_mdev->matrix.aqm,
matrix_mdev->matrix.adm);
I think in theory kvm_arch_crypto_set_masks() and kvm_arch_crypto_clear_masks() can happen in reverse order.
E.g. vfio_ap_mdev_set_kvm() runs till just before crypto_set_mask() but there the thread gets preempted, then vfio_ap_mdev_unset_kvm() executes up to and including crypto_clear_mask(), then the set_kvm() thread does crypto_set_mask() and then the unset_kvm() thread runs the second critical section.
Please notice that matrix_mdev->kvm is set before crypto_set_mask() is called, so the unset_kvm() thread won't bail out, but it bailing out wouldn't help, because we would still end up not doing the cleanup.
I have to tell you, I never understood our cleanup logic. And with having to drop the lock, things are getting awfully convoluted.
Probably the easiest way to resolving this problem would have been to route PQAP via QEMU. I think that is what Pierre did at some point, but I was against it, because the extra mile seemed unnecessary, and Christian agreed. Back then, I didn't know that the vcpu lock is taken under the kvm lock.
What I'm trying to say. I'm willing to let an imperfect solution slip here, because I don't have a perfect one. I didn't try hard to come up with a perfect solution, but usually I don't have to try hard to see what needs to be done.
I'm curious do you have a plan for dynamic. Because with dynamics amplifies the problem, as crycb is changed in various situations.
mutex_lock(&matrix_dev->lock);
- }
- return 0;
}
@@ -1079,13 +1091,34 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb, return NOTIFY_DONE; }
+/**
- vfio_ap_mdev_unset_kvm
- @matrix_mdev: a matrix mediated device
- Clears the masks in the guest's APCB as well as the reference to KVM from
- @matrix_mdev. The matrix_dev->lock must be taken prior to calling this
- function; however, the lock will be temporarily released while updating
- the guest's APCB to avoid a potential circular lock dependency with other
- asynchronous processes.
- */
static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev) {
- kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
- matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
- vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
- kvm_put_kvm(matrix_mdev->kvm);
- matrix_mdev->kvm = NULL;
- struct kvm *kvm;
- if (matrix_mdev->kvm) {
kvm = matrix_mdev->kvm;
kvm_get_kvm(kvm);
kvm->arch.crypto.pqap_hook = NULL;
You moved setting the hook to NULL before clearing the mask. I can't tell right now if it matters or not. What is the idea? I'm tired.
mutex_unlock(&matrix_dev->lock);
kvm_arch_crypto_clear_masks(kvm);
mutex_lock(&matrix_dev->lock);
kvm_put_kvm(kvm);
vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
if (matrix_mdev->kvm)
kvm_put_kvm(matrix_mdev->kvm);
matrix_mdev->kvm = NULL;
- }
}
static int vfio_ap_mdev_group_notifier(struct notifier_block *nb, @@ -1097,33 +1130,19 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb, if (action != VFIO_GROUP_NOTIFY_SET_KVM) return NOTIFY_OK;
- matrix_mdev = container_of(nb, struct ap_matrix_mdev, group_notifier); mutex_lock(&matrix_dev->lock);
- matrix_mdev = container_of(nb, struct ap_matrix_mdev, group_notifier);
I'm not sure moving this inside the critical section buys us anything, but I'm pretty sure it won't hurt either. What is the idea?
I guess, the only thing that matters is that matrix_mdev is still alive, and since it is freed in vfio_ap_mdev_remove() with the lock not held I can't see how this being protected by the lock will prevent problems.
- if (!data) {
if (matrix_mdev->kvm)
vfio_ap_mdev_unset_kvm(matrix_mdev);
goto notify_done;
- }
- ret = vfio_ap_mdev_set_kvm(matrix_mdev, data);
- if (ret) {
notify_rc = NOTIFY_DONE;
goto notify_done;
- }
- if (!data)
vfio_ap_mdev_unset_kvm(matrix_mdev);
- else
ret = vfio_ap_mdev_set_kvm(matrix_mdev, data);
- /* If there is no CRYCB pointer, then we can't copy the masks */
- if (!matrix_mdev->kvm->arch.crypto.crycbd) {
- if (ret)
I think this is undefined behavior if !data, because ret is uninitialized.
notify_rc = NOTIFY_DONE;
goto notify_done;
- }
- kvm_arch_crypto_set_masks(matrix_mdev->kvm, matrix_mdev->matrix.apm,
matrix_mdev->matrix.aqm,
matrix_mdev->matrix.adm);
-notify_done: mutex_unlock(&matrix_dev->lock);
- return notify_rc;
}
@@ -1258,8 +1277,7 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev) struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
mutex_lock(&matrix_dev->lock);
- if (matrix_mdev->kvm)
vfio_ap_mdev_unset_kvm(matrix_mdev);
vfio_ap_mdev_unset_kvm(matrix_mdev); mutex_unlock(&matrix_dev->lock);
vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,