On 2/10/21 5:53 AM, Cornelia Huck wrote:
On Tue, 9 Feb 2021 14:48:30 -0500 Tony Krowiak <akrowiak@linux.ibm.com> wrote:
This patch fixes a circular locking dependency in the CI introduced by commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated"). The lockdep only occurs when starting a Secure Execution guest. Crypto virtualization (vfio_ap) is not yet supported for SE guests; however, in order to avoid CI errors, this fix is being provided.
The circular lockdep was introduced when the masks in the guest's APCB were set while holding the matrix_dev->lock. While the lock is definitely needed to protect the setting/unsetting of the KVM pointer, it is not strictly required for setting the masks, so the masks will no longer be set under protection of the matrix_dev->lock.
Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated")
Cc: <stable@vger.kernel.org>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
 drivers/s390/crypto/vfio_ap_ops.c | 75 ++++++++++++++++++-------------
 1 file changed, 45 insertions(+), 30 deletions(-)
static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
{
- kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
- matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
- vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
- kvm_put_kvm(matrix_mdev->kvm);
- matrix_mdev->kvm = NULL;
+ if (matrix_mdev->kvm) {
If you're doing setting/unsetting under matrix_dev->lock, is it possible that matrix_mdev->kvm gets unset between here and the next line, as you don't hold the lock?
That is highly unlikely because the only place the matrix_mdev->kvm pointer is cleared is in this function, which is called from only two places: the notifier that handles the VFIO_GROUP_NOTIFY_SET_KVM notification when the KVM pointer is cleared, and the vfio_ap_mdev_release() function, which is called when the mdev fd is closed (i.e., when the guest is shut down). In fact, with the only end-to-end implementation currently available, the notifier callback is never invoked to clear the KVM pointer because the vfio_ap_mdev_release() callback is invoked first and it unregisters the notifier callback.
Having said that, I suppose there is no guarantee that there will not be different userspace clients in the future that do things in a different order. At the very least, it wouldn't hurt to protect against that as you suggest below.
Maybe you could
- grab a reference to kvm while holding the lock
- call the mask handling functions with that kvm reference
- lock again, drop the reference, and do the rest of the processing?
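The three steps above could be sketched roughly as follows. This is a userspace model only, not the vfio_ap driver code: struct fake_kvm, struct fake_mdev, fake_get(), fake_put(), fake_clear_masks() and fake_unset_kvm() are all hypothetical stand-ins, with a pthread mutex playing the role of matrix_dev->lock and a plain counter playing the role of the kvm_get_kvm()/kvm_put_kvm() reference count.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel types. */
struct fake_kvm {
	int refcount;     /* models the kvm_get_kvm()/kvm_put_kvm() count */
	int masks_set;    /* models the masks in the guest's APCB */
	void *pqap_hook;  /* models kvm->arch.crypto.pqap_hook */
};

struct fake_mdev {
	struct fake_kvm *kvm;
};

/* Plays the role of matrix_dev->lock. */
static pthread_mutex_t matrix_lock = PTHREAD_MUTEX_INITIALIZER;

static void fake_get(struct fake_kvm *kvm)
{
	kvm->refcount++;
}

static void fake_put(struct fake_kvm *kvm)
{
	assert(kvm->refcount > 0);
	kvm->refcount--;
}

static void fake_clear_masks(struct fake_kvm *kvm)
{
	kvm->masks_set = 0;
}

/* Returns 1 if this call cleared the pointer, 0 otherwise. */
static int fake_unset_kvm(struct fake_mdev *m)
{
	struct fake_kvm *kvm;
	int cleared = 0;

	/* Step 1: grab a temporary reference while holding the lock. */
	pthread_mutex_lock(&matrix_lock);
	kvm = m->kvm;
	if (kvm)
		fake_get(kvm);
	pthread_mutex_unlock(&matrix_lock);

	if (!kvm)
		return 0;

	/* Step 2: handle the masks through that reference, lock not held. */
	fake_clear_masks(kvm);

	/* Step 3: lock again, drop the temporary reference, do the rest. */
	pthread_mutex_lock(&matrix_lock);
	fake_put(kvm);
	if (m->kvm == kvm) {  /* nobody else cleared it in the meantime */
		kvm->pqap_hook = NULL;
		fake_put(kvm);    /* drop the reference the mdev was holding */
		m->kvm = NULL;
		cleared = 1;
	}
	pthread_mutex_unlock(&matrix_lock);

	return cleared;
}
```

The re-check of m->kvm in step 3 is what covers the window between the two locked regions; whether the real driver needs exactly this shape depends on the lock ordering on the KVM side, which the model does not attempt to reproduce.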
+ kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
+ mutex_lock(&matrix_dev->lock);
+ matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
+ vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
+ kvm_put_kvm(matrix_mdev->kvm);
+ matrix_mdev->kvm = NULL;
+ mutex_unlock(&matrix_dev->lock);
+ }
}