When converting to directly create the vfio_device the mdev driver has to put a vfio_register_emulated_iommu_dev() in the probe() and a pairing vfio_unregister_group_dev() in the remove.
This was missed for gvt, add it.
Cc: stable@vger.kernel.org Fixes: 978cf586ac35 ("drm/i915/gvt: convert to use vfio_register_emulated_iommu_dev") Reported-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com --- drivers/gpu/drm/i915/gvt/kvmgt.c | 1 + 1 file changed, 1 insertion(+)
Should go through Alex's tree.
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index 41bba40feef8f4..9003145adb5a93 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -1615,6 +1615,7 @@ static void intel_vgpu_remove(struct mdev_device *mdev) if (WARN_ON_ONCE(vgpu->attached)) return;
+ vfio_unregister_group_dev(&vgpu->vfio_device); vfio_put_device(&vgpu->vfio_device); }
base-commit: c72e0034e6d4c36322d958b997d11d2627c6056c
From: Jason Gunthorpe Sent: Friday, September 30, 2022 1:49 AM
When converting to directly create the vfio_device the mdev driver has to put a vfio_register_emulated_iommu_dev() in the probe() and a pairing vfio_unregister_group_dev() in the remove.
This was missed for gvt, add it.
Cc: stable@vger.kernel.org Fixes: 978cf586ac35 ("drm/i915/gvt: convert to use vfio_register_emulated_iommu_dev") Reported-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com
Reviewed-by: Kevin Tian kevin.tian@intel.com
On Thu, 29 Sep 2022 14:48:35 -0300 Jason Gunthorpe jgg@nvidia.com wrote:
When converting to directly create the vfio_device the mdev driver has to put a vfio_register_emulated_iommu_dev() in the probe() and a pairing vfio_unregister_group_dev() in the remove.
This was missed for gvt, add it.
Cc: stable@vger.kernel.org Fixes: 978cf586ac35 ("drm/i915/gvt: convert to use vfio_register_emulated_iommu_dev") Reported-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com
drivers/gpu/drm/i915/gvt/kvmgt.c | 1 + 1 file changed, 1 insertion(+)
Should go through Alex's tree.
Applied to vfio next branch for v6.1. Thanks for the quick fix!
Alex
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index 41bba40feef8f4..9003145adb5a93 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -1615,6 +1615,7 @@ static void intel_vgpu_remove(struct mdev_device *mdev) if (WARN_ON_ONCE(vgpu->attached)) return;
- vfio_unregister_group_dev(&vgpu->vfio_device); vfio_put_device(&vgpu->vfio_device);
}
base-commit: c72e0034e6d4c36322d958b997d11d2627c6056c
On Thu, 29 Sep 2022 14:48:35 -0300 Jason Gunthorpe jgg@nvidia.com wrote:
When converting to directly create the vfio_device the mdev driver has to put a vfio_register_emulated_iommu_dev() in the probe() and a pairing vfio_unregister_group_dev() in the remove.
This was missed for gvt, add it.
Cc: stable@vger.kernel.org Fixes: 978cf586ac35 ("drm/i915/gvt: convert to use vfio_register_emulated_iommu_dev") Reported-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com
drivers/gpu/drm/i915/gvt/kvmgt.c | 1 + 1 file changed, 1 insertion(+)
Should go through Alex's tree.
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index 41bba40feef8f4..9003145adb5a93 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -1615,6 +1615,7 @@ static void intel_vgpu_remove(struct mdev_device *mdev) if (WARN_ON_ONCE(vgpu->attached)) return;
- vfio_unregister_group_dev(&vgpu->vfio_device); vfio_put_device(&vgpu->vfio_device);
}
base-commit: c72e0034e6d4c36322d958b997d11d2627c6056c
This is marked for stable, but I think the stable backport for existing kernels is actually:
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index e3cd58946477..de89946c4817 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -1595,6 +1595,9 @@ static void intel_vgpu_remove(struct mdev_device *mdev)
if (WARN_ON_ONCE(vgpu->attached)) return; + + vfio_unregister_group_dev(&vgpu->vfio_device); + vfio_uninit_group_dev(&vgpu->vfio_device); intel_gvt_destroy_vgpu(vgpu); }
On Wed, 5 Oct 2022 14:17:17 -0600 Alex Williamson alex.williamson@redhat.com wrote:
On Thu, 29 Sep 2022 14:48:35 -0300 Jason Gunthorpe jgg@nvidia.com wrote:
When converting to directly create the vfio_device the mdev driver has to put a vfio_register_emulated_iommu_dev() in the probe() and a pairing vfio_unregister_group_dev() in the remove.
This was missed for gvt, add it.
Cc: stable@vger.kernel.org Fixes: 978cf586ac35 ("drm/i915/gvt: convert to use vfio_register_emulated_iommu_dev") Reported-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com
drivers/gpu/drm/i915/gvt/kvmgt.c | 1 + 1 file changed, 1 insertion(+)
Should go through Alex's tree.
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index 41bba40feef8f4..9003145adb5a93 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -1615,6 +1615,7 @@ static void intel_vgpu_remove(struct mdev_device *mdev) if (WARN_ON_ONCE(vgpu->attached)) return;
Actually, what's the purpose of this ^^^^ ?
We can't have a .remove callback that does nothing, this breaks removing the device while it's in use. Once we have the vfio_unregister_group_dev() fix below, we'll block until the device is unused, at which point vgpu->attached becomes false. Unless I'm missing something, I think we should also follow-up with a patch to remove that bogus warn-on branch, right? Thanks,
Alex
- vfio_unregister_group_dev(&vgpu->vfio_device); vfio_put_device(&vgpu->vfio_device);
}
base-commit: c72e0034e6d4c36322d958b997d11d2627c6056c
This is marked for stable, but I think the stable backport for existing kernels is actually:
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index e3cd58946477..de89946c4817 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -1595,6 +1595,9 @@ static void intel_vgpu_remove(struct mdev_device *mdev) if (WARN_ON_ONCE(vgpu->attached)) return;
- vfio_unregister_group_dev(&vgpu->vfio_device);
- vfio_uninit_group_dev(&vgpu->vfio_device); intel_gvt_destroy_vgpu(vgpu);
}
On Wed, Oct 05, 2022 at 02:17:17PM -0600, Alex Williamson wrote:
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index 41bba40feef8f4..9003145adb5a93 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -1615,6 +1615,7 @@ static void intel_vgpu_remove(struct mdev_device *mdev) if (WARN_ON_ONCE(vgpu->attached)) return;
- vfio_unregister_group_dev(&vgpu->vfio_device); vfio_put_device(&vgpu->vfio_device);
}
base-commit: c72e0034e6d4c36322d958b997d11d2627c6056c
This is marked for stable, but I think the stable backport for existing kernels is actually:
Yes probably, this patch won't apply so if anyone wants to see it in the stable series they need to follow the process to send the reworked version.
Jason
On Wed, Oct 05, 2022 at 04:03:56PM -0600, Alex Williamson wrote:
We can't have a .remove callback that does nothing, this breaks removing the device while it's in use. Once we have the vfio_unregister_group_dev() fix below, we'll block until the device is unused, at which point vgpu->attached becomes false. Unless I'm missing something, I think we should also follow-up with a patch to remove that bogus warn-on branch, right? Thanks,
Yes, looks right to me.
I question all the logical arround attached, where is the locking?
Jason
On Thu, 6 Oct 2022 08:37:09 -0300 Jason Gunthorpe jgg@nvidia.com wrote:
On Wed, Oct 05, 2022 at 04:03:56PM -0600, Alex Williamson wrote:
We can't have a .remove callback that does nothing, this breaks removing the device while it's in use. Once we have the vfio_unregister_group_dev() fix below, we'll block until the device is unused, at which point vgpu->attached becomes false. Unless I'm missing something, I think we should also follow-up with a patch to remove that bogus warn-on branch, right? Thanks,
Yes, looks right to me.
I question all the logical arround attached, where is the locking?
Zhenyu, Zhi, Kevin,
Could someone please take a look at use of vgpu->attached in the GVT-g driver? It's use in intel_vgpu_remove() is bogus, the .release callback needs to use vfio_unregister_group_dev() to wait for the device to be unused. The WARN_ON/return here breaks all future use of the device. I assume @attached has something to do with the page table interface with KVM, but it all looks racy anyway.
Also, whatever purpose vgpu->released served looks unnecessary now. Thanks,
Alex
From: Alex Williamson alex.williamson@redhat.com Sent: Friday, October 7, 2022 2:31 AM
On Thu, 6 Oct 2022 08:37:09 -0300 Jason Gunthorpe jgg@nvidia.com wrote:
On Wed, Oct 05, 2022 at 04:03:56PM -0600, Alex Williamson wrote:
We can't have a .remove callback that does nothing, this breaks removing the device while it's in use. Once we have the vfio_unregister_group_dev() fix below, we'll block until the device is unused, at which point vgpu->attached becomes false. Unless I'm missing something, I think we should also follow-up with a patch to remove that bogus warn-on branch, right? Thanks,
Yes, looks right to me.
I question all the logical arround attached, where is the locking?
Zhenyu, Zhi, Kevin,
Could someone please take a look at use of vgpu->attached in the GVT-g driver? It's use in intel_vgpu_remove() is bogus, the .release callback needs to use vfio_unregister_group_dev() to wait for the device to be unused. The WARN_ON/return here breaks all future use of the device. I assume @attached has something to do with the page table interface with KVM, but it all looks racy anyway.
Also, whatever purpose vgpu->released served looks unnecessary now. Thanks,
Zhi is looking at it.
On 10/6/22 18:31, Alex Williamson wrote:
On Thu, 6 Oct 2022 08:37:09 -0300 Jason Gunthorpe jgg@nvidia.com wrote:
On Wed, Oct 05, 2022 at 04:03:56PM -0600, Alex Williamson wrote:
We can't have a .remove callback that does nothing, this breaks removing the device while it's in use. Once we have the vfio_unregister_group_dev() fix below, we'll block until the device is unused, at which point vgpu->attached becomes false. Unless I'm missing something, I think we should also follow-up with a patch to remove that bogus warn-on branch, right? Thanks,
Yes, looks right to me.
I question all the logical arround attached, where is the locking?
Zhenyu, Zhi, Kevin,
Could someone please take a look at use of vgpu->attached in the GVT-g driver? It's use in intel_vgpu_remove() is bogus, the .release callback needs to use vfio_unregister_group_dev() to wait for the device to be unused. The WARN_ON/return here breaks all future use of the device. I assume @attached has something to do with the page table interface with KVM, but it all looks racy anyway.
Thanks for pointing this out.
It was introduced in the GVT-g refactor patch series and Christoph might not want to touch the vgpu->released while he needed a new state.
I dig it a bit. vgpu->attached would be used for preventing multiple open on a single vGPU and indicate the kvm_get_kvm() has been done. vgpu->released was to prevent the release before close, which is now handled by the vfio_device_*.
What I would like to do are: 1) Remove the vgpu->released. 2) Use alock to protect vgpu->attached.
After those were solved, the WARN_ON/return in the intel_vgpu_remove() should be safely removed as the .release will be called after .close_device deceases the vfio_device->refcnt to zero.
Thanks, Zhi.
Also, whatever purpose vgpu->released served looks unnecessary now. Thanks,
Alex
From: Wang, Zhi A zhi.a.wang@intel.com Sent: Wednesday, October 19, 2022 5:41 PM
On 10/6/22 18:31, Alex Williamson wrote:
On Thu, 6 Oct 2022 08:37:09 -0300 Jason Gunthorpe jgg@nvidia.com wrote:
On Wed, Oct 05, 2022 at 04:03:56PM -0600, Alex Williamson wrote:
We can't have a .remove callback that does nothing, this breaks removing the device while it's in use. Once we have the vfio_unregister_group_dev() fix below, we'll block until the device is unused, at which point vgpu->attached becomes false. Unless I'm missing something, I think we should also follow-up with a patch to remove that bogus warn-on branch, right? Thanks,
Yes, looks right to me.
I question all the logical arround attached, where is the locking?
Zhenyu, Zhi, Kevin,
Could someone please take a look at use of vgpu->attached in the GVT-g driver? It's use in intel_vgpu_remove() is bogus, the .release callback needs to use vfio_unregister_group_dev() to wait for the device to be unused. The WARN_ON/return here breaks all future use of the device. I assume @attached has something to do with the page table interface with KVM, but it all looks racy anyway.
Thanks for pointing this out.
It was introduced in the GVT-g refactor patch series and Christoph might not want to touch the vgpu->released while he needed a new state.
I dig it a bit. vgpu->attached would be used for preventing multiple open on a single vGPU and indicate the kvm_get_kvm() has been done.
vfio core already ensures that .open_device() is called only once:
vfio_device_open() { ... mutex_lock(&device->dev_set->lock); device->open_count++; if (device->open_count == 1) { ... if (device->ops->open_device) { ret = device->ops->open_device(device); ... }
vgpu->released was to prevent the release before close, which is now handled by the vfio_device_*.
What I would like to do are:
- Remove the vgpu->released. 2) Use alock to protect vgpu->attached.
After those were solved, the WARN_ON/return in the intel_vgpu_remove() should be safely removed as the .release will be called after .close_device deceases the vfio_device->refcnt to zero.
Thanks, Zhi.
Also, whatever purpose vgpu->released served looks unnecessary now. Thanks,
Alex
linux-stable-mirror@lists.linaro.org