[+cc Bartosz]
On Fri, Feb 02, 2024 at 05:25:56PM -0500, Hamza Mahfooz wrote:
Removing an amdgpu device that still has user space references allocated to it causes undefined behaviour. So, implement amdgpu_pci_can_remove() and disallow devices that still have files allocated to them from being unbound.
Maybe this would help for things that are completely built-in or soldered down, but nothing can prevent a user from physically pulling a card or cable, so I don't think this is a generic solution to the problem of dangling user space references.
Maybe Bartosz's recent LPC talk is relevant: https://lpc.events/event/17/contributions/1627/
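To make the concern concrete, here is a minimal userspace sketch of the two removal paths. It is a toy model, not kernel code: every toy_* name is hypothetical, and open_files merely stands in for a non-empty dev->filelist. An orderly unbind can honor the veto, but a surprise removal never consults it, so the open files are left referring to a device that is no longer there.

```c
/* Userspace toy model, not kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct toy_dev {
	int open_files;	/* stands in for a non-empty dev->filelist */
	bool present;	/* false once the hardware is gone */
};

static void toy_open(struct toy_dev *dev)
{
	dev->open_files++;
}

static void toy_close(struct toy_dev *dev)
{
	assert(dev->open_files > 0);
	dev->open_files--;
}

/* Mirrors the proposed can_remove() check: veto while files are open. */
static bool toy_can_remove(const struct toy_dev *dev)
{
	return dev->open_files == 0;
}

/* Orderly unbind (e.g. requested by the user): the veto can be honored. */
static bool toy_unbind(struct toy_dev *dev)
{
	if (!toy_can_remove(dev))
		return false;
	dev->present = false;
	return true;
}

/* Surprise removal: the hardware disappears regardless of any veto. */
static void toy_surprise_remove(struct toy_dev *dev)
{
	dev->present = false;
}
```

With one file open, toy_unbind() returns false and the device stays present; toy_surprise_remove() then clears present while open_files is still 1, which is the dangling-reference case the veto cannot cover.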
Cc: stable@vger.kernel.org
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index cc69005f5b46..cfa64f3c5be5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2323,6 +2323,22 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
 	return ret;
 }
 
+static bool amdgpu_pci_can_remove(struct pci_dev *pdev)
+{
+	struct drm_device *dev = pci_get_drvdata(pdev);
+
+	mutex_lock(&dev->filelist_mutex);
+
+	if (!list_empty(&dev->filelist)) {
+		mutex_unlock(&dev->filelist_mutex);
+		return false;
+	}
+
+	mutex_unlock(&dev->filelist_mutex);
+
+	return true;
+}
+
 static void
 amdgpu_pci_remove(struct pci_dev *pdev)
 {
@@ -2929,6 +2945,7 @@ static struct pci_driver amdgpu_kms_pci_driver = {
 	.name = DRIVER_NAME,
 	.id_table = pciidlist,
 	.probe = amdgpu_pci_probe,
+	.can_remove = amdgpu_pci_can_remove,
 	.remove = amdgpu_pci_remove,
 	.shutdown = amdgpu_pci_shutdown,
 	.driver.pm = &amdgpu_pm_ops,
-- 
2.43.0