An unfortunate sequence of events, but it turns out there is a valid usecase for being able to free/decouple the driver objects before they are freed by the DRM core. In particular, if we have a pointer into a drm core object from inside a driver object, that pointer needs to be nerfed *before* it is freed so that concurrent access (e.g. debugfs) does not following the dangling pointer.
The legacy marker was adding in the code movement from drp_fops.c to drm_file.c
References: 9acdac68bcdc ("drm: rename drm_fops.c to drm_file.c") Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Daniel Vetter daniel.vetter@intel.com Cc: Gustavo Padovan gustavo.padovan@collabora.com Cc: CQ Tang cq.tang@intel.com Cc: stable@vger.kernel.org # v4.12+ --- drivers/gpu/drm/drm_file.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c index 0ac4566ae3f4..7b4258d6f7cc 100644 --- a/drivers/gpu/drm/drm_file.c +++ b/drivers/gpu/drm/drm_file.c @@ -258,8 +258,7 @@ void drm_file_free(struct drm_file *file) (long)old_encode_dev(file->minor->kdev->devt), atomic_read(&dev->open_count));
- if (drm_core_check_feature(dev, DRIVER_LEGACY) && - dev->driver->preclose) + if (dev->driver->preclose) dev->driver->preclose(dev, file);
if (drm_core_check_feature(dev, DRIVER_LEGACY))
Since the GEM contexts refer to other GEM state, we need to nerf those pointers before that state is freed during drm_gem_release(). We need to move i915_gem_context_close() from the postclose callback to the preclose.
In particular, debugfs likes to peek into the GEM contexts, and from there peek at the drm core objects. If the context is closed during the peeking, we may attempt to dereference a stale core object.
Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: CQ Tang cq.tang@intel.com Cc: Daniel Vetter daniel.vetter@intel.com Cc: stable@vger.kernel.org --- drivers/gpu/drm/i915/i915_drv.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 5fd5af4bc855..15242a8c70f7 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1114,11 +1114,15 @@ static void i915_driver_lastclose(struct drm_device *dev) vga_switcheroo_process_delayed_switch(); }
+static void i915_driver_preclose(struct drm_device *dev, struct drm_file *file) +{ + i915_gem_context_close(file); +} + static void i915_driver_postclose(struct drm_device *dev, struct drm_file *file) { struct drm_i915_file_private *file_priv = file->driver_priv;
- i915_gem_context_close(file); i915_gem_release(dev, file);
kfree_rcu(file_priv, rcu); @@ -1850,6 +1854,7 @@ static struct drm_driver driver = { .release = i915_driver_release, .open = i915_driver_open, .lastclose = i915_driver_lastclose, + .preclose = i915_driver_preclose, .postclose = i915_driver_postclose,
.gem_close_object = i915_gem_close_object,
-----Original Message----- From: Chris Wilson chris@chris-wilson.co.uk Sent: Thursday, July 23, 2020 10:21 AM To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org; Chris Wilson chris@chris-wilson.co.uk; Tang, CQ cq.tang@intel.com; Vetter, Daniel daniel.vetter@intel.com; stable@vger.kernel.org Subject: [PATCH 2/3] drm/i915/gem: Move context decoupling from postclose to preclose
Since the GEM contexts refer to other GEM state, we need to nerf those pointers before that state is freed during drm_gem_release(). We need to move i915_gem_context_close() from the postclose callback to the preclose.
In particular, debugfs likes to peek into the GEM contexts, and from there peek at the drm core objects. If the context is closed during the peeking, we may attempt to dereference a stale core object.
Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: CQ Tang cq.tang@intel.com Cc: Daniel Vetter daniel.vetter@intel.com Cc: stable@vger.kernel.org
drivers/gpu/drm/i915/i915_drv.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 5fd5af4bc855..15242a8c70f7 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1114,11 +1114,15 @@ static void i915_driver_lastclose(struct drm_device *dev) vga_switcheroo_process_delayed_switch(); }
+static void i915_driver_preclose(struct drm_device *dev, struct +drm_file *file) {
- i915_gem_context_close(file);
+}
static void i915_driver_postclose(struct drm_device *dev, struct drm_file *file) { struct drm_i915_file_private *file_priv = file->driver_priv;
- i915_gem_context_close(file); i915_gem_release(dev, file);
Now we separate i915_gem_context_close() from i915_gem_release() and other freeing code in postclose(), is there any side effect to allow code to run in between? Can we move all postclose() code into preclose()?
--CQ
kfree_rcu(file_priv, rcu); @@ -1850,6 +1854,7 @@ static struct drm_driver driver = { .release = i915_driver_release, .open = i915_driver_open, .lastclose = i915_driver_lastclose,
.preclose = i915_driver_preclose, .postclose = i915_driver_postclose,
.gem_close_object = i915_gem_close_object,
-- 2.20.1
Since the debugfs may peek into the GEM contexts as the corresponding client/fd is being closed, we may try and follow a dangling pointer. However, the context closure itself is serialised with the ctx->mutex, so if we hold that mutex as we inspect the state coupled in the context, we know the pointers within the context are stable and will remain valid as we inspect their tables.
Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: CQ Tang cq.tang@intel.com Cc: Daniel Vetter daniel.vetter@intel.com Cc: stable@vger.kernel.org --- drivers/gpu/drm/i915/i915_debugfs.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 784219962193..ea469168cd44 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -326,6 +326,7 @@ static void print_context_stats(struct seq_file *m, } i915_gem_context_unlock_engines(ctx);
+ mutex_lock(&ctx->mutex); if (!IS_ERR_OR_NULL(ctx->file_priv)) { struct file_stats stats = { .vm = rcu_access_pointer(ctx->vm), @@ -346,6 +347,7 @@ static void print_context_stats(struct seq_file *m,
print_file_stats(m, name, stats); } + mutex_unlock(&ctx->mutex);
spin_lock(&i915->gem.contexts.lock); list_safe_reset_next(ctx, cn, link);
Hi
[This is an automated email]
This commit has been processed because it contains a -stable tag. The stable tag indicates that it's relevant for the following trees: all
The bot has tested the following trees: v5.7.10, v5.4.53, v4.19.134, v4.14.189, v4.9.231, v4.4.231.
v5.7.10: Build OK! v5.4.53: Failed to apply! Possible dependencies: 061489c65ff5 ("drm/i915/dsb: single register write function for DSB.") 11988e393813 ("drm/i915/execlists: Try rearranging breadcrumb flush") 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex") 5a90606df7cb ("drm/i915: Replace obj->pin_global with obj->frontbuffer") 67f3b58f3bac ("drm/i915/dsb: DSB context creation.") 8a9a982767b7 ("drm/i915: use a separate context for gpu relocs") a4e7ccdac38e ("drm/i915: Move context management under GEM") b27a96ad72fd ("drm/i915/dsb: Indexed register write function for DSB.") bb120e1171a9 ("drm/i915: Show the logical context ring state on dumping") c210e85b8f33 ("drm/i915/tgl: Extend MI_SEMAPHORE_WAIT") d19d71fc2b15 ("drm/i915: Mark i915_request.timeline as a volatile, rcu pointer") e8f6b4952ec5 ("drm/i915/execlists: Flush the post-sync breadcrumb write harder")
v4.19.134: Failed to apply! Possible dependencies: 0258404f9d38 ("drm/i915: start moving runtime device info to a separate struct") 026844460743 ("drm/i915: Remove intel_context.active_link") 07d805721938 ("drm/i915: Introduce intel_runtime_pm_disable to pair intel_runtime_pm_enable") 13f1bfd3b332 ("drm/i915: Make object/vma allocation caches global") 1c71bc565cdb ("drm/i915/perf: simplify configure all context function") 2cc8376fd350 ("drm/i915: rename dev_priv info to __info to avoid usage") 2cd9a689e97b ("drm/i915: Refactor intel_display_set_init_power() logic") 37d7c9cc2eb6 ("drm/i915: Check engine->default_state mapping on module load") 55ac5a1614f9 ("drm/i915: Attach the pci match data to the device upon creation") 666424abfb86 ("drm/i915/execlists: Use coherent writes into the context image") 6dfc4a8f134f ("drm/i915: Verify power domains after enabling them") 722f3de39e03 ("i915/oa: Simplify updating contexts") 900ccf30f9e1 ("drm/i915: Only force GGTT coherency w/a on required chipsets") c4d52feb2c46 ("drm/i915: Move over to intel_context_lookup()") f6e8aa387171 ("drm/i915: Report the number of closed vma held by each context in debugfs") fa9f668141f4 ("drm/i915: Export intel_context_instance()")
v4.14.189: Failed to apply! Possible dependencies: 3bd4073524fa ("drm/i915: Consolidate get_fence with pin_fence") 465c403cb508 ("drm/i915: introduce simple gemfs") 66df1014efba ("drm/i915: Keep a small stash of preallocated WC pages") 67b48040255b ("drm/i915: Assert that the handle->vma lut is empty on object close") 73ebd503034c ("drm/i915: make mappable struct resource centric") 7789422665f5 ("drm/i915: make dsm struct resource centric") 82ad6443a55e ("drm/i915/gtt: Rename i915_hw_ppgtt base member") 969b0950a188 ("drm/i915: Add interface to reserve fence registers for vGPU") a65adaf8a834 ("drm/i915: Track user GTT faulting per-vma") b4563f595ed4 ("drm/i915: Pin fence for iomap") e91ef99b9543 ("drm/i915/selftests: Remember to create the fake preempt context") f6e8aa387171 ("drm/i915: Report the number of closed vma held by each context in debugfs") f773568b6ff8 ("drm/i915: nuke the duplicated stolen discovery")
v4.9.231: Failed to apply! Possible dependencies: 0e70447605f4 ("drm/i915: Move common code out of i915_gpu_error.c") 1b36595ffb35 ("drm/i915: Show RING registers through debugfs") 28a60dee2ce6 ("drm/i915/gvt: vGPU HW resource management") 3b3f1650b1ca ("drm/i915: Allocate intel_engine_cs structure only for the enabled engines") 82ad6443a55e ("drm/i915/gtt: Rename i915_hw_ppgtt base member") 85fd4f58d7ef ("drm/i915: Mark all non-vma being inserted into the address spaces") 9c870d03674f ("drm/i915: Use RPM as the barrier for controlling user mmap access") bb6dc8d96b68 ("drm/i915: Implement pread without struct-mutex") d636951ec01b ("drm/i915: Cleanup instdone collection") e007b19d7ba7 ("drm/i915: Use the MRU stack search after evicting") f6e8aa387171 ("drm/i915: Report the number of closed vma held by each context in debugfs") f9e613728090 ("drm/i915: Try to print INSTDONE bits for all slice/subslice")
v4.4.231: Failed to apply! Possible dependencies: 1b683729e7ac ("drm/i915: Remove redundant check in i915_gem_obj_to_vma") 1c7f4bca5a6f ("drm/i915: Rename vma->*_list to *_link for consistency") 3272db53136f ("drm/i915: Combine all i915_vma bitfields into a single set of flags") 596c5923197b ("drm/i915: Reduce the pointer dance of i915_is_ggtt()") c1a415e261aa ("drm/i915: Disable shrinker for non-swapped backed objects") d0710abbcd88 ("drm/i915: Set the map-and-fenceable flag for preallocated objects") f6e8aa387171 ("drm/i915: Report the number of closed vma held by each context in debugfs")
NOTE: The patch will not be queued to stable trees until it is upstream.
How should we proceed with this patch?
On 23/07/2020 18:21, Chris Wilson wrote:
Since the debugfs may peek into the GEM contexts as the corresponding client/fd is being closed, we may try and follow a dangling pointer. However, the context closure itself is serialised with the ctx->mutex, so if we hold that mutex as we inspect the state coupled in the context, we know the pointers within the context are stable and will remain valid as we inspect their tables.
Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: CQ Tang cq.tang@intel.com Cc: Daniel Vetter daniel.vetter@intel.com Cc: stable@vger.kernel.org
drivers/gpu/drm/i915/i915_debugfs.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 784219962193..ea469168cd44 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -326,6 +326,7 @@ static void print_context_stats(struct seq_file *m, } i915_gem_context_unlock_engines(ctx);
if (!IS_ERR_OR_NULL(ctx->file_priv)) { struct file_stats stats = { .vm = rcu_access_pointer(ctx->vm),mutex_lock(&ctx->mutex);
@@ -346,6 +347,7 @@ static void print_context_stats(struct seq_file *m, print_file_stats(m, name, stats); }
mutex_unlock(&ctx->mutex);
spin_lock(&i915->gem.contexts.lock); list_safe_reset_next(ctx, cn, link);
Hm this apparently never got it's r-b and so got re-discovered in the field. +Nikunj
Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
Regards,
Tvrtko
On Mon, Sep 14, 2020 at 05:45:09PM +0100, Tvrtko Ursulin wrote:
On 23/07/2020 18:21, Chris Wilson wrote:
Since the debugfs may peek into the GEM contexts as the corresponding client/fd is being closed, we may try and follow a dangling pointer. However, the context closure itself is serialised with the ctx->mutex, so if we hold that mutex as we inspect the state coupled in the context, we know the pointers within the context are stable and will remain valid as we inspect their tables.
Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: CQ Tang cq.tang@intel.com Cc: Daniel Vetter daniel.vetter@intel.com Cc: stable@vger.kernel.org
drivers/gpu/drm/i915/i915_debugfs.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 784219962193..ea469168cd44 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -326,6 +326,7 @@ static void print_context_stats(struct seq_file *m, } i915_gem_context_unlock_engines(ctx);
if (!IS_ERR_OR_NULL(ctx->file_priv)) { struct file_stats stats = { .vm = rcu_access_pointer(ctx->vm),mutex_lock(&ctx->mutex);
@@ -346,6 +347,7 @@ static void print_context_stats(struct seq_file *m, print_file_stats(m, name, stats); }
spin_lock(&i915->gem.contexts.lock); list_safe_reset_next(ctx, cn, link);mutex_unlock(&ctx->mutex);
Hm this apparently never got it's r-b and so got re-discovered in the field. +Nikunj
Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
I'm not super thrilled about patch 1 in this, for debugfs imo better to wrangle this in the driver. And without patch 1 and 2 this wont fix a whole lot. -Daniel
Regards,
Tvrtko _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
On 16/09/2020 08:42, Daniel Vetter wrote:
On Mon, Sep 14, 2020 at 05:45:09PM +0100, Tvrtko Ursulin wrote:
On 23/07/2020 18:21, Chris Wilson wrote:
Since the debugfs may peek into the GEM contexts as the corresponding client/fd is being closed, we may try and follow a dangling pointer. However, the context closure itself is serialised with the ctx->mutex, so if we hold that mutex as we inspect the state coupled in the context, we know the pointers within the context are stable and will remain valid as we inspect their tables.
Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: CQ Tang cq.tang@intel.com Cc: Daniel Vetter daniel.vetter@intel.com Cc: stable@vger.kernel.org
drivers/gpu/drm/i915/i915_debugfs.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c index 784219962193..ea469168cd44 100644 --- a/drivers/gpu/drm/i915/i915_debugfs.c +++ b/drivers/gpu/drm/i915/i915_debugfs.c @@ -326,6 +326,7 @@ static void print_context_stats(struct seq_file *m, } i915_gem_context_unlock_engines(ctx);
mutex_lock(&ctx->mutex); if (!IS_ERR_OR_NULL(ctx->file_priv)) { struct file_stats stats = { .vm = rcu_access_pointer(ctx->vm),
@@ -346,6 +347,7 @@ static void print_context_stats(struct seq_file *m, print_file_stats(m, name, stats); }
mutex_unlock(&ctx->mutex); spin_lock(&i915->gem.contexts.lock); list_safe_reset_next(ctx, cn, link);
Hm this apparently never got it's r-b and so got re-discovered in the field. +Nikunj
Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com
I'm not super thrilled about patch 1 in this, for debugfs imo better to wrangle this in the driver. And without patch 1 and 2 this wont fix a whole lot.
I try to avoid spending too much time coming up with smart solutions for _debugfs_. So I was going by the fact it obviously fixes something so it is an improvement.
But your proposal to swith iteration to files->contexts also seems would work. It would be slightly semantically different where it wouldn't show the contexts which are active on the GPU but clients have exited, but its debugfs so no one should care.
Regards,
Tvrtko
On Thu, Jul 23, 2020 at 7:21 PM Chris Wilson chris@chris-wilson.co.uk wrote:
An unfortunate sequence of events, but it turns out there is a valid usecase for being able to free/decouple the driver objects before they are freed by the DRM core. In particular, if we have a pointer into a drm core object from inside a driver object, that pointer needs to be nerfed *before* it is freed so that concurrent access (e.g. debugfs) does not following the dangling pointer.
The legacy marker was adding in the code movement from drp_fops.c to drm_file.c
I might fumble a lot, but not this one:
commit 45c3d213a400c952ab7119f394c5293bb6877e6b Author: Daniel Vetter daniel.vetter@ffwll.ch Date: Mon May 8 10:26:33 2017 +0200
drm: Nerf the preclose callback for modern drivers
Also looking at the debugfs hook that has some rather adventurous stuff going on I think, feels a bit like a kitchensink with batteries included. If that's really all needed I'd say iterate the contexts by first going over files, then the ctx (which arent shared anyway) and the problem should also be gone. -Daniel
References: 9acdac68bcdc ("drm: rename drm_fops.c to drm_file.c") Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Daniel Vetter daniel.vetter@intel.com Cc: Gustavo Padovan gustavo.padovan@collabora.com Cc: CQ Tang cq.tang@intel.com Cc: stable@vger.kernel.org # v4.12+
drivers/gpu/drm/drm_file.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c index 0ac4566ae3f4..7b4258d6f7cc 100644 --- a/drivers/gpu/drm/drm_file.c +++ b/drivers/gpu/drm/drm_file.c @@ -258,8 +258,7 @@ void drm_file_free(struct drm_file *file) (long)old_encode_dev(file->minor->kdev->devt), atomic_read(&dev->open_count));
if (drm_core_check_feature(dev, DRIVER_LEGACY) &&
dev->driver->preclose)
if (dev->driver->preclose) dev->driver->preclose(dev, file); if (drm_core_check_feature(dev, DRIVER_LEGACY))
-- 2.20.1
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
-----Original Message----- From: Daniel Vetter daniel@ffwll.ch Sent: Monday, July 27, 2020 12:33 PM To: Chris Wilson chris@chris-wilson.co.uk; Dave Airlie airlied@redhat.com Cc: intel-gfx intel-gfx@lists.freedesktop.org; stable stable@vger.kernel.org; Gustavo Padovan gustavo.padovan@collabora.com; Tang, CQ cq.tang@intel.com; dri- devel dri-devel@lists.freedesktop.org; Vetter, Daniel daniel.vetter@intel.com Subject: Re: [PATCH 1/3] drm: Restore driver.preclose() for all to use
On Thu, Jul 23, 2020 at 7:21 PM Chris Wilson chris@chris-wilson.co.uk wrote:
An unfortunate sequence of events, but it turns out there is a valid usecase for being able to free/decouple the driver objects before they are freed by the DRM core. In particular, if we have a pointer into a drm core object from inside a driver object, that pointer needs to be nerfed *before* it is freed so that concurrent access (e.g. debugfs) does not following the dangling pointer.
The legacy marker was adding in the code movement from drp_fops.c to drm_file.c
I might fumble a lot, but not this one:
commit 45c3d213a400c952ab7119f394c5293bb6877e6b Author: Daniel Vetter daniel.vetter@ffwll.ch Date: Mon May 8 10:26:33 2017 +0200
drm: Nerf the preclose callback for modern drivers
Also looking at the debugfs hook that has some rather adventurous stuff going on I think, feels a bit like a kitchensink with batteries included. If that's really all needed I'd say iterate the contexts by first going over files, then the ctx (which arent shared anyway) and the problem should also be gone.
Debugfs code can jump in after drm_gem_release() (where file->object_idr is destroyed), but before postclose(). At this window, everything is fine for debugfs context accessing except the file->object_idr.
--CQ
-Daniel
References: 9acdac68bcdc ("drm: rename drm_fops.c to drm_file.c") Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Daniel Vetter daniel.vetter@intel.com Cc: Gustavo Padovan gustavo.padovan@collabora.com Cc: CQ Tang cq.tang@intel.com Cc: stable@vger.kernel.org # v4.12+
drivers/gpu/drm/drm_file.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c index 0ac4566ae3f4..7b4258d6f7cc 100644 --- a/drivers/gpu/drm/drm_file.c +++ b/drivers/gpu/drm/drm_file.c @@ -258,8 +258,7 @@ void drm_file_free(struct drm_file *file) (long)old_encode_dev(file->minor->kdev->devt), atomic_read(&dev->open_count));
if (drm_core_check_feature(dev, DRIVER_LEGACY) &&
dev->driver->preclose)
if (dev->driver->preclose) dev->driver->preclose(dev, file); if (drm_core_check_feature(dev, DRIVER_LEGACY))
-- 2.20.1
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
-- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Quoting Daniel Vetter (2020-07-27 20:32:45)
On Thu, Jul 23, 2020 at 7:21 PM Chris Wilson chris@chris-wilson.co.uk wrote:
An unfortunate sequence of events, but it turns out there is a valid usecase for being able to free/decouple the driver objects before they are freed by the DRM core. In particular, if we have a pointer into a drm core object from inside a driver object, that pointer needs to be nerfed *before* it is freed so that concurrent access (e.g. debugfs) does not following the dangling pointer.
The legacy marker was adding in the code movement from drp_fops.c to drm_file.c
I might fumble a lot, but not this one:
commit 45c3d213a400c952ab7119f394c5293bb6877e6b Author: Daniel Vetter daniel.vetter@ffwll.ch Date: Mon May 8 10:26:33 2017 +0200
drm: Nerf the preclose callback for modern drivers
Gah, when I going through the history it looked like it appeared out of nowhere.
Also looking at the debugfs hook that has some rather adventurous stuff going on I think, feels a bit like a kitchensink with batteries included. If that's really all needed I'd say iterate the contexts by first going over files, then the ctx (which arent shared anyway) and the problem should also be gone.
Or we could cut out the middlelayer and put the release under the driver control with a call to the drm_release() when the driver is ready. -Chris
-----Original Message----- From: Chris Wilson chris@chris-wilson.co.uk Sent: Tuesday, July 28, 2020 9:28 AM To: Daniel Vetter daniel@ffwll.ch; Dave Airlie airlied@redhat.com Cc: intel-gfx intel-gfx@lists.freedesktop.org; stable stable@vger.kernel.org; Gustavo Padovan gustavo.padovan@collabora.com; Tang, CQ cq.tang@intel.com; dri- devel dri-devel@lists.freedesktop.org; Vetter, Daniel daniel.vetter@intel.com Subject: Re: [PATCH 1/3] drm: Restore driver.preclose() for all to use
Quoting Daniel Vetter (2020-07-27 20:32:45)
On Thu, Jul 23, 2020 at 7:21 PM Chris Wilson chris@chris-wilson.co.uk
wrote:
An unfortunate sequence of events, but it turns out there is a valid usecase for being able to free/decouple the driver objects before they are freed by the DRM core. In particular, if we have a pointer into a drm core object from inside a driver object, that pointer needs to be nerfed *before* it is freed so that concurrent access (e.g. debugfs) does not following the dangling pointer.
The legacy marker was adding in the code movement from drp_fops.c to drm_file.c
I might fumble a lot, but not this one:
commit 45c3d213a400c952ab7119f394c5293bb6877e6b Author: Daniel Vetter daniel.vetter@ffwll.ch Date: Mon May 8 10:26:33 2017 +0200
drm: Nerf the preclose callback for modern drivers
Gah, when I going through the history it looked like it appeared out of nowhere.
Also looking at the debugfs hook that has some rather adventurous stuff going on I think, feels a bit like a kitchensink with batteries included. If that's really all needed I'd say iterate the contexts by first going over files, then the ctx (which arent shared anyway) and the problem should also be gone.
Or we could cut out the middlelayer and put the release under the driver control with a call to the drm_release() when the driver is ready.
Chiris, can explain this idea, or post a patch ?
--CQ
-Chris
Hi
[This is an automated email]
This commit has been processed because it contains a -stable tag. The stable tag indicates that it's relevant for the following trees: 4.12+
The bot has tested the following trees: v5.7.10, v5.4.53, v4.19.134, v4.14.189.
v5.7.10: Build OK! v5.4.53: Build OK! v4.19.134: Build OK! v4.14.189: Failed to apply! Possible dependencies: 112ed2d31a46 ("drm/i915: Move GraphicsTechnology files under gt/") 1572042a4ab2 ("drm: provide management functions for drm_file") 7a2c65dd32b1 ("drm: Release filp before global lock") 7e13ad896484 ("drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count") b46a33e271ed ("drm/i915/pmu: Expose a PMU interface for perf queries") c2400ec3b6d1 ("drm/i915: add Makefile magic for testing headers are self-contained") cc662126b413 ("drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET") e7af3116836f ("drm/i915: Introduce a preempt context") f0e4a0639752 ("drm/i915: Move GEM domain management to its own file")
NOTE: The patch will not be queued to stable trees until it is upstream.
How should we proceed with this patch?
linux-stable-mirror@lists.linaro.org