The patch below does not apply to the 5.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
Possible dependencies:
35aba5f51a39 ("drm/i915: Never return 0 if not all requests retired")
b97060a99b01 ("drm/i915/guc: Update intel_gt_wait_for_idle to work with GuC")
f4eb1f3fe946 ("drm/i915/guc: Ensure G2H response has space in buffer")
e0717063ccb4 ("drm/i915/guc: Defer context unpin until scheduling is disabled")
3a4cdf1982f0 ("drm/i915/guc: Implement GuC context operations for new inteface")
925dc1cf58ed ("drm/i915/guc: Implement GuC submission tasklet")
27213d79b384 ("drm/i915/guc: Add LRC descriptor context lookup array")
7518d9b67cf5 ("drm/i915/guc: Remove GuC stage descriptor, add LRC descriptor")
56bc88745e73 ("drm/i915/guc: Add new GuC interface defines and structures")
75452167a279 ("drm/i915/guc: Optimize CTB writes and reads")
b43b9950486e ("drm/i915/guc: Add stall timer to non blocking CTB send function")
1681924d8bde ("drm/i915/guc: Add non blocking CTB send function")
c26e289f1d8d ("drm/i915/guc: Increase size of CTB buffers")
572f2a5cd974 ("drm/i915/guc: Update firmware to v62.0.0")
22916bad07a5 ("drm/i915: Move submission tasklet to i915_sched_engine")
d2a31d026492 ("drm/i915: Update i915_scheduler to operate on i915_sched_engine")
71ed60112d5d ("drm/i915: Add kick_backend function to i915_sched_engine")
3f623e06cd56 ("drm/i915: Move engine->schedule to i915_sched_engine")
349a2bc5aae4 ("drm/i915: Move active tracking to i915_sched_engine")
c4fd7d8cc3ca ("drm/i915: Reset sched_engine.no_priolist immediately after dequeue")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 35aba5f51a39fb95351844ffb14ec02b8970e19f Mon Sep 17 00:00:00 2001
From: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Date: Mon, 21 Nov 2022 15:56:55 +0100
Subject: [PATCH] drm/i915: Never return 0 if not all requests retired
Users of intel_gt_retire_requests_timeout() expect 0 return value on success. However, we have no protection from passing back 0 potentially returned by a call to dma_fence_wait_timeout() when it succeeds right after its timeout has expired.
Replace 0 with -ETIME before potentially using the timeout value as return code, so -ETIME is returned if there are still some requests not retired after timeout, 0 otherwise.
v3: Use conditional expression, more compact but also better reflecting intention standing behind the change.
v2: Move the added lines down so flush_submission() is not affected.
Fixes: f33a8a51602c ("drm/i915: Merge wait_for_timelines with retire_request")
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: stable@vger.kernel.org # v5.5+
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221121145655.75141-3-janusz....
(cherry picked from commit f301a29f143760ce8d3d6b6a8436d45d3448cde6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index edb881d75630..1dfd01668c79 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -199,7 +199,7 @@ out_active: spin_lock(&timelines->lock);
 	if (remaining_timeout)
 		*remaining_timeout = timeout;
 
-	return active_count ? timeout : 0;
+	return active_count ? timeout ?: -ETIME : 0;
 }
 
 static void retire_work_handler(struct work_struct *work)
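For readers unfamiliar with the GNU C "a ?: b" shorthand used in the one-line change, the following is a minimal standalone sketch of the old and new return expressions. The function names retire_result_old() and retire_result_new() are hypothetical stand-ins invented for illustration and are not part of the i915 driver; only the two return expressions are taken from the patch. It assumes a compiler that supports the GNU conditional-with-omitted-operand extension (gcc or clang).

```c
#include <errno.h>

/*
 * Hypothetical helpers, not kernel code: each one isolates the return
 * expression from the tail of intel_gt_retire_requests_timeout().
 */

/* Pre-patch expression: a timeout of 0 leaks out as "success". */
long retire_result_old(int active_count, long timeout)
{
	return active_count ? timeout : 0;
}

/*
 * Post-patch expression. GNU C "a ?: b" evaluates to a when a is
 * non-zero and to b otherwise, so a zero timeout becomes -ETIME
 * while requests are still active.
 */
long retire_result_new(int active_count, long timeout)
{
	return active_count ? timeout ?: -ETIME : 0;
}
```

With requests still active and a timeout of 0, the old expression yields 0 (indistinguishable from success), while the new one yields -ETIME. With no active requests, both yield 0 regardless of the timeout value.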
Hi Greg,
On Wednesday, 4 January 2023 15:39:15 CET gregkh@linuxfoundation.org wrote:
> The patch below does not apply to the 5.10-stable tree.
FYI, I can see it already added to v5.10.158, commit 648b92e5760721fbf230e242950182d7e9222143. The same for other stable trees as well as my other fixes for which I received such failure reports from you today.
Thanks,
Janusz
On Wed, Jan 04, 2023 at 05:02:10PM +0100, Janusz Krzysztofik wrote:
> Hi Greg,
>
> On Wednesday, 4 January 2023 15:39:15 CET gregkh@linuxfoundation.org wrote:
> > The patch below does not apply to the 5.10-stable tree.
>
> FYI, I can see it already added to v5.10.158, commit 648b92e5760721fbf230e242950182d7e9222143. The same for other stable trees as well as my other fixes for which I received such failure reports from you today.
Then why is it coming in with a different git id into Linus's tree?
This is really really annoying, as I have been saying for years. There's no way for me to know the difference between "this didn't apply for some reason" and "this did not apply because it is already in Linus's tree and has been backported already". So I end up with loads of failures and you end up with an inbox full of junk.
And then the _real_ failures that we need backports for get lost in the noise.
There's no reason why the DRM subsystem is somehow so special that it is the only one broken this way...
{sigh}
greg k-h
linux-stable-mirror@lists.linaro.org