In a simple test case that writes to scratch and then busy-waits for the batch to be signaled, we observe that the signal is before the write is posted. That is bad news.
Splitting the flush + write_dword into two separate flush_dw prevents the issue from being reproduced, we can presume the post-sync op is not so post-sync.
Testcase: igt/gem_exec_fence/parallel Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Mika Kuoppala mika.kuoppala@linux.intel.com Cc: stable@vger.kernel.org --- drivers/gpu/drm/i915/gt/intel_lrc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index d0be98b67138..a437140a987d 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -5047,7 +5047,8 @@ gen12_emit_fini_breadcrumb_tail(struct i915_request *request, u32 *cs)
static u32 *gen12_emit_fini_breadcrumb(struct i915_request *rq, u32 *cs) { - return gen12_emit_fini_breadcrumb_tail(rq, emit_xcs_breadcrumb(rq, cs)); + cs = emit_xcs_breadcrumb(rq, __gen8_emit_flush_dw(cs, 0, 0, 0)); + return gen12_emit_fini_breadcrumb_tail(rq, cs); }
static u32 *
Chris Wilson chris@chris-wilson.co.uk writes:
In a simple test case that writes to scratch and then busy-waits for the batch to be signaled, we observe that the signal is before the write is posted. That is bad news.
Splitting the flush + write_dword into two separate flush_dw prevents the issue from being reproduced, we can presume the post-sync op is not so post-sync.
Only thing that is mildly surpricing is that first one doesnt need postop write.
Testcase: igt/gem_exec_fence/parallel Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Mika Kuoppala mika.kuoppala@linux.intel.com Cc: stable@vger.kernel.org
Acked-by: Mika Kuoppala mika.kuoppala@linux.intel.com
drivers/gpu/drm/i915/gt/intel_lrc.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c index d0be98b67138..a437140a987d 100644 --- a/drivers/gpu/drm/i915/gt/intel_lrc.c +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c @@ -5047,7 +5047,8 @@ gen12_emit_fini_breadcrumb_tail(struct i915_request *request, u32 *cs) static u32 *gen12_emit_fini_breadcrumb(struct i915_request *rq, u32 *cs) {
- return gen12_emit_fini_breadcrumb_tail(rq, emit_xcs_breadcrumb(rq, cs));
- cs = emit_xcs_breadcrumb(rq, __gen8_emit_flush_dw(cs, 0, 0, 0));
- return gen12_emit_fini_breadcrumb_tail(rq, cs);
} static u32 * -- 2.20.1
Quoting Mika Kuoppala (2020-11-03 12:44:53)
Chris Wilson chris@chris-wilson.co.uk writes:
In a simple test case that writes to scratch and then busy-waits for the batch to be signaled, we observe that the signal is before the write is posted. That is bad news.
Splitting the flush + write_dword into two separate flush_dw prevents the issue from being reproduced, we can presume the post-sync op is not so post-sync.
Only thing that is mildly surpricing is that first one doesnt need postop write.
The MI_FLUSH_DW is stalling, so the second will^W should wait for the first to complete. (And we don't want to do the write from the first as we observe that write is too early.) -Chris
linux-stable-mirror@lists.linaro.org