6.18-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Maslak jan.maslak@intel.com
[ Upstream commit eed5b815fa49c17d513202f54e980eb91955d3ed ]
During GT reset recovery in do_gt_restart(), xe_uc_start() was called before xe_reg_sr_apply_mmio() restored engine-specific registers. This created a race window where the scheduler could run jobs before hardware state was fully restored.
This caused failures in eudebug tests (xe_exec_sip_eudebug@breakpoint- waitsip-*) where TD_CTL register (containing TD_CTL_GLOBAL_DEBUG_ENABLE) wasn't restored before jobs started executing. Breakpoints would fail to trigger SIP entry because the debug enable bit wasn't set yet.
Fix by moving xe_uc_start() after all MMIO register restoration, including engine registers and CCS mode configuration, ensuring all hardware state is fully restored before any jobs can be scheduled.
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Jan Maslak jan.maslak@intel.com Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com Signed-off-by: Matthew Brost matthew.brost@intel.com Link: https://patch.msgid.link/20251210145618.169625-2-jan.maslak@intel.com (cherry picked from commit 825aed0328588b2837636c1c5a0c48795d724617) Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/xe/xe_gt.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c index 6d3db5e55d98..61bed3b04ded 100644 --- a/drivers/gpu/drm/xe/xe_gt.c +++ b/drivers/gpu/drm/xe/xe_gt.c @@ -784,9 +784,6 @@ static int do_gt_restart(struct xe_gt *gt) xe_gt_sriov_pf_init_hw(gt);
xe_mocs_init(gt); - err = xe_uc_start(>->uc); - if (err) - return err;
for_each_hw_engine(hwe, gt, id) xe_reg_sr_apply_mmio(&hwe->reg_sr, gt); @@ -794,6 +791,10 @@ static int do_gt_restart(struct xe_gt *gt) /* Get CCS mode in sync between sw/hw */ xe_gt_apply_ccs_mode(gt);
+ err = xe_uc_start(>->uc); + if (err) + return err; + /* Restore GT freq to expected values */ xe_gt_sanitize_freq(gt);