[PATCH v5 3/3] drm/xe/guc/tlb: Flush g2h worker in case of tlb timeout

29 Oct 2024

Flush the g2h worker explicitly when a TLB invalidation timeout occurs.
Such timeouts are observed on LNL and point to the recent scheduling
issue with E-cores on that platform.
This is similar to the recent fix:
commit e51527233804 ("drm/xe/guc/ct: Flush g2h worker in case of g2h
response timeout") and should be removed once there is an E-core
scheduling fix.
v2: Add platform check (Himal)
v3: Remove gfx platform check as the issue is related to the CPU
    platform (John)
    Use the common WA macro (John) and print when the flush
    resolves the timeout (Matt B)
v4: Remove the "resolves" log and do the flush before taking
    pending_lock (Matt A)
Cc: Badal Nilawar <badal.nilawar@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: <stable@vger.kernel.org> # v6.11+
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2687
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
---
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
index 773de1f08db9..3cb228c773cd 100644
--- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
+++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
@@ -72,6 +72,8 @@ static void xe_gt_tlb_fence_timeout(struct work_struct *work)
    struct xe_device *xe = gt_to_xe(gt);
    struct xe_gt_tlb_invalidation_fence *fence, *next;
+	LNL_FLUSH_WORK(&gt->uc.guc.ct.g2h_worker);
+
    spin_lock_irq(&gt->tlb_invalidation.pending_lock);
    list_for_each_entry_safe(fence, next,
    			 &gt->tlb_invalidation.pending_fences, link) {
-- 
2.46.0



    
