Re: [PATCH 5/6] drm/i915/gt: Serialize GRDOM access between multiple engine resets

4 Jul 2022

On Fri, 1 Jul 2022 08:56:53 +0100
Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote:
...
On 30/06/2022 17:01, Mauro Carvalho Chehab wrote:
...
On Thu, 30 Jun 2022 09:12:41 +0100
Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote:
...
On 30/06/2022 08:32, Mauro Carvalho Chehab wrote:
...
On Wed, 29 Jun 2022 17:02:59 +0100
Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote:
...
On 29/06/2022 16:30, Mauro Carvalho Chehab wrote:
...
On Tue, 28 Jun 2022 16:49:23 +0100
Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> wrote:
        
> .. which for me means a different patch 1, followed by patch 6 (moved
> to be patch 2) would be ideal stable material.
>
> Then we have the current patch 2 which is open/unknown (to me at least).
>
> And the rest seem like optimisations which shouldn't be tagged as fixes.
>
> Apart from patch 5 which should be cc: stable, but no fixes as agreed.
>
> Could you please double check if what I am suggesting here is feasible
> to implement and if it is just send those minimal patches out alone?
Tested: porting just those 3 patches is enough to fix the Broadwell
bug.

So, I submitted a v2 of this series with just those. They all need to
be backported to stable.

I would really like to give an even smaller fix a try. Something like
this, although not even compile-tested:

commit 4d5e94aef164772f4d85b3b4c1a46eac9a2bd680
Author: Chris Wilson <chris.p.wilson@intel.com>
Date:   Wed Jun 29 16:25:24 2022 +0100

   drm/i915/gt: Serialize TLB invalidates with GT resets
   
   Avoid trying to invalidate the TLB in the middle of performing an
   engine reset, as this may result in the reset timing out. Currently,
   the TLB invalidate is only serialised by its own mutex, forgoing the
   uncore lock, but we can take the uncore->lock as well to serialise
   the mmio access, thereby serialising with the GDRST.
   
   Tested on a NUC5i7RYB, BIOS RYBDWi35.86A.0380.2019.0517.1530 with
   i915 selftest/hangcheck.
   
   Cc: stable@vger.kernel.org
   Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store")
   Reported-by: Mauro Carvalho Chehab <mchehab@kernel.org>
   Tested-by: Mauro Carvalho Chehab <mchehab@kernel.org>
   Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org>
   Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
   Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
   Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
   Reviewed-by: Andi Shyti <andi.shyti@intel.com>
   Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
   Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c
index 8da3314bb6bf..aaadd0b02043 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt.c
@@ -952,7 +952,23 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
         mutex_lock(&gt->tlb_invalidate_lock);
         intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);

+        spin_lock_irq(&uncore->lock); /* serialise invalidate with GT reset */
+
+        for_each_engine(engine, gt, id) {
+                struct reg_and_bit rb;
+
+                rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
+                if (!i915_mmio_reg_offset(rb.reg))
+                        continue;
+
+                intel_uncore_write_fw(uncore, rb.reg, rb.bit);
+        }
+
+        spin_unlock_irq(&uncore->lock);
+
         for_each_engine(engine, gt, id) {
+                struct reg_and_bit rb;
+
                 /*
                  * HW architecture suggest typical invalidation time at 40us,
                  * with pessimistic cases up to 100us and a recommendation to
@@ -960,13 +976,11 @@ void intel_gt_invalidate_tlbs(struct intel_gt *gt)
                  */
                 const unsigned int timeout_us = 100;
                 const unsigned int timeout_ms = 4;
-                struct reg_and_bit rb;

                 rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num);
                 if (!i915_mmio_reg_offset(rb.reg))
                         continue;

-                intel_uncore_write_fw(uncore, rb.reg, rb.bit);
                 if (__intel_wait_for_register_fw(uncore,
                                                  rb.reg, rb.bit, 0,
                                                  timeout_us, timeout_ms,
...
...
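
For anyone skimming the thread, the two-phase shape of the patch quoted above (issue every invalidate write while holding the lock, then wait for completion with the lock dropped) can be sketched in user-space C. This is only an illustrative model, not i915 code: struct fake_uncore, fake_write(), fake_read() and invalidate_all() are hypothetical names, a pthread mutex stands in for the spin_lock_irq() on uncore->lock, and the "hardware" is assumed to clear the bit after two reads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_ENGINES    4
#define INVALIDATE_BIT 0x1u
#define MAX_POLLS      4u

/* Hypothetical stand-in for the uncore MMIO state; not the i915 API. */
struct fake_uncore {
	pthread_mutex_t lock;                 /* plays the role of uncore->lock */
	bool lock_held;                       /* lets fake_write() check the rule */
	bool has_reg[NUM_ENGINES];            /* false = engine has no register */
	uint32_t regs[NUM_ENGINES];           /* one invalidate register per engine */
	unsigned int busy_polls[NUM_ENGINES]; /* reads left before the bit clears */
	unsigned int writes;                  /* number of invalidates issued */
};

static void fake_uncore_init(struct fake_uncore *u)
{
	size_t i;

	pthread_mutex_init(&u->lock, NULL);
	u->lock_held = false;
	u->writes = 0;
	for (i = 0; i < NUM_ENGINES; i++) {
		u->has_reg[i] = (i != 2); /* assume engine 2 has no register */
		u->regs[i] = 0;
		u->busy_polls[i] = 0;
	}
}

/* Register write; only legal while the lock is held. */
static void fake_write(struct fake_uncore *u, size_t i, uint32_t bit)
{
	assert(u->lock_held);
	u->regs[i] = bit;
	u->busy_polls[i] = 2; /* assumed: bit clears on the second read */
	u->writes++;
}

/* Register read; simulates the hardware finishing the invalidation. */
static uint32_t fake_read(struct fake_uncore *u, size_t i)
{
	if (u->busy_polls[i] > 0 && --u->busy_polls[i] == 0)
		u->regs[i] = 0;
	return u->regs[i];
}

/* Returns the number of engines whose invalidation timed out. */
static int invalidate_all(struct fake_uncore *u)
{
	int timed_out = 0;
	size_t i;

	/* Phase 1: issue all writes under the lock, so they cannot
	 * interleave with anything else that takes the same lock. */
	pthread_mutex_lock(&u->lock);
	u->lock_held = true;
	for (i = 0; i < NUM_ENGINES; i++) {
		if (!u->has_reg[i])
			continue;
		fake_write(u, i, INVALIDATE_BIT);
	}
	u->lock_held = false;
	pthread_mutex_unlock(&u->lock);

	/* Phase 2: poll for completion with the lock dropped, so the
	 * (possibly long) wait does not block other lock users. */
	for (i = 0; i < NUM_ENGINES; i++) {
		unsigned int tries;

		if (!u->has_reg[i])
			continue;
		for (tries = 0; tries < MAX_POLLS; tries++)
			if (!(fake_read(u, i) & INVALIDATE_BIT))
				break;
		if (tries == MAX_POLLS)
			timed_out++;
	}
	return timed_out;
}
```

The point of splitting the single loop in two is that only the short write phase runs inside the critical section; the waits, which the quoted comment puts at up to 100us each in pessimistic cases, happen outside it.
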
What about intel_engine_pm_is_awake, what will you do with that one?

Ok, let's keep this series plain and simple. I'm dropping the PM awake
logic as you suggested on v3, keeping just the bare minimum required to
fix the selftest breakage.

That actually means that, in such backports, we're not taking into
account that TLB cache invalidation adds performance penalties and
might cause apps to break.

I suspect that we'll also need to backport at least some of the other
patches, like the PM awake logic and the one that avoids TLB cache
invalidation when the memory was not touched by userspace, but let's
focus first on fixing the regression pointed out by the selftest.

Regards,
Mauro

    
