On Wed, Mar 06, 2024 at 02:22:45AM +0100, Andi Shyti wrote:
The hardware should not dynamically balance the load between CCS engines. Wa_14019159160 recommends disabling it across all platforms.
Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: stable@vger.kernel.org # v6.2+
 drivers/gpu/drm/i915/gt/intel_gt_regs.h     | 1 +
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 5 +++++
 2 files changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 50962cfd1353..cf709f6c05ae 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -1478,6 +1478,7 @@
 #define GEN12_RCU_MODE _MMIO(0x14800)
 #define GEN12_RCU_MODE_CCS_ENABLE REG_BIT(0)
+#define XEHP_RCU_MODE_FIXED_SLICE_CCS_MODE REG_BIT(1)
 
 #define CHV_FUSE_GT _MMIO(VLV_GUNIT_BASE + 0x2168)
 #define CHV_FGT_DISABLE_SS0 (1 << 10)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index d67d44611c28..a2e78cf0b5f5 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -2945,6 +2945,11 @@ general_render_compute_wa_init(struct intel_engine_cs *engine, struct i915_wa_li
 		/* Wa_18028616096 */
 		wa_mcr_write_or(wal, LSC_CHICKEN_BIT_0_UDW, UGM_FRAGMENT_THRESHOLD_TO_3);
 	}
+	/*
+	 * Wa_14019159160: disable the automatic CCS load balancing
I'm still a bit concerned that this doesn't really match what this specific workaround is asking us to do. There seems to be agreement across various internal email threads that we need to disable load balancing, but no single workaround officially documents that decision.
This specific workaround asks us to do a bunch of different things, and the third item it asks for is to disable load balancing in very specific cases (i.e., while the RCS is active at the same time as one or more CCS engines). Taking this workaround in isolation, it would be valid to keep load balancing active if you were just using the CCS engines and leaving the RCS idle, or if balancing was turned on/off by the GuC scheduler according to engine use at the moment, as the documented workaround seems to assume will be the case.
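To make the gap concrete, here is a rough sketch of the two policies as I read them. The struct and function names are made up for illustration and do not exist in the driver, and only the load-balancing item of the workaround is modelled; the other items are left out entirely.

```c
#include <stdbool.h>

/* Hypothetical snapshot of engine activity; not a real i915 structure. */
struct engine_activity {
	bool rcs_active;         /* render engine currently busy */
	unsigned int ccs_active; /* number of busy CCS engines */
};

/*
 * Literal reading of the load-balancing item in Wa_14019159160:
 * balancing only has to be off while the RCS runs alongside at
 * least one CCS engine.
 */
static bool wa_requires_fixed_mode(const struct engine_activity *a)
{
	return a->rcs_active && a->ccs_active > 0;
}

/* What the patch does: fixed mode always, regardless of activity. */
static bool patch_uses_fixed_mode(const struct engine_activity *a)
{
	(void)a; /* activity is deliberately ignored */
	return true;
}
```

The two only disagree when the RCS is idle or no CCS engine is busy, which is exactly the case where a later "correction" based on the workaround text alone could re-enable balancing.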
So in general I think we do need to disable load balancing, based on other offline discussion. But blaming that entire change on Wa_14019159160 seems a bit questionable, since it's not really what this specific workaround asks for, and someone may come back in the future and try to "correct" the implementation of this workaround without realizing there are other factors too. It would be great if we could get the hardware teams to properly document this expectation somewhere (either in a separate dedicated workaround or in the MMIO tuning guide) so that we have a more direct and authoritative source for such a large behavioral change.
Matt
+	 */
+	wa_masked_en(wal, GEN12_RCU_MODE, XEHP_RCU_MODE_FIXED_SLICE_CCS_MODE);
 	if (IS_DG2_G11(i915)) {
-- 
2.43.0
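
For anyone following along, a minimal standalone sketch of what a masked enable of bit 1 amounts to. It assumes the usual i915 masked-register convention, where bits 31:16 of the written value select which of bits 15:0 get updated. The helper names here are hypothetical userspace stand-ins, not the driver's macros, and apply_masked_write() is only a model of the hardware side.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 1 of RCU_MODE, mirroring XEHP_RCU_MODE_FIXED_SLICE_CCS_MODE in the patch. */
#define FIXED_SLICE_CCS_MODE_BIT (UINT32_C(1) << 1)

/* Build the value that sets `bits`: the same bits go in the mask half and the value half. */
static uint32_t masked_bit_enable(uint32_t bits)
{
	assert((bits & UINT32_C(0xffff0000)) == 0); /* only bits 15:0 are writable */
	return (bits << 16) | bits;
}

/* Build the value that clears `bits`: mask half set, value half zero. */
static uint32_t masked_bit_disable(uint32_t bits)
{
	assert((bits & UINT32_C(0xffff0000)) == 0);
	return bits << 16;
}

/* Model of the register update: only bits selected by the mask half change. */
static uint32_t apply_masked_write(uint32_t cur, uint32_t val)
{
	uint32_t mask = val >> 16;

	return (cur & ~mask) | (val & mask);
}
```

Under that assumption the write leaves GEN12_RCU_MODE_CCS_ENABLE (bit 0) untouched, so the workaround does not need a read-modify-write.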