Hi Matt,
first of all thanks a lot for the observations you are raising.
On Wed, Feb 21, 2024 at 12:51:04PM -0800, Matt Roper wrote:
On Wed, Feb 21, 2024 at 01:12:18AM +0100, Andi Shyti wrote:
On Tue, Feb 20, 2024 at 03:39:18PM -0800, Matt Roper wrote:
On Tue, Feb 20, 2024 at 03:35:26PM +0100, Andi Shyti wrote:
...
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index a425db5ed3a2..e19df4ef47f6 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -168,6 +168,14 @@ static void init_unused_rings(struct intel_gt *gt) } } +static void intel_gt_apply_ccs_mode(struct intel_gt *gt) +{
- if (!IS_DG2(gt->i915))
return;
- intel_uncore_write(gt->uncore, XEHP_CCS_MODE, 0);
This doesn't look right to me. A value of 0 means every cslice gets associated with CCS0.
Yes, that's what I'm trying to do. The behavior I'm looking for is this one:
/* ... * With 1 engine (ccs0): * slice 0, 1, 2, 3: ccs0 * * With 2 engines (ccs0, ccs1): * slice 0, 2: ccs0 * slice 1, 3: ccs1 * * With 4 engines (ccs0, ccs1, ccs2, ccs3): * slice 0: ccs0 * slice 1: ccs1 * slice 2: ccs2 * slice 3: ccs3 ... */
where the user can configure runtime the mode, making sure that no client is connected to i915.
But, this needs to be written
As we are now forcing mode '1', then all cslices are connected with ccs0.
Right --- and that's what I'm pointing out as illegal. I think that code comment above was taken out of context from a different RFC series; that's not an accurate description of the behavior we want here.
First, the above comment is using ccs# to refer to userspace engines, not hardware engines. As a simple example, DG2-G11 only ever has a single CCS which userspace sees as "instance 0" but which is actually CCS1 at the hardware level. If you try to follow the comment above when programming CCS_MODE, you've assigned all of the cslices to a non-existent engine and assigned no cslices to the CCS engine that actually exists. For DG2-G10 (and I think DG2-G12), there are different combinations of fused-off / not-fused-off engines that will always show up in userspace as CCS0-CCSn, even if those don't match the hardware IDs.
Second, the above comment is assuming that you have a part with a maximum fusing config (i.e., all cslices present). Using DG2-G11 again as an example, there's also only a single cslice (cslice1), so if you tell CCS1 that it's allowed to use EUs from non-existent cslice0, cslice2, and cslice3, you might not get the behavior you were hoping for.
if the hardware slices are fused off we wouldn't see them in a first place, right? And that's anyway a permanent configuration that wouldn't affect the patch.
BTW, is there any machine I can test this scenario?
On a DG2-G11 platform, that will flat out break compute since CCS0 is never present (G11 only has a single CCS and it's always the hardware's CCS1). Even on a G10 or G12 this could also break things depending on the fusing of your card if the hardware CCS0 happens to be missing.
Also, the register says that we need a field value of 0x7 for each cslice that's fused off. By passing 0, we're telling the CCS engine that it can use cslices that may not actually exist.
does it? Or do I need to write the id (0x0-0x3) of the user engine? That's how the mode is calculated in this algorithm.
0x0 - 0x3 are how you specify that a specific CCS engine can use the cslice. If the cslice can't be used at all (i.e., it's fused off), then you need to program a 0x7 to ensure no engines try to use the non-existent DSS/EUs.
I planned to limit this to the only DG2 (and ATSM, of course). Do you think it would it help?
Andi