Re: [Intel-gfx] [PATCH] drm/i915/gt: Limit VFE threads based on GT

7 Jan 2021


Quoting Rodrigo Vivi (2021-01-07 19:50:37)
...
On Fri, Oct 16, 2020 at 06:54:11PM +0100, Chris Wilson wrote:
...
MEDIA_STATE_VFE only accepts the 'maximum number of threads' in the
range [0, n-1] where n is #EU * (#threads/EU) with the number of threads
based on the platform and the number of EU based on the number of slices
and subslices. This is a fixed number per platform/gt, so appropriately
limit the number of threads we spawn to match the device.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2024
we need to get this closed...
Unfortunately this failed the validation test. And as that test is still
not in CI, I cannot say why. My vote would be to remove the
clear_residuals until it works on all target platforms. Plus we clearly
need a hsw-gt1 in CI.
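
For illustration, the limit described in the commit message above can be sketched as below. This is a minimal sketch, not the code from the patch: the helper name vfe_max_threads_field and its two parameters are hypothetical, and the EU and threads-per-EU counts are assumed to have been read from the device description beforehand.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of i915): compute the value for the
 * 'maximum number of threads' field of MEDIA_STATE_VFE.
 *
 * n = #EU * (#threads/EU), and the field accepts values in [0, n-1],
 * so the largest value we may program is n - 1.
 */
static uint32_t vfe_max_threads_field(uint32_t num_eus, uint32_t threads_per_eu)
{
	uint32_t n = num_eus * threads_per_eu; /* total hardware threads */

	assert(n > 0); /* the range [0, n-1] is empty without any EU */
	return n - 1;
}
```

With illustrative inputs of 10 EUs and 7 threads per EU, n is 70 and the helper returns 69.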
...
...
  bv->scratch_size = bv->surface_height * bv->surface_width;

@@ -244,7 +258,6 @@ gen7_emit_vfe_state(struct batch_chunk *batch,
                   u32 urb_size, u32 curbe_size,
                   u32 mode)
 {
-      u32 urb_entries = bv->max_urb_entries;
       u32 threads = bv->max_primitives - 1;
       u32 *cs = batch_alloc_items(batch, 32, 8);
 
@@ -254,7 +267,7 @@ gen7_emit_vfe_state(struct batch_chunk *batch,
       *cs++ = 0;
 
       /* number of threads & urb entries for GPGPU vs Media Mode */
-      *cs++ = threads << 16 | urb_entries << 8 | mode << 2;
+      *cs++ = threads << 16 | 1 << 8 | mode << 2;

why urb_entries = 1 ?
We only used a single entry. There was no measurable benefit from
assigning more entries, and the importance of any side effects from doing
so is unknown.
...
The range is [0, 64] or [0, 128] depending on the SKU.
In general there's a minimum of 32 URBs.
Don't forget num_entries * entry_size must fit within the URB
allocation/allotment.
-Chris
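
For illustration, the constraint that num_entries * entry_size must fit within the URB allotment, together with the dword layout from the quoted hunk, could be sketched as follows. This is a sketch under stated assumptions: both helper names are hypothetical, the entry size and the allotment are assumed to be expressed in the same units, and only the shift positions (16, 8, 2) are taken from the hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check (not part of i915): num_entries * entry_size must
 * fit within the URB allotment. Both sizes must use the same units.
 */
static bool urb_entries_fit(uint32_t num_entries, uint32_t entry_size,
			    uint32_t urb_allotment)
{
	/* Widen to 64 bits so the product cannot wrap around. */
	return (uint64_t)num_entries * entry_size <= urb_allotment;
}

/*
 * Hypothetical helper: pack the thread limit, URB entry count and mode
 * using the same shifts as the quoted hunk.
 */
static uint32_t vfe_threads_urb_dword(uint32_t threads, uint32_t urb_entries,
				      uint32_t mode)
{
	return threads << 16 | urb_entries << 8 | mode << 2;
}
```

A caller would validate the entry count with urb_entries_fit() before packing; passing urb_entries = 1 reproduces the single-entry form used in the patch.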

    
