Quoting Rodrigo Vivi (2021-01-07 19:50:37)
On Fri, Oct 16, 2020 at 06:54:11PM +0100, Chris Wilson wrote:
MEDIA_STATE_VFE only accepts the 'maximum number of threads' in the range [0, n-1] where n is #EU * (#threads/EU) with the number of threads based on platform and the number of EU based on the number of slices and subslices. This is a fixed number per platform/gt, so appropriately limit the number of threads we spawn to match the device.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2024
we need to get this closed...
Unfortunately this failed the validation test. And as that test is still not in CI, I cannot say why. My vote would be to remove the clear_residuals until it works on all target platforms. Plus we clearly need a hsw-gt1 in CI.
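For reference, the limit described in the commit message can be sketched in a few lines of C. This is only an illustration of the arithmetic, not the code from the patch: struct gt_info, its fields, and both helper names are hypothetical, and the EU and thread counts must come from the real platform data.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical GT description; the field names are illustrative only. */
struct gt_info {
	uint32_t eu_count;       /* EUs across all enabled slices/subslices */
	uint32_t threads_per_eu; /* platform dependent */
};

/* n = #EU * (#threads/EU) */
static uint32_t max_hw_threads(const struct gt_info *gt)
{
	assert(gt);
	return gt->eu_count * gt->threads_per_eu;
}

/*
 * Value for the 'maximum number of threads' field. The field holds
 * (thread count - 1), so a request is clamped to [1, n] and then
 * encoded into the accepted range [0, n-1].
 */
static uint32_t vfe_max_threads_field(const struct gt_info *gt, uint32_t wanted)
{
	uint32_t n = max_hw_threads(gt);

	if (n == 0)
		return 0; /* no usable description; fall back to the minimum */
	if (wanted == 0)
		wanted = 1;
	if (wanted > n)
		wanted = n;

	return wanted - 1;
}
```

With a made-up GT of 10 EUs and 7 threads per EU, n is 70, so a request for 100 threads is encoded as 69 and a request for 16 threads as 15.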
bv->scratch_size = bv->surface_height * bv->surface_width;
@@ -244,7 +258,6 @@ gen7_emit_vfe_state(struct batch_chunk *batch,
 		    u32 urb_size, u32 curbe_size,
 		    u32 mode)
 {
-	u32 urb_entries = bv->max_urb_entries;
 	u32 threads = bv->max_primitives - 1;
 	u32 *cs = batch_alloc_items(batch, 32, 8);
@@ -254,7 +267,7 @@ gen7_emit_vfe_state(struct batch_chunk *batch,
 	*cs++ = 0;
 	/* number of threads & urb entries for GPGPU vs Media Mode */
-	*cs++ = threads << 16 | urb_entries << 8 | mode << 2;
+	*cs++ = threads << 16 | 1 << 8 | mode << 2;
why urb_entries = 1 ?
We only used a single entry. There was no measurable benefit from assigning more entries, and the importance of any side effects from doing so is unknown.
The range is [0, 64] or [0, 128], depending on the SKU.
In general there's a minimum of 32 URBs.
Don't forget num_entries * entry_size must fit within the URB allocation/allotment.
-Chris
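
To make the two points in this exchange concrete, here is a small C sketch of the dword expression from the hunk above and of the URB fit constraint. Both helper names are hypothetical, the assert only guards against bits being shifted out of a 32-bit word (it says nothing about the real field width), and the units of entry_size and urb_alloc are assumed to match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Mirrors the expression in the patch:
 *   threads << 16 | urb_entries << 8 | mode << 2
 */
static uint32_t vfe_threads_urb_dword(uint32_t threads, uint32_t urb_entries,
				      uint32_t mode)
{
	/* Anything above 16 bits would be lost by the shift. */
	assert(threads <= 0xffff);

	return threads << 16 | urb_entries << 8 | mode << 2;
}

/* num_entries * entry_size must fit within the URB allocation. */
static bool urb_config_fits(uint32_t num_entries, uint32_t entry_size,
			    uint32_t urb_alloc)
{
	/* Use a 64-bit product so a large request cannot wrap around. */
	return (uint64_t)num_entries * entry_size <= urb_alloc;
}
```

With made-up numbers, a single entry of size 64 fits an allocation of 2048, while 64 entries of the same size (4096 in total) do not.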