On 4/20/22 20:49, Christian König wrote:
On 4/20/22 20:41, Zack Rusin wrote:
On Wed, 2022-04-20 at 19:40 +0200, Christian König wrote:
On 4/20/22 19:38, Zack Rusin wrote:
On Wed, 2022-04-20 at 09:37 +0200, Christian König wrote:
Hi Zack,
On 4/20/22 05:56, Zack Rusin wrote:
On Thu, 2022-04-07 at 10:59 +0200, Christian König wrote:
> Rework the internals of the dma_resv object to allow adding more than
> one write fence and remember for each fence what purpose it had.
>
> This allows removing the workaround from amdgpu which used a container
> for this instead.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: amd-gfx@lists.freedesktop.org

afaict this change broke vmwgfx, which now oopses right after boot. I haven't had the time to look into it yet, so I'm not sure what the problem is. I'll look at this tomorrow, but just in case you have some clues, the backtrace follows:
that's a known issue and should already be fixed with:
commit d72dcbe9fce505228dae43bef9da8f2b707d1b3d
Author: Christian König <christian.koenig@amd.com>
Date:   Mon Apr 11 15:21:59 2022 +0200
Unfortunately that doesn't seem to be it. The backtrace is from the current (as of the time of sending this email) drm-misc-next, which already has that change, so it's something else.
Ok, that's strange. In this case I need to investigate further.
Maybe VMWGFX is adding more than one fence and we actually need to reserve multiple slots.
This might be a helper code issue with CONFIG_DEBUG_MUTEXES set. With that config dma_resv_reset_max_fences() does:

    fences->max_fences = fences->num_fences;

For some objects num_fences is 0, so afterwards max_fences and num_fences are both 0, and then BUG_ON(num_fences >= max_fences) is triggered.
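(For illustration, here is a minimal user-space model of that slot bookkeeping. It is only a sketch; the structure and function names below are simplified stand-ins, not the actual code from drivers/dma-buf/dma-resv.c.)

#include <assert.h>
#include <stdio.h>

struct resv_list {
	unsigned int num_fences;  /* slots currently in use */
	unsigned int max_fences;  /* slots reserved/allocated */
};

/* models "reserve room for n more fences" (cf. dma_resv_reserve_fences()) */
static void reserve_fences(struct resv_list *l, unsigned int n)
{
	if (l->num_fences + n > l->max_fences)
		l->max_fences = l->num_fences + n;
}

/* models the CONFIG_DEBUG_MUTEXES behavior of dma_resv_reset_max_fences() */
static void reset_max_fences(struct resv_list *l)
{
	l->max_fences = l->num_fences;
}

/* models adding a fence: consumes one previously reserved slot */
static void add_fence(struct resv_list *l)
{
	assert(l->num_fences < l->max_fences);  /* the BUG_ON from the report */
	l->num_fences++;
}

int main(void)
{
	struct resv_list l = { .num_fences = 0, .max_fences = 0 };

	reset_max_fences(&l);   /* num_fences == max_fences == 0 */
	reserve_fences(&l, 1);  /* what the reservation step is supposed to do */
	add_fence(&l);          /* fine: a slot was reserved */

	reset_max_fences(&l);   /* debug reset: back to num == max */
	/* add_fence(&l); */    /* without another reserve this would assert */

	printf("num=%u max=%u\n", l.num_fences, l.max_fences);
	return 0;
}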
Yeah, but that's expected behavior.
What's not expected is that max_fences is still 0 (or equal to the old num_fences) when VMWGFX tries to add a new fence. The function ttm_eu_reserve_buffers() should have reserved at least one fence slot.
So the underlying problem is that either ttm_eu_reserve_buffers() was never called or VMWGFX tried to add more than one fence.
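(As a rough sketch of the contract involved here, assuming the reworked API from this series (dma_resv_reserve_fences()/dma_resv_add_fence()); the exact call sites are driver specific and error handling is omitted:

	/* during validation/reservation, e.g. from ttm_eu_reserve_buffers() */
	ret = dma_resv_reserve_fences(bo->base.resv, 1);
	if (ret)
		return ret;

	/* later, after submission: at most one fence per reserved slot */
	dma_resv_add_fence(bo->base.resv, fence, DMA_RESV_USAGE_WRITE);

Adding a second fence without reserving a second slot is exactly what the BUG_ON above catches once CONFIG_DEBUG_MUTEXES has reset max_fences. The fragment below bumps the per-buffer num_shared count, which should make the reservation step set aside plenty of slots and so distinguish the two cases.)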
To figure out which it is, could you try the following code fragment?
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
index f46891012be3..a36f89d3f36d 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c
@@ -288,7 +288,7 @@ int vmw_validation_add_bo(struct vmw_validation_context *ctx,
 	val_buf->bo = ttm_bo_get_unless_zero(&vbo->base);
 	if (!val_buf->bo)
 		return -ESRCH;
-	val_buf->num_shared = 0;
+	val_buf->num_shared = 16;
 	list_add_tail(&val_buf->head, &ctx->bo_list);
 	bo_node->as_mob = as_mob;
 	bo_node->cpu_blit = cpu_blit;
Thanks, Christian.
Regards, Christian.
z