On 5/12/26 05:07, Deepanshu Kartikey wrote:
virtio_gpu_cursor_plane_update() allocates a virtio_gpu_object_array, locks its dma_resv, and queues a fenced transfer to the host. The lock acquisition can fail in two ways:
- dma_resv_lock_interruptible() returns -EINTR when a signal is delivered while waiting for the reservation lock.
- dma_resv_reserve_fences() returns -ENOMEM if it fails to allocate a fence slot; in this case virtio_gpu_array_lock_resv() drops the lock before returning the error.
The return value of virtio_gpu_array_lock_resv() was ignored, so the cursor path could proceed with the resv lock not held. The queue path then walks the object array and calls dma_resv_add_fence(), which requires the lock; with lockdep enabled this trips dma_resv_assert_held():
WARNING: drivers/dma-buf/dma-resv.c:296 at dma_resv_add_fence+0x71e/0x840
Call Trace:
 virtio_gpu_array_add_fence
 virtio_gpu_queue_ctrl_sgs
 virtio_gpu_queue_fenced_ctrl_buffer
 virtio_gpu_cursor_plane_update
 drm_atomic_helper_commit_planes
 drm_atomic_helper_commit_tail
 commit_tail
 drm_atomic_helper_commit
 drm_atomic_commit
 drm_atomic_helper_update_plane
 __setplane_atomic
 drm_mode_cursor_universal
 drm_mode_cursor_common
 drm_mode_cursor_ioctl
 drm_ioctl
 __x64_sys_ioctl
Beyond the WARN, mutating the dma_resv fence list without the lock races with concurrent readers/writers and can corrupt the list.
The DRM atomic helpers do not allow .atomic_update to fail: by the time it runs, the commit has been signed off to userspace and there is no clean rollback path. Move the fallible work -- objs allocation, dma_resv locking, and fence slot reservation -- into virtio_gpu_plane_prepare_fb, which is the designated callback for resource acquisition and may return errors that the framework handles by rolling back the commit. Stash the prepared object array on virtio_gpu_plane_state so the update step can consume it.
Make virtio_gpu_plane_cleanup_fb release the objs if the commit was rolled back before update ran (i.e., objs not consumed). The queue path already unlocks the resv after attaching the fence (vq.c:411) and frees the array via put_free_delayed after host completion (vq.c:271), so the update step only needs to clear vgplane_st->objs to transfer ownership.
Simplify virtio_gpu_cursor_plane_update to a no-fail queue submission that hands the prepared, locked objs to the queue path.
The bug was reported by syzbot, triggered via fault injection (fail_nth) on the DRM_IOCTL_MODE_CURSOR path, which forces the -ENOMEM branch in dma_resv_reserve_fences().
Reported-by: syzbot+72bd3dd3a5d5f39a0271@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=72bd3dd3a5d5f39a0271
Fixes: 5cfd31c5b3a3 ("drm/virtio: fix virtio_gpu_cursor_plane_update().")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20260510053025.100224-1-kartikey406@gmail.com/T/ [v1]
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
v2: Move resv lock acquisition from .atomic_update (which must not fail) to .prepare_fb (which may), per maintainer review of v1. The previous approach of silently skipping the cursor update on lock failure violated the atomic-commit contract with userspace.
 drivers/gpu/drm/virtio/virtgpu_drv.h   |  1 +
 drivers/gpu/drm/virtio/virtgpu_plane.c | 38 ++++++++++++++++++++------
 2 files changed, 30 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
index f17660a71a3e..e51f959dce46 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -198,6 +198,7 @@ struct virtio_gpu_framebuffer {
 struct virtio_gpu_plane_state {
 	struct drm_plane_state base;
 	struct virtio_gpu_fence *fence;
+	struct virtio_gpu_object_array *objs;
 };
 
 #define to_virtio_gpu_plane_state(x) \
 	container_of(x, struct virtio_gpu_plane_state, base)
diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c
index a126d1b25f46..b0511ace89e6 100644
--- a/drivers/gpu/drm/virtio/virtgpu_plane.c
+++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
@@ -381,6 +381,23 @@ static int virtio_gpu_plane_prepare_fb(struct drm_plane *plane,
 			goto err_fence;
 	}
 
+	if (plane->type == DRM_PLANE_TYPE_CURSOR && bo->dumb) {
+		struct virtio_gpu_object_array *objs;
+
+		objs = virtio_gpu_array_alloc(1);
+		if (!objs) {
+			ret = -ENOMEM;
+			goto err_fence;
+		}
+		virtio_gpu_array_add_obj(objs, vgfb->base.obj[0]);
+		ret = virtio_gpu_array_lock_resv(objs);
+		if (ret) {
+			virtio_gpu_array_put_free(objs);
+			goto err_fence;
+		}
+		vgplane_st->objs = objs;
+	}
+
 	return 0;
 
 err_fence:
@@ -417,6 +434,12 @@ static void virtio_gpu_plane_cleanup_fb(struct drm_plane *plane,
 		vgplane_st->fence = NULL;
 	}
 
+	if (vgplane_st->objs) {
+		virtio_gpu_array_unlock_resv(vgplane_st->objs);
+		virtio_gpu_array_put_free(vgplane_st->objs);
+		vgplane_st->objs = NULL;
+	}
+
 	obj = state->fb->obj[0];
 	if (drm_gem_is_imported(obj))
 		virtio_gpu_cleanup_imported_obj(obj);
@@ -452,21 +475,18 @@ static void virtio_gpu_cursor_plane_update(struct drm_plane *plane,
 	}
 
 	if (bo && bo->dumb && (plane->state->fb != old_state->fb)) {
-		/* new cursor -- update & wait */
-		struct virtio_gpu_object_array *objs;
-
-		objs = virtio_gpu_array_alloc(1);
-		if (!objs)
-			return;
-		virtio_gpu_array_add_obj(objs, vgfb->base.obj[0]);
-		virtio_gpu_array_lock_resv(objs);
+		/* objs and fence were prepared in virtio_gpu_plane_prepare_fb;
+		 * the resv is already locked. The queue path takes ownership
+		 * of objs and unlocks the resv after attaching the fence.
+		 */
 		virtio_gpu_cmd_transfer_to_host_2d
 			(vgdev, 0,
 			 plane->state->crtc_w,
 			 plane->state->crtc_h,
-			 0, 0, objs, vgplane_st->fence);
+			 0, 0, vgplane_st->objs, vgplane_st->fence);
+		vgplane_st->objs = NULL;
 		virtio_gpu_notify(vgdev);
 		dma_fence_wait(&vgplane_st->fence->f, true);
 	}
 
 	if (plane->state->fb != old_state->fb) {
I'm getting a lockup with this patch applied, and I now see that virtio_gpu_resource_flush() also takes the BO's resv lock.
The easiest option might be to add an uninterruptible variant of virtio_gpu_array_lock_resv(). Could you please try that for v3?