Both Tvrtko [1] and I [2] have recently proposed some improvements for drm_sched.
While taking Tvrtko's feedback into account for my patch, I realized that both his and my patch can be fully replaced with a bigger and far more beautiful series.
If I am not mistaken, it turns out that the entire entity->entity_idle completion is also nothing but a workaround for the grave mistake of not using the greatest helper for parallel programming that exists in computer science: locking.
This series adds locking to the last_scheduled field and to all checks that detect whether the entity is idle. As before, the job_scheduled event queue causes the periodic checks.
This way, we can get rid of memory barriers, RCU and a few lines of code, and make things more readable and understandable.
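To illustrate the general idea outside the kernel, here is a minimal userspace sketch. It is not the code from this series: the toy_* types and functions are made up for illustration, a C11 atomic_flag stands in for the kernel spinlock, and a plain counter stands in for the spsc_queue. The only point is that the last_scheduled update and the idleness check run under the same lock, so no barriers or RCU are needed to keep them consistent.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real drm_sched structures. */
struct toy_fence {
	int seqno;
};

struct toy_entity {
	atomic_flag lock;                 /* stand-in for the entity spinlock */
	struct toy_fence *last_scheduled; /* fence of the most recently popped job */
	size_t queued_jobs;               /* stand-in for the job queue length */
	bool stopped;
};

static void toy_lock(struct toy_entity *e)
{
	/* Spin until the flag was previously clear, i.e. we own the lock. */
	while (atomic_flag_test_and_set_explicit(&e->lock, memory_order_acquire))
		;
}

static void toy_unlock(struct toy_entity *e)
{
	atomic_flag_clear_explicit(&e->lock, memory_order_release);
}

static void toy_entity_init(struct toy_entity *e)
{
	atomic_flag_clear(&e->lock);
	e->last_scheduled = NULL;
	e->queued_jobs = 0;
	e->stopped = false;
}

static void toy_entity_push_job(struct toy_entity *e)
{
	toy_lock(e);
	e->queued_jobs++;
	toy_unlock(e);
}

/* Pop one job and record its fence within a single critical section. */
static bool toy_entity_pop_job(struct toy_entity *e, struct toy_fence *f)
{
	bool popped = false;

	assert(e && f);
	toy_lock(e);
	if (e->queued_jobs > 0) {
		e->queued_jobs--;
		e->last_scheduled = f;
		popped = true;
	}
	toy_unlock(e);
	return popped;
}

/* The idleness check takes the same lock as the pop path. */
static bool toy_entity_is_idle(struct toy_entity *e)
{
	bool idle;

	toy_lock(e);
	idle = e->stopped || e->queued_jobs == 0;
	toy_unlock(e);
	return idle;
}

/* Readers of last_scheduled also go through the lock. */
static struct toy_fence *toy_entity_last_scheduled(struct toy_entity *e)
{
	struct toy_fence *f;

	toy_lock(e);
	f = e->last_scheduled;
	toy_unlock(e);
	return f;
}
```

In the real scheduler a reader would presumably also need to take a reference on the fence while holding the lock; the sketch leaves out reference counting entirely.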
Tested with drm-sched-unit tests. I'm a bit busy right now, but wanted to show you guys the idea. Before merging I'd test it more exhaustively with Nouveau.
Greetings, Philipp
[1] https://lore.kernel.org/dri-devel/20260611123423.39819-1-tvrtko.ursulin@igal...
[2] https://lore.kernel.org/dri-devel/20260626081942.2122144-2-phasta@kernel.org...
Philipp Stanner (5):
  drm/sched: Protect entity->last_scheduled with spinlock
  drm/sched: Lock spsc_queue in drm_sched_entity_pop_job()
  drm/sched: Avoid lock cycle for sched_entity
  drm/sched: Lock drm_sched_entity_is_idle()
  drm/sched: Remove entity->entity_idle
 drivers/gpu/drm/scheduler/sched_entity.c | 75 +++++++++++-------------
 drivers/gpu/drm/scheduler/sched_main.c   |  2 -
 drivers/gpu/drm/scheduler/sched_rq.c     |  5 +-
 include/drm/gpu_scheduler.h              | 16 ++---
 4 files changed, 41 insertions(+), 57 deletions(-)
base-commit: be4f10d44757211fd656fa57f37034657f26c883