On 09/03/2023 04.12, Christian König wrote:
> On 08.03.23 at 20:05, Asahi Lina wrote:
>> [SNIP]
>> Well it's not the better way, it's the only way that works.
>
> I have to admit that my bet on your intentions was wrong, but even that use case doesn't work correctly.
>
> See, when your callback returns false it is perfectly possible that all hw fences are signaled between returning that information and processing it.
>
> The result would be that the scheduler goes to sleep and never wakes up again.
That can't happen: the scheduler will just go into another iteration of the drm_sched main loop, since an entity is still available.
Rather, there is probably the opposite bug in this patch: the can_run_job() logic should be moved into the wait_event_interruptible() condition check. Otherwise I think it can end up busy-looping, since the condition itself can be true even while the can_run_job() check blocks it.
But there is no risk of it going to sleep and never waking up, because job completions wake up the waitqueue by definition, and that happens after the driver-side queues are popped. If this problem could happen, the existing hw_submission_limit logic would be broken in the same way; the two are logically equivalent in how they work.
Basically, if properly done in wait_event_interruptible, it is exactly the logic of that macro that prevents this race condition and makes everything work at all. Without it, drm_sched would be completely broken.
> As I said, we exercised those ideas before, and yes, this approach came up before as well, and no, it doesn't work.
It can never deadlock with this patch as it stands (though it could busy-loop), and if properly moved into the wait_event_interruptible() condition, it would also never busy-loop and would work entirely as intended. The actual API change is sound.
I don't know why you're trying so hard to convince everyone that this approach is fundamentally broken... It might be a bad idea for other reasons, it might encourage incorrect usage, it might not be the best option; there are plenty of arguments you could make. But you keep arguing that it simply can't work at all, for some reason. Why? I already said I'm happy dropping it in favor of the fences...
> Well because it is broken.
>
> When you move the check into the wait_event_interruptible condition then who is going to call wait_event_interruptible when the condition changes?
I think you mean wake_up_interruptible(). That would be drm_sched_job_done(), on the fence callback when a job completes, which, as I keep saying, is the same logic used for hw_rq_count/hw_submission_limit tracking.
Please think about it for a second, it's really not that complicated to see why it works:
- Driver pops off completed commands      <-- can_run_job() condition satisfied
- Driver signals fence
  - drm_sched_job_done_cb()
    - drm_sched_job_done()
      - atomic_dec(&sched->hw_rq_count);  <-- hw_submission_limit satisfied
      - ...
      - wake_up_interruptible(&sched->wake_up_worker);
        ^- happens after both conditions are potentially satisfied
It really is completely equivalent to just making the hw_rq_count logic customizable by the driver. The actual flow is the same. As long as the driver guarantees it satisfies the can_run_job() condition before signaling the completion fence that triggered that change, it works fine.
> As I said, this idea came up before and was rejected multiple times.
Maybe it was a different idea, or maybe it was rejected for other reasons, or maybe it was wrongly rejected for being broken when it isn't ^^
~~ Lina