On Thu, Jul 22, 2021 at 1:42 AM Christian König ckoenig.leichtzumerken@gmail.com wrote:
Am 21.07.21 um 21:03 schrieb Daniel Vetter:
On Wed, Jul 21, 2021 at 09:34:43AM -0700, Rob Clark wrote:
On Wed, Jul 21, 2021 at 12:59 AM Daniel Vetter daniel@ffwll.ch wrote:
On Wed, Jul 21, 2021 at 12:32 AM Rob Clark robdclark@gmail.com wrote:
On Tue, Jul 20, 2021 at 1:55 PM Daniel Vetter daniel@ffwll.ch wrote:
On Tue, Jul 20, 2021 at 8:26 PM Rob Clark robdclark@gmail.com wrote:
> On Tue, Jul 20, 2021 at 11:03 AM Christian König
> ckoenig.leichtzumerken@gmail.com wrote:
>> Hi Rob,
>>
>> Am 20.07.21 um 17:07 schrieb Rob Clark:
>>> From: Rob Clark robdclark@chromium.org
>>>
>>> Somehow we had neither ->wait() nor dma_fence_signal() calls, and no
>>> one noticed. Oops.
>>
>> I'm not sure if that is a good idea.
>>
>> The dma_fence->wait() callback is pretty much deprecated and should not
>> be used any more.
>>
>> What exactly do you need that for?
>
> Well, the alternative is to track the set of fences which have
> signalling enabled, and then figure out which ones to signal, which
> seems like a lot more work, vs just re-purposing the wait
> implementation we already have for non-dma_fence cases ;-)
>
> Why is the ->wait() callback (pretty much) deprecated?

Because if you need it that means for your driver dma_fence_add_cb is broken, which means a _lot_ of things don't work. Like dma_buf poll (compositors have patches to start using that), and I think drm/scheduler also becomes rather unhappy.
I'm starting to page back in how this works.. fence cb's aren't broken (which is also why dma_fence_wait() was not completely broken), because in retire_submits() we call dma_fence_is_signaled(submit->hw_fence).
But the reason that the custom wait function cleans up a tiny bit of jank is that the wait_queue_head_t gets signaled earlier, before we start iterating the submits and doing all that retire_submit() stuff (unpin/unref bo's, etc). I suppose I could just split things up to call dma_fence_signal() earlier, and *then* do the retire_submits() stuff.
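Roughly what I have in mind (an untested sketch; locking is elided, the list/field names are approximate, "active_submits" is just a placeholder, and the real retire_submit() takes more arguments):

static void retire_submits(struct msm_gpu *gpu)
{
        struct msm_gem_submit *submit, *tmp;

        /*
         * First pass: only check/signal fences, so anyone blocked in
         * dma_fence_wait() wakes up with minimal latency.  Note that
         * dma_fence_is_signaled() already calls dma_fence_signal() for
         * us when ->signaled() reports the fence as completed.
         */
        list_for_each_entry(submit, &gpu->active_submits, node) {
                if (!dma_fence_is_signaled(submit->hw_fence))
                        break;
        }

        /*
         * Second pass: the heavier cleanup (unpin/unref bo's, free the
         * submit, etc) which no longer delays the wakeup.
         */
        list_for_each_entry_safe(submit, tmp, &gpu->active_submits, node) {
                if (!dma_fence_is_signaled(submit->hw_fence))
                        break;
                retire_submit(gpu, submit);
        }
}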
Yeah reducing the latency there sounds like a good idea. -Daniel
Hmm, no, turns out that isn't the problem.. or, well, it is probably a good idea to call dma_fence_signal() earlier. But it seems like waking up from wait_event_* is faster than wake_up_state(wait->task, TASK_NORMAL). I suppose with the wake_up_state() approach the now-runnable task still has to wait for the scheduler to get around to actually running it.
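(For reference, the default wait path makes the waiting task runnable via a callback roughly like this, from drivers/dma-buf/dma-fence.c, so the waiter still has to actually get scheduled somewhere before it can do anything:)

struct default_wait_cb {
        struct dma_fence_cb base;
        struct task_struct *task;
};

static void
dma_fence_default_wait_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
{
        struct default_wait_cb *wait =
                container_of(cb, struct default_wait_cb, base);

        wake_up_state(wait->task, TASK_NORMAL);
}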
As far as I know, wake_up_state() tries to run the thread on the CPU it was last scheduled on, while wait_event_* by default makes the thread run on the CPU that issues the wake.
And yes, I've also noticed this already; it was one of the reasons why I suggested using a wait_queue instead of the hand-wired dma_fence_wait implementation.
So for now, I'm going back to my own wait function (plus an earlier dma_fence_signal()).
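For reference, the driver-side wait is basically just a wait_queue over the last completed seqno, roughly like this (a simplified sketch; locking is elided and the msm_fence_context field names are quoted from memory):

static bool fence_completed(struct msm_fence_context *fctx, uint32_t fence)
{
        /* seqno comparison that is robust against wrap-around */
        return (int32_t)(fctx->completed_fence - fence) >= 0;
}

/* called from the retire/irq path, before any heavier cleanup: */
static void update_fence(struct msm_fence_context *fctx, uint32_t fence)
{
        fctx->completed_fence = max(fence, fctx->completed_fence);
        wake_up_all(&fctx->event);
}

/* and the wait itself is a plain wait_event on fctx->event: */
static signed long wait_fence(struct msm_fence_context *fctx,
                              uint32_t fence, signed long remaining_jiffies)
{
        return wait_event_interruptible_timeout(fctx->event,
                        fence_completed(fctx, fence),
                        remaining_jiffies);
}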
Before removing dma_fence_ops::wait(), I guess we want to re-think dma_fence_default_wait().. but I think that would require a dma_fence_context base class (rather than just a raw integer).
Uh, that's not great ... can't we fix this instead of papering over it in drivers? Aside from maybe different wakeup flags, it's all supposed to work exactly the same underneath, and whether a wait queue is used or not really shouldn't matter.
Well it would have been nicer if we used the existing infrastructure instead of re-inventing stuff for dma_fence, but that chance is long gone.
And you don't need a dma_fence_context base class, but rather just a flag in the dma_fence_ops if you want to change the behavior.
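E.g. something like this - purely hypothetical, neither the field nor the helpers exist today, it's just to illustrate "a flag in the ops instead of a new base class":

struct dma_fence_ops {
        /* ... existing callbacks ... */

        /* hypothetical: opt in to a wait_queue based default wait */
        bool wait_queue_wait;
};

static signed long
dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout)
{
        if (fence->ops->wait_queue_wait)
                /* hypothetical wait_queue based implementation */
                return dma_fence_wq_wait(fence, intr, timeout);

        /* otherwise the existing wake_up_state() based implementation */
        return dma_fence_legacy_wait(fence, intr, timeout);
}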
Hmm, I was thinking of dma_fence_context as the place to put the wait_queue_head, but I guess that could also be per-dma_fence.
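I.e. per-dma_fence would look something like this (an untested sketch; the struct/function names are made up for illustration, and the wq would need an init_waitqueue_head() at fence creation):

struct my_hw_fence {
        struct dma_fence base;
        wait_queue_head_t wq;           /* woken from the signal path */
};

/* hooked up as the ->wait() op: */
static signed long my_fence_wait(struct dma_fence *fence, bool intr,
                                 signed long timeout)
{
        struct my_hw_fence *f = container_of(fence, struct my_hw_fence, base);

        if (intr)
                return wait_event_interruptible_timeout(f->wq,
                                dma_fence_is_signaled(fence), timeout);

        return wait_event_timeout(f->wq, dma_fence_is_signaled(fence), timeout);
}

/* and the driver's retire path then does: */
static void my_fence_retire(struct my_hw_fence *f)
{
        dma_fence_signal(&f->base);
        wake_up_all(&f->wq);
}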