On Thu, Apr 4, 2013 at 6:49 PM, Peter Zijlstra peterz@infradead.org wrote:
On Thu, 2013-04-04 at 15:31 +0200, Daniel Vetter wrote:
We've discussed this approach of using (rt-prio, age) instead of just age to determine the the "oldness" of a task for deadlock-breaking with -EAGAIN. The problem is that through PI boosting or normal rt-prio changes while tasks are trying to acquire ww_mutexes you can create acyclic loops in the blocked-on graph of ww_mutexes after the fact and so cause deadlocks. So we've convinced ourselves that this approche doesn't work.
Could you pretty please draw me a picture, I'm having trouble seeing what you mean.
AFAICT as long as you boost Y while its the lock owner things should work out, no?
So this one here took a bit of pondering. But I think you'll like the conclusion.
Now the deadlock detection works because it impose a dag onto all lockers which clearly denotes who'll wait and who will get wounded. The problem with using (rt_prio, ww_age/ticket) instead of just the ticket si that it's ambigous in the face of PI boosting. E.g. - rt_prio of A > rt_prio of B, but - task A is younger than task B
Before boosting task A wins, but after boosting (when only the age matters) task B wins, since it's now older. So now when B comes around and tries to grab a lock A currently holds, it'll also block.
Now the cool thing with TASK_KILLABLE (hey, I start to appreciate it, bear with me!) is that this is no problem - it limits the length of any chain of blocked tasks to just one link of blocked tasks: - If the length is currently one it cannot be extended - the blocked task will set the PF_GTFO flag on the current holder, and since that would be checked before blocking on another ww_mutex the chain can't grow past one blocked task. - If the current holder itself is blocked on another ww_mutex it'll be in TASK_DEADLOCK state and get woken up.
In both case the current holder of the lock will either continue with what it's been doing (PI-boosted ofc) until it unlocks everything or hits the PF_GTFO check and backs off from all currently held locks. RT mission accomplished I think since the wait time for the highest-prio task for grabbing a lock is bounded by the lock hold time for the completely uncontended case. And within a given rt-prio the normal wait/wound algorithm will resolve conflicts (and ensure forward progress).
So I'm now rather convinced that with the TASK_DEADLOCK implementation we can make (rt_prio, age) lock ordering work. But it's not an artifact of consistent PI boosting (I've raced down that alley first trying to prove it to no avail), but the fact that lock holder kicking through PF_GTFO naturally limits blocked task chains on ww_mutexes to just one link. That in turn makes any post-PI-boosted considerations moot and so can't result in havoc due to a PI-boost having flipped an edge in the lock_er_ ordering DAG (we sort tasks with wait/wound, not locks, hence it's not a lock ordering DAG).
Please poke holes into this argument, I've probably missed a sublety somewhere ... -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch