This series is a small fallout from work to allow batching WW mutex locks and unlocks.
Our Wound-Wait mutexes actually don't use the Wound-Wait algorithm but the Wait-Die algorithm. One option would be to rename these mutexes tree-wide to "Wait-Die mutexes" or "Deadlock Avoidance mutexes". The approach suggested here is instead to also implement the Wound-Wait algorithm as a per-WW-class choice, since it has advantages in some cases. See for example
http://www.mathcs.emory.edu/~cheung/Courses/554/Syllabus/8-recv+serial/deadl...
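To make the contrast concrete, here is a minimal sketch of the decision each algorithm takes when a transaction requests a lock held by another transaction. The names are illustrative only, not kernel API, and "older" means a lower stamp:

/* Illustrative sketch only -- hypothetical names, not the kernel code. */
enum ww_action { WW_WAIT, WW_DIE, WW_WOUND };

/* Wait-Die: an older requester waits; a younger one backs off (dies). */
static enum ww_action wait_die(unsigned long requester, unsigned long holder)
{
	return requester > holder ? WW_DIE : WW_WAIT;
}

/*
 * Wound-Wait: an older requester preempts (wounds) the younger holder;
 * a younger requester waits for the older holder.
 */
static enum ww_action wound_wait(unsigned long requester, unsigned long holder)
{
	return requester < holder ? WW_WOUND : WW_WAIT;
}

In both policies the older transaction always wins eventually, which is what makes them deadlock-free and starvation-free; they differ only in which side rolls back.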
Wound-Wait is a preemptive algorithm, and the preemption is implemented here with a lazy scheme: if a wounded transaction is about to go to sleep on a contended WW mutex, we return -EDEADLK instead. That is sufficient for deadlock prevention, and since WW mutexes already require the aborted transaction to sleep waiting to lock the WW mutex it was aborted on, this choice also provides a suitable WW mutex to sleep on. If we instead returned -EDEADLK on the first WW mutex lock after the transaction was wounded, whether that mutex was contended or not, the transaction might frequently be restarted without any wait, which is far from optimal. Note also that with the lazy preemption scheme, and contrary to Wait-Die, there will be no rollbacks on contention for locks held by a transaction that has already completed its locking sequence.
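For reference, the acquire/backoff pattern the lazy scheme builds on looks like this in a minimal sketch; my_class, lock_pair, a and b are hypothetical, while the ww_* calls are the existing API:

#include <linux/kernel.h>
#include <linux/ww_mutex.h>

static DEFINE_WW_CLASS(my_class);

static void lock_pair(struct ww_mutex *a, struct ww_mutex *b)
{
	struct ww_acquire_ctx ctx;
	struct ww_mutex *held = a, *want = b;

	ww_acquire_init(&ctx, &my_class);

	/* The first lock cannot hit -EDEADLK: nothing is held yet. */
	ww_mutex_lock(held, &ctx);

	while (ww_mutex_lock(want, &ctx) == -EDEADLK) {
		/*
		 * We were chosen to back off (Wait-Die) or picked up a
		 * wound (Wound-Wait). Drop what we hold, then sleep
		 * waiting for the contended lock before retrying, so the
		 * restart is not "hot".
		 */
		ww_mutex_unlock(held);
		ww_mutex_lock_slow(want, &ctx);
		swap(held, want);
	}
	ww_acquire_done(&ctx);

	/* ... use the data protected by a and b ... */

	ww_mutex_unlock(a);
	ww_mutex_unlock(b);
	ww_acquire_fini(&ctx);
}

The ww_mutex_lock_slow() call is exactly the "suitable WW mutex to sleep on" mentioned above: sleeping on the lock we were aborted on prevents an immediate second abort on retry.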
The modeset locks are then changed from Wait-Die to Wound-Wait, since the typical locking pattern of those locks matches the criterion for a substantial reduction in the number of rollbacks very well. For reservation objects the benefit is less clear at this point, so they keep using Wait-Die.
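With the per-class choice introduced in patch 2 of this series, switching algorithm is just a matter of which definition macro a subsystem uses; a sketch with hypothetical class names (DEFINE_WW_CLASS() selects Wound-Wait, DEFINE_WD_CLASS() Wait-Die):

#include <linux/ww_mutex.h>

/* Wound-Wait: few contending transactions expected, rollbacks are costly. */
static DEFINE_WW_CLASS(modeset_style_class);

/* Wait-Die: keeps the pre-existing behaviour, e.g. for reservation objects. */
static DEFINE_WD_CLASS(reservation_style_class);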
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
v2: Updated the DEFINE_WW_CLASS() API; minor code and comment fixes as detailed in each patch.
v3: Included cleanup patch from Peter Zijlstra; documentation fixes and a small correctness fix.
v4: Reworked the correctness fix.
From: Peter Zijlstra <peterz@infradead.org>
Make the WW mutex code more readable by adding comments, splitting up functions and pointing out that we're actually using the Wait-Die algorithm.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Co-authored-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 Documentation/locking/ww-mutex-design.txt |  12 +-
 include/linux/ww_mutex.h                  |  28 ++--
 kernel/locking/mutex.c                    | 202 ++++++++++++++++++------------
 3 files changed, 145 insertions(+), 97 deletions(-)
diff --git a/Documentation/locking/ww-mutex-design.txt b/Documentation/locking/ww-mutex-design.txt index 34c3a1b50b9a..2fd7f2a2af21 100644 --- a/Documentation/locking/ww-mutex-design.txt +++ b/Documentation/locking/ww-mutex-design.txt @@ -32,10 +32,10 @@ the oldest task) wins, and the one with the higher reservation id (i.e. the younger task) unlocks all of the buffers that it has already locked, and then tries again.
-In the RDBMS literature this deadlock handling approach is called wait/wound: +In the RDBMS literature this deadlock handling approach is called wait/die: The older tasks waits until it can acquire the contended lock. The younger tasks needs to back off and drop all the locks it is currently holding, i.e. the -younger task is wounded. +younger task dies.
Concepts -------- @@ -56,9 +56,9 @@ Furthermore there are three different class of w/w lock acquire functions:
* Normal lock acquisition with a context, using ww_mutex_lock.
-* Slowpath lock acquisition on the contending lock, used by the wounded task - after having dropped all already acquired locks. These functions have the - _slow postfix. +* Slowpath lock acquisition on the contending lock, used by the task that just + killed its transaction after having dropped all already acquired locks. + These functions have the _slow postfix.
From a simple semantics point-of-view the _slow functions are not strictly required, since simply calling the normal ww_mutex_lock functions on the @@ -220,7 +220,7 @@ mutexes are a natural fit for such a case for two reasons:
Note that this approach differs in two important ways from the above methods: - Since the list of objects is dynamically constructed (and might very well be - different when retrying due to hitting the -EDEADLK wound condition) there's + different when retrying due to hitting the -EDEADLK die condition) there's no need to keep any object on a persistent list when it's not locked. We can therefore move the list_head into the object itself. - On the other hand the dynamic object list construction also means that the -EALREADY return diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h index 39fda195bf78..f82fce2229c8 100644 --- a/include/linux/ww_mutex.h +++ b/include/linux/ww_mutex.h @@ -6,7 +6,7 @@ * * Copyright (C) 2004, 2005, 2006 Red Hat, Inc., Ingo Molnar mingo@redhat.com * - * Wound/wait implementation: + * Wait/Die implementation: * Copyright (C) 2013 Canonical Ltd. * * This file contains the main data structure and API definitions. @@ -28,9 +28,9 @@ struct ww_class { struct ww_acquire_ctx { struct task_struct *task; unsigned long stamp; - unsigned acquired; + unsigned int acquired; #ifdef CONFIG_DEBUG_MUTEXES - unsigned done_acquire; + unsigned int done_acquire; struct ww_class *ww_class; struct ww_mutex *contending_lock; #endif @@ -38,8 +38,8 @@ struct ww_acquire_ctx { struct lockdep_map dep_map; #endif #ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH - unsigned deadlock_inject_interval; - unsigned deadlock_inject_countdown; + unsigned int deadlock_inject_interval; + unsigned int deadlock_inject_countdown; #endif };
@@ -102,7 +102,7 @@ static inline void ww_mutex_init(struct ww_mutex *lock, * * Context-based w/w mutex acquiring can be done in any order whatsoever within * a given lock class. Deadlocks will be detected and handled with the - * wait/wound logic. + * wait/die logic. * * Mixing of context-based w/w mutex acquiring and single w/w mutex locking can * result in undetected deadlocks and is so forbidden. Mixing different contexts @@ -195,13 +195,13 @@ static inline void ww_acquire_fini(struct ww_acquire_ctx *ctx) * Lock the w/w mutex exclusively for this task. * * Deadlocks within a given w/w class of locks are detected and handled with the - * wait/wound algorithm. If the lock isn't immediately avaiable this function + * wait/die algorithm. If the lock isn't immediately available this function * will either sleep until it is (wait case). Or it selects the current context - * for backing off by returning -EDEADLK (wound case). Trying to acquire the + * for backing off by returning -EDEADLK (die case). Trying to acquire the * same lock with the same context twice is also detected and signalled by * returning -EALREADY. Returns 0 if the mutex was successfully acquired. * - * In the wound case the caller must release all currently held w/w mutexes for + * In the die case the caller must release all currently held w/w mutexes for * the given context and then wait for this contending lock to be available by * calling ww_mutex_lock_slow. Alternatively callers can opt to not acquire this * lock and proceed with trying to acquire further w/w mutexes (e.g. when @@ -226,14 +226,14 @@ extern int /* __must_check */ ww_mutex_lock(struct ww_mutex *lock, struct ww_acq * Lock the w/w mutex exclusively for this task. * * Deadlocks within a given w/w class of locks are detected and handled with the - * wait/wound algorithm. If the lock isn't immediately avaiable this function + * wait/die algorithm. If the lock isn't immediately available this function * will either sleep until it is (wait case). Or it selects the current context - * for backing off by returning -EDEADLK (wound case). Trying to acquire the + * for backing off by returning -EDEADLK (die case). Trying to acquire the * same lock with the same context twice is also detected and signalled by * returning -EALREADY. Returns 0 if the mutex was successfully acquired. If a * signal arrives while waiting for the lock then this function returns -EINTR. * - * In the wound case the caller must release all currently held w/w mutexes for + * In the die case the caller must release all currently held w/w mutexes for * the given context and then wait for this contending lock to be available by * calling ww_mutex_lock_slow_interruptible. Alternatively callers can opt to * not acquire this lock and proceed with trying to acquire further w/w mutexes @@ -256,7 +256,7 @@ extern int __must_check ww_mutex_lock_interruptible(struct ww_mutex *lock, * @lock: the mutex to be acquired * @ctx: w/w acquire context * - * Acquires a w/w mutex with the given context after a wound case. This function + * Acquires a w/w mutex with the given context after a die case. This function * will sleep until the lock becomes available. * * The caller must have released all w/w mutexes already acquired with the @@ -290,7 +290,7 @@ ww_mutex_lock_slow(struct ww_mutex *lock, struct ww_acquire_ctx *ctx) * @lock: the mutex to be acquired * @ctx: w/w acquire context * - * Acquires a w/w mutex with the given context after a wound case. 
This function + * Acquires a w/w mutex with the given context after a die case. This function * will sleep until the lock becomes available and returns 0 when the lock has * been acquired. If a signal arrives while waiting for the lock then this * function returns -EINTR. diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c index 2048359f33d2..412b4fc08235 100644 --- a/kernel/locking/mutex.c +++ b/kernel/locking/mutex.c @@ -243,6 +243,17 @@ void __sched mutex_lock(struct mutex *lock) EXPORT_SYMBOL(mutex_lock); #endif
+/* + * Wait-Die: + * The newer transactions are killed when: + * It (the new transaction) makes a request for a lock being held + * by an older transaction. + */ + +/* + * Associate the ww_mutex @ww with the context @ww_ctx under which we acquired + * it. + */ static __always_inline void ww_mutex_lock_acquired(struct ww_mutex *ww, struct ww_acquire_ctx *ww_ctx) { @@ -281,26 +292,53 @@ ww_mutex_lock_acquired(struct ww_mutex *ww, struct ww_acquire_ctx *ww_ctx) DEBUG_LOCKS_WARN_ON(ww_ctx->ww_class != ww->ww_class); #endif ww_ctx->acquired++; + ww->ctx = ww_ctx; }
+/* + * Determine if context @a is 'after' context @b. IOW, @a is a younger + * transaction than @b and depending on algorithm either needs to wait for + * @b or die. + */ static inline bool __sched __ww_ctx_stamp_after(struct ww_acquire_ctx *a, struct ww_acquire_ctx *b) { - return a->stamp - b->stamp <= LONG_MAX && - (a->stamp != b->stamp || a > b); + + return (signed long)(a->stamp - b->stamp) > 0; +} + +/* + * Wait-Die; wake a younger waiter context (when locks held) such that it can + * die. + * + * Among waiters with context, only the first one can have other locks acquired + * already (ctx->acquired > 0), because __ww_mutex_add_waiter() and + * __ww_mutex_check_kill() wake any but the earliest context. + */ +static bool __sched +__ww_mutex_die(struct mutex *lock, struct mutex_waiter *waiter, + struct ww_acquire_ctx *ww_ctx) +{ + if (waiter->ww_ctx->acquired > 0 && + __ww_ctx_stamp_after(waiter->ww_ctx, ww_ctx)) { + debug_mutex_wake_waiter(lock, waiter); + wake_up_process(waiter->task); + } + + return true; }
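As an aside on the new stamp comparison in the hunk above: the subtract-then-cast idiom stays correct across stamp counter wraparound, where a direct a->stamp > b->stamp comparison would not. A small standalone demonstration with hypothetical values:

#include <stdio.h>

int main(void)
{
	/* b stamped just before the counter wrapped, a just after it. */
	unsigned long b = ~0UL;	/* last pre-wrap stamp */
	unsigned long a = 1UL;	/* post-wrap, logically younger */

	/* Direct comparison says a is older than b: wrong. */
	printf("naive  a-after-b: %d\n", a > b);		/* prints 0 */
	/* Subtract and cast preserves the ordering: right. */
	printf("signed a-after-b: %d\n", (long)(a - b) > 0);	/* prints 1 */
	return 0;
}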
/* - * Wake up any waiters that may have to back off when the lock is held by the - * given context. + * We just acquired @lock under @ww_ctx, if there are later contexts waiting + * behind us on the wait-list, check if they need to die. * - * Due to the invariants on the wait list, this can only affect the first - * waiter with a context. + * See __ww_mutex_add_waiter() for the list-order construction; basically the + * list is ordered by stamp, smallest (oldest) first. * * The current task must not be on the wait list. */ static void __sched -__ww_mutex_wakeup_for_backoff(struct mutex *lock, struct ww_acquire_ctx *ww_ctx) +__ww_mutex_check_waiters(struct mutex *lock, struct ww_acquire_ctx *ww_ctx) { struct mutex_waiter *cur;
@@ -310,30 +348,23 @@ __ww_mutex_wakeup_for_backoff(struct mutex *lock, struct ww_acquire_ctx *ww_ctx) if (!cur->ww_ctx) continue;
- if (cur->ww_ctx->acquired > 0 && - __ww_ctx_stamp_after(cur->ww_ctx, ww_ctx)) { - debug_mutex_wake_waiter(lock, cur); - wake_up_process(cur->task); - } - - break; + if (__ww_mutex_die(lock, cur, ww_ctx)) + break; } }
/* - * After acquiring lock with fastpath or when we lost out in contested - * slowpath, set ctx and wake up any waiters so they can recheck. + * After acquiring lock with fastpath, where we do not hold wait_lock, set ctx + * and wake up any waiters so they can recheck. */ static __always_inline void ww_mutex_set_context_fastpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx) { ww_mutex_lock_acquired(lock, ctx);
- lock->ctx = ctx; - /* * The lock->ctx update should be visible on all cores before - * the atomic read is done, otherwise contended waiters might be + * the WAITERS check is done, otherwise contended waiters might be * missed. The contended waiters will either see ww_ctx == NULL * and keep spinning, or it will acquire wait_lock, add itself * to waiter list and sleep. @@ -347,29 +378,14 @@ ww_mutex_set_context_fastpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx) return;
/* - * Uh oh, we raced in fastpath, wake up everyone in this case, - * so they can see the new lock->ctx. + * Uh oh, we raced in fastpath, check if any of the waiters need to + * die. */ spin_lock(&lock->base.wait_lock); - __ww_mutex_wakeup_for_backoff(&lock->base, ctx); + __ww_mutex_check_waiters(&lock->base, ctx); spin_unlock(&lock->base.wait_lock); }
-/* - * After acquiring lock in the slowpath set ctx. - * - * Unlike for the fast path, the caller ensures that waiters are woken up where - * necessary. - * - * Callers must hold the mutex wait_lock. - */ -static __always_inline void -ww_mutex_set_context_slowpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx) -{ - ww_mutex_lock_acquired(lock, ctx); - lock->ctx = ctx; -} - #ifdef CONFIG_MUTEX_SPIN_ON_OWNER
static inline @@ -645,37 +661,73 @@ void __sched ww_mutex_unlock(struct ww_mutex *lock) } EXPORT_SYMBOL(ww_mutex_unlock);
+ +static __always_inline int __sched +__ww_mutex_kill(struct mutex *lock, struct ww_acquire_ctx *ww_ctx) +{ + if (ww_ctx->acquired > 0) { +#ifdef CONFIG_DEBUG_MUTEXES + struct ww_mutex *ww; + + ww = container_of(lock, struct ww_mutex, base); + DEBUG_LOCKS_WARN_ON(ww_ctx->contending_lock); + ww_ctx->contending_lock = ww; +#endif + return -EDEADLK; + } + + return 0; +} + + +/* + * Check whether we need to kill the transaction for the current lock acquire. + * + * Wait-Die: If we're trying to acquire a lock already held by an older + * context, kill ourselves. + * + * Since __ww_mutex_add_waiter() orders the wait-list on stamp, we only have to + * look at waiters before us in the wait-list. + */ static inline int __sched -__ww_mutex_lock_check_stamp(struct mutex *lock, struct mutex_waiter *waiter, - struct ww_acquire_ctx *ctx) +__ww_mutex_check_kill(struct mutex *lock, struct mutex_waiter *waiter, + struct ww_acquire_ctx *ctx) { struct ww_mutex *ww = container_of(lock, struct ww_mutex, base); struct ww_acquire_ctx *hold_ctx = READ_ONCE(ww->ctx); struct mutex_waiter *cur;
+ if (ctx->acquired == 0) + return 0; + if (hold_ctx && __ww_ctx_stamp_after(ctx, hold_ctx)) - goto deadlock; + return __ww_mutex_kill(lock, ctx);
/* * If there is a waiter in front of us that has a context, then its - * stamp is earlier than ours and we must back off. + * stamp is earlier than ours and we must kill ourself. */ cur = waiter; list_for_each_entry_continue_reverse(cur, &lock->wait_list, list) { - if (cur->ww_ctx) - goto deadlock; + if (!cur->ww_ctx) + continue; + + return __ww_mutex_kill(lock, ctx); }
return 0; - -deadlock: -#ifdef CONFIG_DEBUG_MUTEXES - DEBUG_LOCKS_WARN_ON(ctx->contending_lock); - ctx->contending_lock = ww; -#endif - return -EDEADLK; }
+/* + * Add @waiter to the wait-list, keep the wait-list ordered by stamp, smallest + * first. Such that older contexts are preferred to acquire the lock over + * younger contexts. + * + * Waiters without context are interspersed in FIFO order. + * + * Furthermore, for Wait-Die kill ourself immediately when possible (there are + * older contexts already waiting) to avoid unnecessary waiting. + */ static inline int __sched __ww_mutex_add_waiter(struct mutex_waiter *waiter, struct mutex *lock, @@ -692,7 +744,7 @@ __ww_mutex_add_waiter(struct mutex_waiter *waiter, /* * Add the waiter before the first waiter with a higher stamp. * Waiters without a context are skipped to avoid starving - * them. + * them. Wait-Die waiters may die here. */ pos = &lock->wait_list; list_for_each_entry_reverse(cur, &lock->wait_list, list) { @@ -700,34 +752,27 @@ __ww_mutex_add_waiter(struct mutex_waiter *waiter, continue;
if (__ww_ctx_stamp_after(ww_ctx, cur->ww_ctx)) { - /* Back off immediately if necessary. */ - if (ww_ctx->acquired > 0) { -#ifdef CONFIG_DEBUG_MUTEXES - struct ww_mutex *ww; + /* + * Wait-Die: if we find an older context waiting, there + * is no point in queueing behind it, as we'd have to + * die the moment it would acquire the lock. + */ + int ret = __ww_mutex_kill(lock, ww_ctx);
- ww = container_of(lock, struct ww_mutex, base); - DEBUG_LOCKS_WARN_ON(ww_ctx->contending_lock); - ww_ctx->contending_lock = ww; -#endif - return -EDEADLK; - } + if (ret) + return ret;
break; }
pos = &cur->list;
- /* - * Wake up the waiter so that it gets a chance to back - * off. - */ - if (cur->ww_ctx->acquired > 0) { - debug_mutex_wake_waiter(lock, cur); - wake_up_process(cur->task); - } + /* Wait-Die: ensure younger waiters die. */ + __ww_mutex_die(lock, cur, ww_ctx); }
list_add_tail(&waiter->list, pos); + return 0; }
@@ -771,7 +816,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, */ if (__mutex_trylock(lock)) { if (use_ww_ctx && ww_ctx) - __ww_mutex_wakeup_for_backoff(lock, ww_ctx); + __ww_mutex_check_waiters(lock, ww_ctx);
goto skip_wait; } @@ -789,10 +834,13 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, waiter.ww_ctx = MUTEX_POISON_WW_CTX; #endif } else { - /* Add in stamp order, waking up waiters that must back off. */ + /* + * Add in stamp order, waking up waiters that must kill + * themselves. + */ ret = __ww_mutex_add_waiter(&waiter, lock, ww_ctx); if (ret) - goto err_early_backoff; + goto err_early_kill;
waiter.ww_ctx = ww_ctx; } @@ -814,7 +862,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, goto acquired;
/* - * Check for signals and wound conditions while holding + * Check for signals and kill conditions while holding * wait_lock. This ensures the lock cancellation is ordered * against mutex_unlock() and wake-ups do not go missing. */ @@ -823,8 +871,8 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, goto err; }
- if (use_ww_ctx && ww_ctx && ww_ctx->acquired > 0) { - ret = __ww_mutex_lock_check_stamp(lock, &waiter, ww_ctx); + if (use_ww_ctx && ww_ctx) { + ret = __ww_mutex_check_kill(lock, &waiter, ww_ctx); if (ret) goto err; } @@ -869,7 +917,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, lock_acquired(&lock->dep_map, ip);
if (use_ww_ctx && ww_ctx) - ww_mutex_set_context_slowpath(ww, ww_ctx); + ww_mutex_lock_acquired(ww, ww_ctx);
spin_unlock(&lock->wait_lock); preempt_enable(); @@ -878,7 +926,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, err: __set_current_state(TASK_RUNNING); mutex_remove_waiter(lock, &waiter, current); -err_early_backoff: +err_early_kill: spin_unlock(&lock->wait_lock); debug_mutex_free_waiter(&waiter); mutex_release(&lock->dep_map, 1, ip);
On Tue, Jun 19, 2018 at 10:24:43AM +0200, Thomas Hellstrom wrote:
From: Peter Zijlstra <peterz@infradead.org>
Make the WW mutex code more readable by adding comments, splitting up functions and pointing out that we're actually using the Wait-Die algorithm.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Co-authored-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
 Documentation/locking/ww-mutex-design.txt |  12 +-
 include/linux/ww_mutex.h                  |  28 ++--
 kernel/locking/mutex.c                    | 202 ++++++++++++++++++------------
 3 files changed, 145 insertions(+), 97 deletions(-)
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
On 06/19/2018 11:44 AM, Peter Zijlstra wrote:
On Tue, Jun 19, 2018 at 10:24:43AM +0200, Thomas Hellstrom wrote:
From: Peter Zijlstra <peterz@infradead.org>
Make the WW mutex code more readable by adding comments, splitting up functions and pointing out that we're actually using the Wait-Die algorithm.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Co-authored-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
 Documentation/locking/ww-mutex-design.txt |  12 +-
 include/linux/ww_mutex.h                  |  28 ++--
 kernel/locking/mutex.c                    | 202 ++++++++++++++++++------------
 3 files changed, 145 insertions(+), 97 deletions(-)
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Hi Peter,
Do you want to add a SOB, since you're the main author?
Thomas
On Tue, Jun 19, 2018 at 12:44:52PM +0200, Thomas Hellstrom wrote:
On 06/19/2018 11:44 AM, Peter Zijlstra wrote:
On Tue, Jun 19, 2018 at 10:24:43AM +0200, Thomas Hellstrom wrote:
From: Peter Zijlstra <peterz@infradead.org>
Make the WW mutex code more readable by adding comments, splitting up functions and pointing out that we're actually using the Wait-Die algorithm.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Co-authored-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
 Documentation/locking/ww-mutex-design.txt |  12 +-
 include/linux/ww_mutex.h                  |  28 ++--
 kernel/locking/mutex.c                    | 202 ++++++++++++++++++------------
 3 files changed, 145 insertions(+), 97 deletions(-)
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Hi Peter,
Do you want to add a SOB, since you're the main author?
Sure, here goes:
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
The current Wound-Wait mutex algorithm is actually not Wound-Wait but Wait-Die. Also implement Wound-Wait as a per-ww-class choice. Wound-Wait is, contrary to Wait-Die, a preemptive algorithm and is known to generate fewer backoffs. Testing reveals that this is true if the number of simultaneous contending transactions is small. As the number of simultaneous contending threads increases, Wound-Wait becomes inferior to Wait-Die in terms of elapsed time, possibly due to the larger number of locks held by sleeping transactions.
Update documentation and callers.
Timings using git://people.freedesktop.org/~thomash/ww_mutex_test tag patch-18-06-15
Each thread runs 100000 batches; in each batch it locks and then unlocks 800 ww mutexes randomly chosen out of 100000. Four-core Intel x86_64:
Algorithm    #threads   Rollbacks   time
Wound-Wait   4          ~100        ~17s.
Wait-Die     4          ~150000     ~19s.
Wound-Wait   16         ~360000     ~109s.
Wait-Die     16         ~450000     ~82s.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Co-authored-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
v2:
 * Update API according to review comment by Greg Kroah-Hartman.
 * Address review comments by Peter Zijlstra:
   - Avoid _Bool in composites
   - Fix typo
   - Use __mutex_owner() where applicable
   - Rely on built-in barriers for the main loop exit condition,
     struct ww_acquire_ctx::wounded. Update code comments.
   - Explain unlocked use of list_empty().
v3:
 * Adapt to and incorporate cleanup by Peter Zijlstra
 * Remove unlocked use of list_empty().
v4:
 * Move code related to adding a waiter to the lock waiter list to a
   separate function.
---
 Documentation/locking/ww-mutex-design.txt |  57 +++++++++--
 drivers/dma-buf/reservation.c             |   2 +-
 drivers/gpu/drm/drm_modeset_lock.c        |   2 +-
 include/linux/ww_mutex.h                  |  17 ++-
 kernel/locking/locktorture.c              |   2 +-
 kernel/locking/mutex.c                    | 165 +++++++++++++++++++++++++++---
 kernel/locking/test-ww_mutex.c            |   2 +-
 lib/locking-selftest.c                    |   2 +-
 8 files changed, 213 insertions(+), 36 deletions(-)
diff --git a/Documentation/locking/ww-mutex-design.txt b/Documentation/locking/ww-mutex-design.txt index 2fd7f2a2af21..f0ed7c30e695 100644 --- a/Documentation/locking/ww-mutex-design.txt +++ b/Documentation/locking/ww-mutex-design.txt @@ -1,4 +1,4 @@ -Wait/Wound Deadlock-Proof Mutex Design +Wound/Wait Deadlock-Proof Mutex Design ======================================
Please read mutex-design.txt first, as it applies to wait/wound mutexes too. @@ -32,10 +32,26 @@ the oldest task) wins, and the one with the higher reservation id (i.e. the younger task) unlocks all of the buffers that it has already locked, and then tries again.
-In the RDBMS literature this deadlock handling approach is called wait/die: -The older tasks waits until it can acquire the contended lock. The younger tasks -needs to back off and drop all the locks it is currently holding, i.e. the -younger task dies. +In the RDBMS literature, a reservation ticket is associated with a transaction, +and the deadlock handling approach is called Wait-Die. The name is based on +the actions of a locking thread when it encounters an already locked mutex. +If the transaction holding the lock is younger, the locking transaction waits. +If the transaction holding the lock is older, the locking transaction backs off +and dies. Hence Wait-Die. +There is also another algorithm called Wound-Wait: +If the transaction holding the lock is younger, the locking transaction +wounds the transaction holding the lock, requesting it to die. +If the transaction holding the lock is older, it waits for the other +transaction. Hence Wound-Wait. +The two algorithms are both fair in that a transaction will eventually succeed. +However, the Wound-Wait algorithm is typically stated to generate fewer backoffs +compared to Wait-Die, but is, on the other hand, associated with more work than +Wait-Die when recovering from a backoff. Wound-Wait is also a preemptive +algorithm in that transactions are wounded by other transactions, and that +requires a reliable way to pick up the wounded condition and preempt the +running transaction. Note that this is not the same as process preemption. A +Wound-Wait transaction is considered preempted when it dies (returning +-EDEADLK) following a wound.
Concepts -------- @@ -47,10 +63,12 @@ Acquire context: To ensure eventual forward progress it is important that a task trying to acquire locks doesn't grab a new reservation id, but keeps the one it acquired when starting the lock acquisition. This ticket is stored in the acquire context. Furthermore the acquire context keeps track of debugging state -to catch w/w mutex interface abuse. +to catch w/w mutex interface abuse. An acquire context is representing a +transaction.
W/w class: In contrast to normal mutexes the lock class needs to be explicit for -w/w mutexes, since it is required to initialize the acquire context. +w/w mutexes, since it is required to initialize the acquire context. The lock +class also specifies what algorithm to use, Wound-Wait or Wait-Die.
Furthermore there are three different class of w/w lock acquire functions:
@@ -90,6 +108,12 @@ provided. Usage -----
+The algorithm (Wait-Die vs Wound-Wait) is chosen by using either +DEFINE_WW_CLASS() (Wound-Wait) or DEFINE_WD_CLASS() (Wait-Die). +As a rough rule of thumb, use Wound-Wait iff you +expect the number of simultaneous competing transactions to be typically small, +and you want to reduce the number of rollbacks. + Three different ways to acquire locks within the same w/w class. Common definitions for methods #1 and #2:
@@ -312,12 +336,23 @@ Design: We maintain the following invariants for the wait list: (1) Waiters with an acquire context are sorted by stamp order; waiters without an acquire context are interspersed in FIFO order. - (2) Among waiters with contexts, only the first one can have other locks - acquired already (ctx->acquired > 0). Note that this waiter may come - after other waiters without contexts in the list. + (2) For Wait-Die, among waiters with contexts, only the first one can have + other locks acquired already (ctx->acquired > 0). Note that this waiter + may come after other waiters without contexts in the list. + + The Wound-Wait preemption is implemented with a lazy-preemption scheme: + The wounded status of the transaction is checked only when there is + contention for a new lock and hence a true chance of deadlock. In that + situation, if the transaction is wounded, it backs off, clears the + wounded status and retries. A great benefit of implementing preemption in + this way is that the wounded transaction can identify a contending lock to + wait for before restarting the transaction. Just blindly restarting the + transaction would likely make the transaction end up in a situation where + it would have to back off again.
In general, not much contention is expected. The locks are typically used to - serialize access to resources for devices. + serialize access to resources for devices, and optimization focus should + therefore be directed towards the uncontended cases.
Lockdep: Special care has been taken to warn for as many cases of api abuse diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c index 314eb1071cce..20bf90f4ee63 100644 --- a/drivers/dma-buf/reservation.c +++ b/drivers/dma-buf/reservation.c @@ -46,7 +46,7 @@ * write-side updates. */
-DEFINE_WW_CLASS(reservation_ww_class); +DEFINE_WD_CLASS(reservation_ww_class); EXPORT_SYMBOL(reservation_ww_class);
struct lock_class_key reservation_seqcount_class; diff --git a/drivers/gpu/drm/drm_modeset_lock.c b/drivers/gpu/drm/drm_modeset_lock.c index 8a5100685875..638be2eb67b4 100644 --- a/drivers/gpu/drm/drm_modeset_lock.c +++ b/drivers/gpu/drm/drm_modeset_lock.c @@ -70,7 +70,7 @@ * lists and lookup data structures. */
-static DEFINE_WW_CLASS(crtc_ww_class); +static DEFINE_WD_CLASS(crtc_ww_class);
/** * drm_modeset_lock_all - take all modeset locks diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h index f82fce2229c8..3af7c0e03be5 100644 --- a/include/linux/ww_mutex.h +++ b/include/linux/ww_mutex.h @@ -8,6 +8,8 @@ * * Wait/Die implementation: * Copyright (C) 2013 Canonical Ltd. + * Choice of algorithm: + * Copyright (C) 2018 VMware Inc. * * This file contains the main data structure and API definitions. */ @@ -23,12 +25,15 @@ struct ww_class { struct lock_class_key mutex_key; const char *acquire_name; const char *mutex_name; + unsigned int is_wait_die; };
struct ww_acquire_ctx { struct task_struct *task; unsigned long stamp; unsigned int acquired; + unsigned short wounded; + unsigned short is_wait_die; #ifdef CONFIG_DEBUG_MUTEXES unsigned int done_acquire; struct ww_class *ww_class; @@ -58,17 +63,21 @@ struct ww_mutex { # define __WW_CLASS_MUTEX_INITIALIZER(lockname, class) #endif
-#define __WW_CLASS_INITIALIZER(ww_class) \ +#define __WW_CLASS_INITIALIZER(ww_class, _is_wait_die) \ { .stamp = ATOMIC_LONG_INIT(0) \ , .acquire_name = #ww_class "_acquire" \ - , .mutex_name = #ww_class "_mutex" } + , .mutex_name = #ww_class "_mutex" \ + , .is_wait_die = _is_wait_die }
#define __WW_MUTEX_INITIALIZER(lockname, class) \ { .base = __MUTEX_INITIALIZER(lockname.base) \ __WW_CLASS_MUTEX_INITIALIZER(lockname, class) }
+#define DEFINE_WD_CLASS(classname) \ + struct ww_class classname = __WW_CLASS_INITIALIZER(classname, 1) + #define DEFINE_WW_CLASS(classname) \ - struct ww_class classname = __WW_CLASS_INITIALIZER(classname) + struct ww_class classname = __WW_CLASS_INITIALIZER(classname, 0)
#define DEFINE_WW_MUTEX(mutexname, ww_class) \ struct ww_mutex mutexname = __WW_MUTEX_INITIALIZER(mutexname, ww_class) @@ -123,6 +132,8 @@ static inline void ww_acquire_init(struct ww_acquire_ctx *ctx, ctx->task = current; ctx->stamp = atomic_long_inc_return_relaxed(&ww_class->stamp); ctx->acquired = 0; + ctx->wounded = false; + ctx->is_wait_die = ww_class->is_wait_die; #ifdef CONFIG_DEBUG_MUTEXES ctx->ww_class = ww_class; ctx->done_acquire = 0; diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c index 6850ffd69125..907e0325892c 100644 --- a/kernel/locking/locktorture.c +++ b/kernel/locking/locktorture.c @@ -365,7 +365,7 @@ static struct lock_torture_ops mutex_lock_ops = { };
#include <linux/ww_mutex.h> -static DEFINE_WW_CLASS(torture_ww_class); +static DEFINE_WD_CLASS(torture_ww_class); static DEFINE_WW_MUTEX(torture_ww_mutex_0, &torture_ww_class); static DEFINE_WW_MUTEX(torture_ww_mutex_1, &torture_ww_class); static DEFINE_WW_MUTEX(torture_ww_mutex_2, &torture_ww_class); diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c index 412b4fc08235..8ca83a5e3d24 100644 --- a/kernel/locking/mutex.c +++ b/kernel/locking/mutex.c @@ -172,6 +172,21 @@ static inline bool __mutex_waiter_is_first(struct mutex *lock, struct mutex_wait return list_first_entry(&lock->wait_list, struct mutex_waiter, list) == waiter; }
+/* + * Add @waiter to a given location in the lock wait_list and set the + * FLAG_WAITERS flag if it's the first waiter. + */ +static void __sched +__mutex_add_waiter(struct mutex *lock, struct mutex_waiter *waiter, + struct list_head *list) +{ + debug_mutex_add_waiter(lock, waiter, current); + + list_add_tail(&waiter->list, list); + if (__mutex_waiter_is_first(lock, waiter)) + __mutex_set_flag(lock, MUTEX_FLAG_WAITERS); +} + /* * Give up ownership to a specific task, when @task = NULL, this is equivalent * to a regular unlock. Sets PICKUP on a handoff, clears HANDOF, preserves @@ -248,6 +263,11 @@ EXPORT_SYMBOL(mutex_lock); * The newer transactions are killed when: * It (the new transaction) makes a request for a lock being held * by an older transaction. + * + * Wound-Wait: + * The newer transactions are wounded when: + * An older transaction makes a request for a lock being held by + * the newer transaction. */
/* @@ -319,6 +339,9 @@ static bool __sched __ww_mutex_die(struct mutex *lock, struct mutex_waiter *waiter, struct ww_acquire_ctx *ww_ctx) { + if (!ww_ctx->is_wait_die) + return false; + if (waiter->ww_ctx->acquired > 0 && __ww_ctx_stamp_after(waiter->ww_ctx, ww_ctx)) { debug_mutex_wake_waiter(lock, waiter); @@ -328,13 +351,65 @@ __ww_mutex_die(struct mutex *lock, struct mutex_waiter *waiter, return true; }
+/* + * Wound-Wait; wound a younger @hold_ctx if it holds the lock. + * + * Wound the lock holder if there are waiters with older transactions than + * the lock holders. Even if multiple waiters may wound the lock holder, + * it's sufficient that only one does. + */ +static bool __ww_mutex_wound(struct mutex *lock, + struct ww_acquire_ctx *ww_ctx, + struct ww_acquire_ctx *hold_ctx) +{ + struct task_struct *owner = __mutex_owner(lock); + + lockdep_assert_held(&lock->wait_lock); + + /* + * Possible through __ww_mutex_add_waiter() when we race with + * ww_mutex_set_context_fastpath(). In that case we'll get here again + * through __ww_mutex_check_waiters(). + */ + if (!hold_ctx) + return false; + + /* + * Can have !owner because of __mutex_unlock_slowpath(), but if owner, + * it cannot go away because we'll have FLAG_WAITERS set and hold + * wait_lock. + */ + if (!owner) + return false; + + if (ww_ctx->acquired > 0 && __ww_ctx_stamp_after(hold_ctx, ww_ctx)) { + hold_ctx->wounded = 1; + + /* + * wake_up_process() paired with set_current_state() + * inserts sufficient barriers to make sure @owner either sees + * it's wounded in __ww_mutex_lock_check_stamp() or has a + * wakeup pending to re-read the wounded state. + */ + if (owner != current) + wake_up_process(owner); + + return true; + } + + return false; +} + /* * We just acquired @lock under @ww_ctx, if there are later contexts waiting - * behind us on the wait-list, check if they need to die. + * behind us on the wait-list, check if they need to die, or wound us. * * See __ww_mutex_add_waiter() for the list-order construction; basically the * list is ordered by stamp, smallest (oldest) first. * + * This relies on never mixing wait-die/wound-wait on the same wait-list; + * which is currently ensured by that being a ww_class property. + * * The current task must not be on the wait list. */ static void __sched @@ -348,7 +423,8 @@ __ww_mutex_check_waiters(struct mutex *lock, struct ww_acquire_ctx *ww_ctx) if (!cur->ww_ctx) continue;
- if (__ww_mutex_die(lock, cur, ww_ctx)) + if (__ww_mutex_die(lock, cur, ww_ctx) || + __ww_mutex_wound(lock, cur->ww_ctx, ww_ctx)) break; } } @@ -369,17 +445,23 @@ ww_mutex_set_context_fastpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx) * and keep spinning, or it will acquire wait_lock, add itself * to waiter list and sleep. */ - smp_mb(); /* ^^^ */ + smp_mb(); /* See comments above and below. */
/* - * Check if lock is contended, if not there is nobody to wake up + * [W] ww->ctx = ctx [W] MUTEX_FLAG_WAITERS + * MB MB + * [R] MUTEX_FLAG_WAITERS [R] ww->ctx + * + * The memory barrier above pairs with the memory barrier in + * __ww_mutex_add_waiter() and makes sure we either observe ww->ctx + * and/or !empty list. */ if (likely(!(atomic_long_read(&lock->base.owner) & MUTEX_FLAG_WAITERS))) return;
/* * Uh oh, we raced in fastpath, check if any of the waiters need to - * die. + * die or wound us. */ spin_lock(&lock->base.wait_lock); __ww_mutex_check_waiters(&lock->base, ctx); @@ -681,7 +763,9 @@ __ww_mutex_kill(struct mutex *lock, struct ww_acquire_ctx *ww_ctx)
/* - * Check whether we need to kill the transaction for the current lock acquire. + * Check the wound condition for the current lock acquire. + * + * Wound-Wait: If we're wounded, kill ourself. * * Wait-Die: If we're trying to acquire a lock already held by an older * context, kill ourselves. @@ -700,6 +784,13 @@ __ww_mutex_check_kill(struct mutex *lock, struct mutex_waiter *waiter, if (ctx->acquired == 0) return 0;
+ if (!ctx->is_wait_die) { + if (ctx->wounded) + return __ww_mutex_kill(lock, ctx); + + return 0; + } + if (hold_ctx && __ww_ctx_stamp_after(ctx, hold_ctx)) return __ww_mutex_kill(lock, ctx);
@@ -726,7 +817,8 @@ __ww_mutex_check_kill(struct mutex *lock, struct mutex_waiter *waiter, * Waiters without context are interspersed in FIFO order. * * Furthermore, for Wait-Die kill ourself immediately when possible (there are - * older contexts already waiting) to avoid unnecessary waiting. + * older contexts already waiting) to avoid unnecessary waiting and for + * Wound-Wait ensure we wound the owning context when it is younger. */ static inline int __sched __ww_mutex_add_waiter(struct mutex_waiter *waiter, @@ -735,16 +827,21 @@ __ww_mutex_add_waiter(struct mutex_waiter *waiter, { struct mutex_waiter *cur; struct list_head *pos; + bool is_wait_die;
if (!ww_ctx) { - list_add_tail(&waiter->list, &lock->wait_list); + __mutex_add_waiter(lock, waiter, &lock->wait_list); return 0; }
+ is_wait_die = ww_ctx->is_wait_die; + /* * Add the waiter before the first waiter with a higher stamp. * Waiters without a context are skipped to avoid starving - * them. Wait-Die waiters may die here. + * them. Wait-Die waiters may die here. Wound-Wait waiters + * never die here, but they are sorted in stamp order and + * may wound the lock holder. */ pos = &lock->wait_list; list_for_each_entry_reverse(cur, &lock->wait_list, list) { @@ -757,10 +854,12 @@ __ww_mutex_add_waiter(struct mutex_waiter *waiter, * is no point in queueing behind it, as we'd have to * die the moment it would acquire the lock. */ - int ret = __ww_mutex_kill(lock, ww_ctx); + if (is_wait_die) { + int ret = __ww_mutex_kill(lock, ww_ctx);
- if (ret) - return ret; + if (ret) + return ret; + }
break; } @@ -771,7 +870,23 @@ __ww_mutex_add_waiter(struct mutex_waiter *waiter, __ww_mutex_die(lock, cur, ww_ctx); }
- list_add_tail(&waiter->list, pos); + __mutex_add_waiter(lock, waiter, pos); + + /* + * Wound-Wait: if we're blocking on a mutex owned by a younger context, + * wound that such that we might proceed. + */ + if (!is_wait_die) { + struct ww_mutex *ww = container_of(lock, struct ww_mutex, base); + + /* + * See ww_mutex_set_context_fastpath(). Orders setting + * MUTEX_FLAG_WAITERS vs the ww->ctx load, + * such that either we or the fastpath will wound @ww->ctx. + */ + smp_mb(); + __ww_mutex_wound(lock, ww_ctx, ww->ctx); + }
return 0; } @@ -795,6 +910,14 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, if (use_ww_ctx && ww_ctx) { if (unlikely(ww_ctx == READ_ONCE(ww->ctx))) return -EALREADY; + + /* + * Reset the wounded flag after a kill. No other process can + * race and wound us here since they can't have a valid owner + * pointer if we don't have any locks held. + */ + if (ww_ctx->acquired == 0) + ww_ctx->wounded = 0; }
preempt_disable(); @@ -828,7 +951,8 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
if (!use_ww_ctx) { /* add waiting tasks to the end of the waitqueue (FIFO): */ - list_add_tail(&waiter.list, &lock->wait_list); + __mutex_add_waiter(lock, &waiter, &lock->wait_list); +
#ifdef CONFIG_DEBUG_MUTEXES waiter.ww_ctx = MUTEX_POISON_WW_CTX; @@ -847,9 +971,6 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
waiter.task = current;
- if (__mutex_waiter_is_first(lock, &waiter)) - __mutex_set_flag(lock, MUTEX_FLAG_WAITERS); - set_current_state(state); for (;;) { /* @@ -906,6 +1027,16 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, acquired: __set_current_state(TASK_RUNNING);
+ if (use_ww_ctx && ww_ctx) { + /* + * Wound-Wait; we stole the lock (!first_waiter), check the + * waiters as anyone might want to wound us. + */ + if (!ww_ctx->is_wait_die && + !__mutex_waiter_is_first(lock, &waiter)) + __ww_mutex_check_waiters(lock, ww_ctx); + } + mutex_remove_waiter(lock, &waiter, current); if (likely(list_empty(&lock->wait_list))) __mutex_clear_flag(lock, MUTEX_FLAGS); diff --git a/kernel/locking/test-ww_mutex.c b/kernel/locking/test-ww_mutex.c index 0e4cd64ad2c0..5b915b370d5a 100644 --- a/kernel/locking/test-ww_mutex.c +++ b/kernel/locking/test-ww_mutex.c @@ -26,7 +26,7 @@ #include <linux/slab.h> #include <linux/ww_mutex.h>
-static DEFINE_WW_CLASS(ww_class); +static DEFINE_WD_CLASS(ww_class); struct workqueue_struct *wq;
struct test_mutex { diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c index b5c1293ce147..1e1bbf171eca 100644 --- a/lib/locking-selftest.c +++ b/lib/locking-selftest.c @@ -29,7 +29,7 @@ */ static unsigned int debug_locks_verbose;
-static DEFINE_WW_CLASS(ww_lockdep); +static DEFINE_WD_CLASS(ww_lockdep);
static int __init setup_debug_locks_verbose(char *str) {
On Tue, Jun 19, 2018 at 10:24:44AM +0200, Thomas Hellstrom wrote:
The current Wound-Wait mutex algorithm is actually not Wound-Wait but Wait-Die. Also implement Wound-Wait as a per-ww-class choice. Wound-Wait is, contrary to Wait-Die, a preemptive algorithm and is known to generate fewer backoffs. Testing reveals that this is true if the number of simultaneous contending transactions is small. As the number of simultaneous contending threads increases, Wound-Wait becomes inferior to Wait-Die in terms of elapsed time, possibly due to the larger number of locks held by sleeping transactions.
Update documentation and callers.
Timings using git://people.freedesktop.org/~thomash/ww_mutex_test tag patch-18-06-15
Each thread runs 100000 batches; in each batch it locks and then unlocks 800 ww mutexes randomly chosen out of 100000. Four-core Intel x86_64:
Algorithm    #threads   Rollbacks   time
Wound-Wait   4          ~100        ~17s.
Wait-Die     4          ~150000     ~19s.
Wound-Wait   16         ~360000     ~109s.
Wait-Die     16         ~450000     ~82s.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Co-authored-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
 Documentation/locking/ww-mutex-design.txt |  57 +++++++++--
 drivers/dma-buf/reservation.c             |   2 +-
 drivers/gpu/drm/drm_modeset_lock.c        |   2 +-
 include/linux/ww_mutex.h                  |  17 ++-
 kernel/locking/locktorture.c              |   2 +-
 kernel/locking/mutex.c                    | 165 +++++++++++++++++++++++++++---
 kernel/locking/test-ww_mutex.c            |   2 +-
 lib/locking-selftest.c                    |   2 +-
 8 files changed, 213 insertions(+), 36 deletions(-)
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
So, I guess this is to do w/ the magic of merge commits, but it looks like the hunk changing the crtc_ww_class got lost:
~/src/linux master git show --pretty=short 08295b3b5beec9aac0f7a9db86f0fc3792039da3 drivers/gpu/drm/drm_modeset_lock.c
commit 08295b3b5beec9aac0f7a9db86f0fc3792039da3
Author: Thomas Hellstrom <thellstrom@vmware.com>
locking: Implement an algorithm choice for Wound-Wait mutexes
diff --git a/drivers/gpu/drm/drm_modeset_lock.c b/drivers/gpu/drm/drm_modeset_lock.c index 8a5100685875..638be2eb67b4 100644 --- a/drivers/gpu/drm/drm_modeset_lock.c +++ b/drivers/gpu/drm/drm_modeset_lock.c @@ -70,7 +70,7 @@ * lists and lookup data structures. */
-static DEFINE_WW_CLASS(crtc_ww_class); +static DEFINE_WD_CLASS(crtc_ww_class);
 /**
  * drm_modeset_lock_all - take all modeset locks
~/src/linux master git log --pretty=format:%H drivers/gpu/drm/drm_modeset_lock.c | grep 08295b3b5beec9aac0f7a9db86f0fc3792039da3
~/src/linux master 1
BR, -R
On Tue, Jun 19, 2018 at 4:29 AM Thomas Hellstrom thellstrom@vmware.com wrote:
The current Wound-Wait mutex algorithm is actually not Wound-Wait but Wait-Die. Implement also Wound-Wait as a per-ww-class choice. Wound-Wait is, contrary to Wait-Die a preemptive algorithm and is known to generate fewer backoffs. Testing reveals that this is true if the number of simultaneous contending transactions is small. As the number of simultaneous contending threads increases, Wait-Wound becomes inferior to Wait-Die in terms of elapsed time. Possibly due to the larger number of held locks of sleeping transactions.
Update documentation and callers.
Timings using git://people.freedesktop.org/~thomash/ww_mutex_test tag patch-18-06-15
Each thread runs 100000 batches of lock / unlock 800 ww mutexes randomly chosen out of 100000. Four core Intel x86_64:
Algorithm #threads Rollbacks time Wound-Wait 4 ~100 ~17s. Wait-Die 4 ~150000 ~19s. Wound-Wait 16 ~360000 ~109s. Wait-Die 16 ~450000 ~82s.
Cc: Ingo Molnar mingo@redhat.com Cc: Jonathan Corbet corbet@lwn.net Cc: Gustavo Padovan gustavo@padovan.org Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Sean Paul seanpaul@chromium.org Cc: David Airlie airlied@linux.ie Cc: Davidlohr Bueso dave@stgolabs.net Cc: "Paul E. McKenney" paulmck@linux.vnet.ibm.com Cc: Josh Triplett josh@joshtriplett.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Kate Stewart kstewart@linuxfoundation.org Cc: Philippe Ombredanne pombredanne@nexb.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: linux-doc@vger.kernel.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Co-authored-by: Peter Zijlstra peterz@infradead.org Signed-off-by: Thomas Hellstrom thellstrom@vmware.com
v2:
- Update API according to review comment by Greg Kroah-Hartman.
- Address review comments by Peter Zijlstra:
- Avoid _Bool in composites
- Fix typo
- Use __mutex_owner() where applicable
- Rely on built-in barriers for the main loop exit condition, struct ww_acquire_ctx::wounded. Update code comments.
- Explain unlocked use of list_empty().
v3:
- Adapt to and incorporate cleanup by Peter Zijlstra
- Remove unlocked use of list_empty().
v4:
- Move code related to adding a waiter to the lock waiter list to a separate function.
Documentation/locking/ww-mutex-design.txt | 57 +++++++++-- drivers/dma-buf/reservation.c | 2 +- drivers/gpu/drm/drm_modeset_lock.c | 2 +- include/linux/ww_mutex.h | 17 ++- kernel/locking/locktorture.c | 2 +- kernel/locking/mutex.c | 165 +++++++++++++++++++++++++++--- kernel/locking/test-ww_mutex.c | 2 +- lib/locking-selftest.c | 2 +- 8 files changed, 213 insertions(+), 36 deletions(-)
diff --git a/Documentation/locking/ww-mutex-design.txt b/Documentation/locking/ww-mutex-design.txt index 2fd7f2a2af21..f0ed7c30e695 100644 --- a/Documentation/locking/ww-mutex-design.txt +++ b/Documentation/locking/ww-mutex-design.txt @@ -1,4 +1,4 @@ -Wait/Wound Deadlock-Proof Mutex Design
+Wound/Wait Deadlock-Proof Mutex Design
Please read mutex-design.txt first, as it applies to wait/wound mutexes too. @@ -32,10 +32,26 @@ the oldest task) wins, and the one with the higher reservation id (i.e. the younger task) unlocks all of the buffers that it has already locked, and then tries again.
-In the RDBMS literature this deadlock handling approach is called wait/die: -The older tasks waits until it can acquire the contended lock. The younger tasks -needs to back off and drop all the locks it is currently holding, i.e. the -younger task dies. +In the RDBMS literature, a reservation ticket is associated with a transaction. +and the deadlock handling approach is called Wait-Die. The name is based on +the actions of a locking thread when it encounters an already locked mutex. +If the transaction holding the lock is younger, the locking transaction waits. +If the transaction holding the lock is older, the locking transaction backs off +and dies. Hence Wait-Die. +There is also another algorithm called Wound-Wait: +If the transaction holding the lock is younger, the locking transaction +wounds the transaction holding the lock, requesting it to die. +If the transaction holding the lock is older, it waits for the other +transaction. Hence Wound-Wait. +The two algorithms are both fair in that a transaction will eventually succeed. +However, the Wound-Wait algorithm is typically stated to generate fewer backoffs +compared to Wait-Die, but is, on the other hand, associated with more work than +Wait-Die when recovering from a backoff. Wound-Wait is also a preemptive +algorithm in that transactions are wounded by other transactions, and that +requires a reliable way to pick up up the wounded condition and preempt the +running transaction. Note that this is not the same as process preemption. A +Wound-Wait transaction is considered preempted when it dies (returning +-EDEADLK) following a wound.
Concepts
@@ -47,10 +63,12 @@ Acquire context: To ensure eventual forward progress it is important the a task trying to acquire locks doesn't grab a new reservation id, but keeps the one it acquired when starting the lock acquisition. This ticket is stored in the acquire context. Furthermore the acquire context keeps track of debugging state -to catch w/w mutex interface abuse. +to catch w/w mutex interface abuse. An acquire context is representing a +transaction.
W/w class: In contrast to normal mutexes the lock class needs to be explicit for -w/w mutexes, since it is required to initialize the acquire context. +w/w mutexes, since it is required to initialize the acquire context. The lock +class also specifies what algorithm to use, Wound-Wait or Wait-Die.
Furthermore there are three different class of w/w lock acquire functions:
@@ -90,6 +108,12 @@ provided. Usage
+The algorithm (Wait-Die vs Wound-Wait) is chosen by using either +DEFINE_WW_CLASS() (Wound-Wait) or DEFINE_WD_CLASS() (Wait-Die) +As a rough rule of thumb, use Wound-Wait iff you +expect the number of simultaneous competing transactions to be typically small, +and you want to reduce the number of rollbacks.
Three different ways to acquire locks within the same w/w class. Common definitions for methods #1 and #2:
@@ -312,12 +336,23 @@ Design: We maintain the following invariants for the wait list: (1) Waiters with an acquire context are sorted by stamp order; waiters without an acquire context are interspersed in FIFO order.
- (2) Among waiters with contexts, only the first one can have other locks
acquired already (ctx->acquired > 0). Note that this waiter may come
after other waiters without contexts in the list.
(2) For Wait-Die, among waiters with contexts, only the first one can have
other locks acquired already (ctx->acquired > 0). Note that this waiter
may come after other waiters without contexts in the list.
The Wound-Wait preemption is implemented with a lazy-preemption scheme:
The wounded status of the transaction is checked only when there is
contention for a new lock and hence a true chance of deadlock. In that
situation, if the transaction is wounded, it backs off, clears the
wounded status and retries. A great benefit of implementing preemption in
this way is that the wounded transaction can identify a contending lock to
wait for before restarting the transaction. Just blindly restarting the
transaction would likely make the transaction end up in a situation where
it would have to back off again.
In general, not much contention is expected. The locks are typically used to
- serialize access to resources for devices.
- serialize access to resources for devices, and optimization focus should
- therefore be directed towards the uncontended cases.
Lockdep: Special care has been taken to warn for as many cases of api abuse diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c index 314eb1071cce..20bf90f4ee63 100644 --- a/drivers/dma-buf/reservation.c +++ b/drivers/dma-buf/reservation.c @@ -46,7 +46,7 @@
- write-side updates.
*/
-DEFINE_WW_CLASS(reservation_ww_class); +DEFINE_WD_CLASS(reservation_ww_class); EXPORT_SYMBOL(reservation_ww_class);
struct lock_class_key reservation_seqcount_class; diff --git a/drivers/gpu/drm/drm_modeset_lock.c b/drivers/gpu/drm/drm_modeset_lock.c index 8a5100685875..638be2eb67b4 100644 --- a/drivers/gpu/drm/drm_modeset_lock.c +++ b/drivers/gpu/drm/drm_modeset_lock.c @@ -70,7 +70,7 @@
- lists and lookup data structures.
*/
-static DEFINE_WW_CLASS(crtc_ww_class); +static DEFINE_WD_CLASS(crtc_ww_class);
/**
- drm_modeset_lock_all - take all modeset locks
diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h index f82fce2229c8..3af7c0e03be5 100644 --- a/include/linux/ww_mutex.h +++ b/include/linux/ww_mutex.h @@ -8,6 +8,8 @@
  * Wait/Die implementation:
  *  Copyright (C) 2013 Canonical Ltd.
+ * Choice of algorithm:
+ *  Copyright (C) 2018 VMware Inc.
  *
  * This file contains the main data structure and API definitions.
  */
@@ -23,12 +25,15 @@ struct ww_class { struct lock_class_key mutex_key; const char *acquire_name; const char *mutex_name;
unsigned int is_wait_die;
};
struct ww_acquire_ctx { struct task_struct *task; unsigned long stamp; unsigned int acquired;
unsigned short wounded;
unsigned short is_wait_die;
#ifdef CONFIG_DEBUG_MUTEXES unsigned int done_acquire; struct ww_class *ww_class; @@ -58,17 +63,21 @@ struct ww_mutex { # define __WW_CLASS_MUTEX_INITIALIZER(lockname, class) #endif
-#define __WW_CLASS_INITIALIZER(ww_class) \
+#define __WW_CLASS_INITIALIZER(ww_class, _is_wait_die) \
 	{ .stamp = ATOMIC_LONG_INIT(0) \
 	, .acquire_name = #ww_class "_acquire" \
-	, .mutex_name = #ww_class "_mutex" }
+	, .mutex_name = #ww_class "_mutex" \
+	, .is_wait_die = _is_wait_die }

 #define __WW_MUTEX_INITIALIZER(lockname, class) \
 	{ .base = __MUTEX_INITIALIZER(lockname.base) \
 	__WW_CLASS_MUTEX_INITIALIZER(lockname, class) }

+#define DEFINE_WD_CLASS(classname) \
+	struct ww_class classname = __WW_CLASS_INITIALIZER(classname, 1)
+
 #define DEFINE_WW_CLASS(classname) \
-	struct ww_class classname = __WW_CLASS_INITIALIZER(classname)
+	struct ww_class classname = __WW_CLASS_INITIALIZER(classname, 0)
#define DEFINE_WW_MUTEX(mutexname, ww_class) \ struct ww_mutex mutexname = __WW_MUTEX_INITIALIZER(mutexname, ww_class) @@ -123,6 +132,8 @@ static inline void ww_acquire_init(struct ww_acquire_ctx *ctx, ctx->task = current; ctx->stamp = atomic_long_inc_return_relaxed(&ww_class->stamp); ctx->acquired = 0;
ctx->wounded = false;
ctx->is_wait_die = ww_class->is_wait_die;
#ifdef CONFIG_DEBUG_MUTEXES ctx->ww_class = ww_class; ctx->done_acquire = 0; diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c index 6850ffd69125..907e0325892c 100644 --- a/kernel/locking/locktorture.c +++ b/kernel/locking/locktorture.c @@ -365,7 +365,7 @@ static struct lock_torture_ops mutex_lock_ops = { };
#include <linux/ww_mutex.h> -static DEFINE_WW_CLASS(torture_ww_class); +static DEFINE_WD_CLASS(torture_ww_class); static DEFINE_WW_MUTEX(torture_ww_mutex_0, &torture_ww_class); static DEFINE_WW_MUTEX(torture_ww_mutex_1, &torture_ww_class); static DEFINE_WW_MUTEX(torture_ww_mutex_2, &torture_ww_class); diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c index 412b4fc08235..8ca83a5e3d24 100644 --- a/kernel/locking/mutex.c +++ b/kernel/locking/mutex.c @@ -172,6 +172,21 @@ static inline bool __mutex_waiter_is_first(struct mutex *lock, struct mutex_wait return list_first_entry(&lock->wait_list, struct mutex_waiter, list) == waiter; }
+/*
+ * Add @waiter to a given location in the lock wait_list and set the
+ * FLAG_WAITERS flag if it's the first waiter.
+ */
+static void __sched +__mutex_add_waiter(struct mutex *lock, struct mutex_waiter *waiter,
struct list_head *list)
+{
debug_mutex_add_waiter(lock, waiter, current);
list_add_tail(&waiter->list, list);
if (__mutex_waiter_is_first(lock, waiter))
__mutex_set_flag(lock, MUTEX_FLAG_WAITERS);
+}
/*
- Give up ownership to a specific task, when @task = NULL, this is equivalent
- to a regular unlock. Sets PICKUP on a handoff, clears HANDOF, preserves
@@ -248,6 +263,11 @@ EXPORT_SYMBOL(mutex_lock);
  *   The newer transactions are killed when:
  *     It (the new transaction) makes a request for a lock being held
  *     by an older transaction.
+ *
+ * Wound-Wait:
+ *   The newer transactions are wounded when:
+ *     An older transaction makes a request for a lock being held by
+ *     the newer transaction.
  */

@@ -319,6 +339,9 @@ static bool __sched __ww_mutex_die(struct mutex *lock, struct mutex_waiter *waiter, struct ww_acquire_ctx *ww_ctx) {
if (!ww_ctx->is_wait_die)
return false;
if (waiter->ww_ctx->acquired > 0 && __ww_ctx_stamp_after(waiter->ww_ctx, ww_ctx)) { debug_mutex_wake_waiter(lock, waiter);
@@ -328,13 +351,65 @@ __ww_mutex_die(struct mutex *lock, struct mutex_waiter *waiter, return true; }
+/*
+ * Wound-Wait; wound a younger @hold_ctx if it holds the lock.
+ *
+ * Wound the lock holder if there are waiters with older transactions than
+ * the lock holders. Even if multiple waiters may wound the lock holder,
+ * it's sufficient that only one does.
+ */
+static bool __ww_mutex_wound(struct mutex *lock,
+			     struct ww_acquire_ctx *ww_ctx,
+			     struct ww_acquire_ctx *hold_ctx)
+{
struct task_struct *owner = __mutex_owner(lock);
lockdep_assert_held(&lock->wait_lock);
/*
* Possible through __ww_mutex_add_waiter() when we race with
* ww_mutex_set_context_fastpath(). In that case we'll get here again
* through __ww_mutex_check_waiters().
*/
if (!hold_ctx)
return false;
/*
* Can have !owner because of __mutex_unlock_slowpath(), but if owner,
* it cannot go away because we'll have FLAG_WAITERS set and hold
* wait_lock.
*/
if (!owner)
return false;
if (ww_ctx->acquired > 0 && __ww_ctx_stamp_after(hold_ctx, ww_ctx)) {
hold_ctx->wounded = 1;
/*
* wake_up_process() paired with set_current_state()
* inserts sufficient barriers to make sure @owner either sees
* it's wounded in __ww_mutex_lock_check_stamp() or has a
* wakeup pending to re-read the wounded state.
*/
if (owner != current)
wake_up_process(owner);
return true;
}
return false;
+}
 /*
  * We just acquired @lock under @ww_ctx, if there are later contexts waiting
- * behind us on the wait-list, check if they need to die.
+ * behind us on the wait-list, check if they need to die, or wound us.
  *
  * See __ww_mutex_add_waiter() for the list-order construction; basically the
  * list is ordered by stamp, smallest (oldest) first.
+ *
+ * This relies on never mixing wait-die/wound-wait on the same wait-list;
+ * which is currently ensured by that being a ww_class property.
  *
  * The current task must not be on the wait list.
  */
static void __sched @@ -348,7 +423,8 @@ __ww_mutex_check_waiters(struct mutex *lock, struct ww_acquire_ctx *ww_ctx) if (!cur->ww_ctx) continue;
-		if (__ww_mutex_die(lock, cur, ww_ctx))
+		if (__ww_mutex_die(lock, cur, ww_ctx) ||
+		    __ww_mutex_wound(lock, cur->ww_ctx, ww_ctx))
 			break;
 	}
} @@ -369,17 +445,23 @@ ww_mutex_set_context_fastpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx) * and keep spinning, or it will acquire wait_lock, add itself * to waiter list and sleep. */
-	smp_mb(); /* ^^^ */
+	smp_mb(); /* See comments above and below. */

	/*
* Check if lock is contended, if not there is nobody to wake up
* [W] ww->ctx = ctx [W] MUTEX_FLAG_WAITERS
* MB MB
* [R] MUTEX_FLAG_WAITERS [R] ww->ctx
*
* The memory barrier above pairs with the memory barrier in
* __ww_mutex_add_waiter() and makes sure we either observe ww->ctx
* and/or !empty list. */ if (likely(!(atomic_long_read(&lock->base.owner) & MUTEX_FLAG_WAITERS))) return; /* * Uh oh, we raced in fastpath, check if any of the waiters need to
* die.
* die or wound us. */ spin_lock(&lock->base.wait_lock); __ww_mutex_check_waiters(&lock->base, ctx);
@@ -681,7 +763,9 @@ __ww_mutex_kill(struct mutex *lock, struct ww_acquire_ctx *ww_ctx)
 /*
- * Check whether we need to kill the transaction for the current lock acquire.
+ * Check the wound condition for the current lock acquire.
+ *
+ * Wound-Wait: If we're wounded, kill ourself.
  *
  * Wait-Die: If we're trying to acquire a lock already held by an older
  *           context, kill ourselves.
@@ -700,6 +784,13 @@ __ww_mutex_check_kill(struct mutex *lock, struct mutex_waiter *waiter, if (ctx->acquired == 0) return 0;
if (!ctx->is_wait_die) {
if (ctx->wounded)
return __ww_mutex_kill(lock, ctx);
return 0;
}
if (hold_ctx && __ww_ctx_stamp_after(ctx, hold_ctx)) return __ww_mutex_kill(lock, ctx);
@@ -726,7 +817,8 @@ __ww_mutex_check_kill(struct mutex *lock, struct mutex_waiter *waiter,
  * Waiters without context are interspersed in FIFO order.
  *
  * Furthermore, for Wait-Die kill ourself immediately when possible (there are
- * older contexts already waiting) to avoid unnecessary waiting.
+ * older contexts already waiting) to avoid unnecessary waiting and for
+ * Wound-Wait ensure we wound the owning context when it is younger.
  */
static inline int __sched __ww_mutex_add_waiter(struct mutex_waiter *waiter, @@ -735,16 +827,21 @@ __ww_mutex_add_waiter(struct mutex_waiter *waiter, { struct mutex_waiter *cur; struct list_head *pos;
bool is_wait_die; if (!ww_ctx) {
list_add_tail(&waiter->list, &lock->wait_list);
__mutex_add_waiter(lock, waiter, &lock->wait_list); return 0; }
is_wait_die = ww_ctx->is_wait_die;
/* * Add the waiter before the first waiter with a higher stamp. * Waiters without a context are skipped to avoid starving
* them. Wait-Die waiters may die here.
* them. Wait-Die waiters may die here. Wound-Wait waiters
* never die here, but they are sorted in stamp order and
* may wound the lock holder. */ pos = &lock->wait_list; list_for_each_entry_reverse(cur, &lock->wait_list, list) {
@@ -757,10 +854,12 @@ __ww_mutex_add_waiter(struct mutex_waiter *waiter, * is no point in queueing behind it, as we'd have to * die the moment it would acquire the lock. */
-			int ret = __ww_mutex_kill(lock, ww_ctx);
-
-			if (ret)
-				return ret;
+			if (is_wait_die) {
+				int ret = __ww_mutex_kill(lock, ww_ctx);
+
+				if (ret)
+					return ret;
+			}

 			break;
 		}
@@ -771,7 +870,23 @@ __ww_mutex_add_waiter(struct mutex_waiter *waiter, __ww_mutex_die(lock, cur, ww_ctx); }
list_add_tail(&waiter->list, pos);
__mutex_add_waiter(lock, waiter, pos);
/*
* Wound-Wait: if we're blocking on a mutex owned by a younger context,
* wound that such that we might proceed.
*/
if (!is_wait_die) {
struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
/*
* See ww_mutex_set_context_fastpath(). Orders setting
* MUTEX_FLAG_WAITERS vs the ww->ctx load,
* such that either we or the fastpath will wound @ww->ctx.
*/
smp_mb();
__ww_mutex_wound(lock, ww_ctx, ww->ctx);
} return 0;
} @@ -795,6 +910,14 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, if (use_ww_ctx && ww_ctx) { if (unlikely(ww_ctx == READ_ONCE(ww->ctx))) return -EALREADY;
/*
* Reset the wounded flag after a kill. No other process can
* race and wound us here since they can't have a valid owner
* pointer if we don't have any locks held.
*/
if (ww_ctx->acquired == 0)
ww_ctx->wounded = 0; } preempt_disable();
@@ -828,7 +951,8 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
if (!use_ww_ctx) { /* add waiting tasks to the end of the waitqueue (FIFO): */
list_add_tail(&waiter.list, &lock->wait_list);
__mutex_add_waiter(lock, &waiter, &lock->wait_list);
#ifdef CONFIG_DEBUG_MUTEXES waiter.ww_ctx = MUTEX_POISON_WW_CTX; @@ -847,9 +971,6 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
waiter.task = current;
if (__mutex_waiter_is_first(lock, &waiter))
__mutex_set_flag(lock, MUTEX_FLAG_WAITERS);
set_current_state(state); for (;;) { /*
@@ -906,6 +1027,16 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass, acquired: __set_current_state(TASK_RUNNING);
if (use_ww_ctx && ww_ctx) {
/*
* Wound-Wait; we stole the lock (!first_waiter), check the
* waiters as anyone might want to wound us.
*/
if (!ww_ctx->is_wait_die &&
!__mutex_waiter_is_first(lock, &waiter))
__ww_mutex_check_waiters(lock, ww_ctx);
}
mutex_remove_waiter(lock, &waiter, current); if (likely(list_empty(&lock->wait_list))) __mutex_clear_flag(lock, MUTEX_FLAGS);
diff --git a/kernel/locking/test-ww_mutex.c b/kernel/locking/test-ww_mutex.c index 0e4cd64ad2c0..5b915b370d5a 100644 --- a/kernel/locking/test-ww_mutex.c +++ b/kernel/locking/test-ww_mutex.c @@ -26,7 +26,7 @@ #include <linux/slab.h> #include <linux/ww_mutex.h>
-static DEFINE_WW_CLASS(ww_class); +static DEFINE_WD_CLASS(ww_class); struct workqueue_struct *wq;
struct test_mutex { diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c index b5c1293ce147..1e1bbf171eca 100644 --- a/lib/locking-selftest.c +++ b/lib/locking-selftest.c @@ -29,7 +29,7 @@ */ static unsigned int debug_locks_verbose;
-static DEFINE_WW_CLASS(ww_lockdep); +static DEFINE_WD_CLASS(ww_lockdep);
static int __init setup_debug_locks_verbose(char *str) { -- 2.14.3
Hi,
On Wed, 2019-01-16 at 09:24 -0500, Rob Clark wrote:
So, I guess this is to do w/ the magic of merge commits, but it looks like the hunk changing the crtc_ww_class got lost:
So what happened here is that this commit changed it to DEFINE_WD_CLASS and the following commit changed it back again to DEFINE_WW_CLASS so that the algorithm change and only that came with the last commit. Apparently the history of that line got lost when merging since the merge didn't change it; git blame doesn't show it changed, but when looking at the commit history in gitk it's all clear. I guess that's a feature of git.
The modeset locks now use true wound-wait rather than wait-die.
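(For anyone retracing this: plain git log simplifies history at merges, which is what hides the intermediate DEFINE_WD_CLASS step here. Disabling the simplification should make the commit show up in the file's history again; illustrative command:

git log --oneline --full-history -- drivers/gpu/drm/drm_modeset_lock.c)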
/Thomas
~/src/linux master git show --pretty=short 08295b3b5beec9aac0f7a9db86f0fc3792039da3 drivers/gpu/drm/drm_modeset_lock.c
commit 08295b3b5beec9aac0f7a9db86f0fc3792039da3
Author: Thomas Hellstrom thellstrom@vmware.com
locking: Implement an algorithm choice for Wound-Wait mutexes
diff --git a/drivers/gpu/drm/drm_modeset_lock.c b/drivers/gpu/drm/drm_modeset_lock.c index 8a5100685875..638be2eb67b4 100644 --- a/drivers/gpu/drm/drm_modeset_lock.c +++ b/drivers/gpu/drm/drm_modeset_lock.c @@ -70,7 +70,7 @@
- lists and lookup data structures.
*/
-static DEFINE_WW_CLASS(crtc_ww_class); +static DEFINE_WD_CLASS(crtc_ww_class);
/**
- drm_modeset_lock_all - take all modeset locks
~/src/linux master git log --pretty=format:%H drivers/gpu/drm/drm_modeset_lock.c | grep 08295b3b5beec9aac0f7a9db86f0fc3792039da3
~/src/linux master 1
BR, -R
On Tue, Jun 19, 2018 at 4:29 AM Thomas Hellstrom < thellstrom@vmware.com> wrote:
The current Wound-Wait mutex algorithm is actually not Wound-Wait but Wait-Die. Implement also Wound-Wait as a per-ww-class choice. Wound-Wait is, contrary to Wait-Die, a preemptive algorithm and is known to generate fewer backoffs. Testing reveals that this is true if the number of simultaneous contending transactions is small. As the number of simultaneous contending threads increases, Wound-Wait becomes inferior to Wait-Die in terms of elapsed time, possibly due to the larger number of held locks of sleeping transactions.
Update documentation and callers.
Timings using git://people.freedesktop.org/~thomash/ww_mutex_test tag patch-18-06-15
Each thread runs 100000 batches of lock / unlock 800 ww mutexes randomly chosen out of 100000. Four core Intel x86_64:
Algorithm    #threads   Rollbacks   time
Wound-Wait   4          ~100        ~17s.
Wait-Die     4          ~150000     ~19s.
Wound-Wait   16         ~360000     ~109s.
Wait-Die     16         ~450000     ~82s.
Cc: Ingo Molnar mingo@redhat.com Cc: Jonathan Corbet corbet@lwn.net Cc: Gustavo Padovan gustavo@padovan.org Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Sean Paul seanpaul@chromium.org Cc: David Airlie airlied@linux.ie Cc: Davidlohr Bueso dave@stgolabs.net Cc: "Paul E. McKenney" paulmck@linux.vnet.ibm.com Cc: Josh Triplett josh@joshtriplett.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Kate Stewart kstewart@linuxfoundation.org Cc: Philippe Ombredanne pombredanne@nexb.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: linux-doc@vger.kernel.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Co-authored-by: Peter Zijlstra peterz@infradead.org Signed-off-by: Thomas Hellstrom thellstrom@vmware.com
v2:
- Update API according to review comment by Greg Kroah-Hartman.
- Address review comments by Peter Zijlstra:
- Avoid _Bool in composites
- Fix typo
- Use __mutex_owner() where applicable
- Rely on built-in barriers for the main loop exit condition, struct ww_acquire_ctx::wounded. Update code comments.
- Explain unlocked use of list_empty().
v3:
- Adapt to and incorporate cleanup by Peter Zijlstra
- Remove unlocked use of list_empty().
v4:
- Move code related to adding a waiter to the lock waiter list to a separate function.
Documentation/locking/ww-mutex-design.txt | 57 +++++++++-- drivers/dma-buf/reservation.c | 2 +- drivers/gpu/drm/drm_modeset_lock.c | 2 +- include/linux/ww_mutex.h | 17 ++- kernel/locking/locktorture.c | 2 +- kernel/locking/mutex.c | 165 +++++++++++++++++++++++++++--- kernel/locking/test-ww_mutex.c | 2 +- lib/locking-selftest.c | 2 +- 8 files changed, 213 insertions(+), 36 deletions(-)
On Wed, Jan 16, 2019 at 11:49 AM Thomas Hellstrom thellstrom@vmware.com wrote:
Hi,
On Wed, 2019-01-16 at 09:24 -0500, Rob Clark wrote:
So, I guess this is to do w/ the magic of merge commits, but it looks like the hunk changing the crtc_ww_class got lost:
So what happened here is that this commit changed it to DEFINE_WD_CLASS and the following commit changed it back again to DEFINE_WW_CLASS so that the algorithm change and only that came with the last commit. Apparently the history of that line got lost when merging since the merge didn't change it; git blame doesn't show it changed, but when looking at the commit history in gitk it's all clear. I guess that's a feature of git.
The modeset locks now use true wound-wait rather than wait-die.
ahh, ok, that makes sense. Thanks for clearing up my confusion.
BR, -R
I suspect you want this through the DRM tree? Ingo are you OK with that?