When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:

cpu0:
	thread catches fatal signal and whole thread-group gets taken down
	do_exit()
	do_group_exit()
	taskstats_exit()
	taskstats_tgid_alloc()
	The task reads sig->stats holding sighand lock seeing garbage.

cpu1:
	task calls exit_group()
	do_exit()
	do_group_exit()
	taskstats_exit()
	taskstats_tgid_alloc()
	The task takes sighand lock and assigns new stats to sig->stats.
Fix this by using READ_ONCE() and smp_store_release().
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
---
/* v1 */
Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.co...

/* v2 */
- Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
  - fix the original double-checked locking using memory barriers

/* v3 */
- Andrea Parri <parri.andrea@gmail.com>:
  - document memory barriers to make checkpatch happy
---
 kernel/taskstats.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index 13a0f2e6ebc2..978d7931fb65 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -554,24 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
 static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
 {
 	struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;

-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	/* Pairs with smp_store_release() below. */
+	stats = READ_ONCE(sig->stats);
+	if (stats || thread_group_empty(tsk))
+		return stats;

 	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);

 	spin_lock_irq(&tsk->sighand->siglock);
 	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+		/* Pairs with READ_ONCE() above. */
+		smp_store_release(&sig->stats, stats_new);
+		stats_new = NULL;
 	}
 	spin_unlock_irq(&tsk->sighand->siglock);

-	if (stats)
-		kmem_cache_free(taskstats_cache, stats);
-ret:
+	if (stats_new)
+		kmem_cache_free(taskstats_cache, stats_new);
+
 	return sig->stats;
 }
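As an aside for readers following along outside the kernel tree, the double-checked locking pattern this patch implements can be sketched as a self-contained userspace analog in C11 atomics. All names here are hypothetical: a pthread mutex stands in for siglock, calloc() for kmem_cache_zalloc(), and an acquire load approximates the ordering the fast path needs (the READ_ONCE()-plus-dependency variant discussed below is a kernel-specific refinement):

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace analog of taskstats_tgid_alloc(): a lazily
 * allocated, zero-initialized per-group stats block. */
struct stats { long counters[4]; };

static _Atomic(struct stats *) shared_stats;              /* plays sig->stats */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;  /* plays siglock    */

static struct stats *stats_alloc(void)
{
	/* Fast path: the acquire load pairs with the release store below,
	 * so a non-NULL result implies the calloc() zeroing is visible. */
	struct stats *stats = atomic_load_explicit(&shared_stats,
						   memory_order_acquire);
	struct stats *stats_new;

	if (stats)
		return stats;

	/* No problem if the allocation fails; we just return NULL below. */
	stats_new = calloc(1, sizeof(*stats_new));

	pthread_mutex_lock(&lock);
	stats = atomic_load_explicit(&shared_stats, memory_order_relaxed);
	if (!stats && stats_new) {
		/* Publish: the release store orders the zero-initialization
		 * before the pointer becomes visible to fast-path readers. */
		atomic_store_explicit(&shared_stats, stats_new,
				      memory_order_release);
		stats = stats_new;
		stats_new = NULL;
	}
	pthread_mutex_unlock(&lock);

	free(stats_new); /* free the unused block if we lost the race */
	return stats;
}
```

The slow path re-checks under the lock, exactly as the patch does, so at most one allocation is ever published.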
On Mon, Oct 07, 2019 at 01:01:17PM +0200, Christian Brauner wrote:
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0: thread catches fatal signal and whole thread-group gets taken down do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task reads sig->stats holding sighand lock seeing garbage.
You meant "without holding sighand lock" here, right?
cpu1: task calls exit_group() do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task takes sighand lock and assigns new stats to sig->stats.
Fix this by using READ_ONCE() and smp_store_release().
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation") Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner christian.brauner@ubuntu.com Reviewed-by: Dmitry Vyukov dvyukov@google.com Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
/* v1 */ Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.co...
/* v2 */
- Dmitry Vyukov dvyukov@google.com, Marco Elver elver@google.com:
- fix the original double-checked locking using memory barriers
/* v3 */
- Andrea Parri parri.andrea@gmail.com:
- document memory barriers to make checkpatch happy
kernel/taskstats.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index 13a0f2e6ebc2..978d7931fb65 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -554,24 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
 static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
 {
 	struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;

-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	/* Pairs with smp_store_release() below. */
+	stats = READ_ONCE(sig->stats);

This pairing suggests that the READ_ONCE() is heading an address dependency, but I fail to identify it: what is the target memory access of such a (putative) dependency?

+	if (stats || thread_group_empty(tsk))
+		return stats;
 	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);

 	spin_lock_irq(&tsk->sighand->siglock);
 	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+		/* Pairs with READ_ONCE() above. */
+		smp_store_release(&sig->stats, stats_new);

This is intended to 'order' the _zalloc() (zero initialization) before the update of sig->stats, right? What else am I missing?

Thanks, Andrea

+		stats_new = NULL;
 	}
 	spin_unlock_irq(&tsk->sighand->siglock);

-	if (stats)
-		kmem_cache_free(taskstats_cache, stats);
-ret:
+	if (stats_new)
+		kmem_cache_free(taskstats_cache, stats_new);
+
 	return sig->stats;
 }
--
2.23.0
On Mon, Oct 07, 2019 at 03:18:04PM +0200, Andrea Parri wrote:
On Mon, Oct 07, 2019 at 01:01:17PM +0200, Christian Brauner wrote:
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0: thread catches fatal signal and whole thread-group gets taken down do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task reads sig->stats holding sighand lock seeing garbage.
You meant "without holding sighand lock" here, right?
Correct, thanks for noticing!
cpu1: task calls exit_group() do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task takes sighand lock and assigns new stats to sig->stats.
Fix this by using READ_ONCE() and smp_store_release().
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation") Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner christian.brauner@ubuntu.com Reviewed-by: Dmitry Vyukov dvyukov@google.com Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
/* v1 */ Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.co...
/* v2 */
- Dmitry Vyukov dvyukov@google.com, Marco Elver elver@google.com:
- fix the original double-checked locking using memory barriers
/* v3 */
- Andrea Parri parri.andrea@gmail.com:
- document memory barriers to make checkpatch happy
kernel/taskstats.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 13a0f2e6ebc2..978d7931fb65 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -554,24 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info) static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal;
- struct taskstats *stats;
- struct taskstats *stats_new, *stats;
- if (sig->stats || thread_group_empty(tsk))
goto ret;
- /* Pairs with smp_store_release() below. */
- stats = READ_ONCE(sig->stats);
This pairing suggests that the READ_ONCE() is heading an address dependency, but I fail to identify it: what is the target memory access of such a (putative) dependency?
- if (stats || thread_group_empty(tsk))
return stats;
/* No problem if kmem_cache_zalloc() fails */
- stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
- stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
spin_lock_irq(&tsk->sighand->siglock); if (!sig->stats) {
sig->stats = stats;
stats = NULL;
/* Pairs with READ_ONCE() above. */
smp_store_release(&sig->stats, stats_new);
This is intended to 'order' the _zalloc() (zero initialization) before the update of sig->stats, right? What else am I missing?
Right, I should've mentioned that. I'll change the comment. But I thought this also paired with smp_read_barrier_depends() that's placed alongside READ_ONCE()?
Christian
On Mon, Oct 7, 2019 at 3:18 PM Andrea Parri parri.andrea@gmail.com wrote:
On Mon, Oct 07, 2019 at 01:01:17PM +0200, Christian Brauner wrote:
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0: thread catches fatal signal and whole thread-group gets taken down do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task reads sig->stats holding sighand lock seeing garbage.
You meant "without holding sighand lock" here, right?
cpu1: task calls exit_group() do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task takes sighand lock and assigns new stats to sig->stats.
Fix this by using READ_ONCE() and smp_store_release().
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation") Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner christian.brauner@ubuntu.com Reviewed-by: Dmitry Vyukov dvyukov@google.com Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
/* v1 */ Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.co...
/* v2 */
- Dmitry Vyukov dvyukov@google.com, Marco Elver elver@google.com:
- fix the original double-checked locking using memory barriers
/* v3 */
- Andrea Parri parri.andrea@gmail.com:
- document memory barriers to make checkpatch happy
kernel/taskstats.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 13a0f2e6ebc2..978d7931fb65 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -554,24 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info) static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal;
struct taskstats *stats;
struct taskstats *stats_new, *stats;
if (sig->stats || thread_group_empty(tsk))
goto ret;
/* Pairs with smp_store_release() below. */
stats = READ_ONCE(sig->stats);
This pairing suggests that the READ_ONCE() is heading an address dependency, but I fail to identify it: what is the target memory access of such a (putative) dependency?
I would assume callers of this function access *stats. So the dependency is between loading stats and accessing *stats.
if (stats || thread_group_empty(tsk))
return stats; /* No problem if kmem_cache_zalloc() fails */
stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL); spin_lock_irq(&tsk->sighand->siglock); if (!sig->stats) {
sig->stats = stats;
stats = NULL;
/* Pairs with READ_ONCE() above. */
smp_store_release(&sig->stats, stats_new);
This is intended to 'order' the _zalloc() (zero initialization) before the update of sig->stats, right? What else am I missing?
Thanks, Andrea
stats_new = NULL; } spin_unlock_irq(&tsk->sighand->siglock);
if (stats)
kmem_cache_free(taskstats_cache, stats);
-ret:
if (stats_new)
kmem_cache_free(taskstats_cache, stats_new);
return sig->stats;
}
-- 2.23.0
On Mon, Oct 07, 2019 at 03:50:47PM +0200, Dmitry Vyukov wrote:
On Mon, Oct 7, 2019 at 3:18 PM Andrea Parri parri.andrea@gmail.com wrote:
On Mon, Oct 07, 2019 at 01:01:17PM +0200, Christian Brauner wrote:
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0: thread catches fatal signal and whole thread-group gets taken down do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task reads sig->stats holding sighand lock seeing garbage.
You meant "without holding sighand lock" here, right?
cpu1: task calls exit_group() do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task takes sighand lock and assigns new stats to sig->stats.
Fix this by using READ_ONCE() and smp_store_release().
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation") Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner christian.brauner@ubuntu.com Reviewed-by: Dmitry Vyukov dvyukov@google.com Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
/* v1 */ Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.co...
/* v2 */
- Dmitry Vyukov dvyukov@google.com, Marco Elver elver@google.com:
- fix the original double-checked locking using memory barriers
/* v3 */
- Andrea Parri parri.andrea@gmail.com:
- document memory barriers to make checkpatch happy
kernel/taskstats.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 13a0f2e6ebc2..978d7931fb65 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -554,24 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info) static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal;
struct taskstats *stats;
struct taskstats *stats_new, *stats;
if (sig->stats || thread_group_empty(tsk))
goto ret;
/* Pairs with smp_store_release() below. */
stats = READ_ONCE(sig->stats);
This pairing suggests that the READ_ONCE() is heading an address dependency, but I fail to identify it: what is the target memory access of such a (putative) dependency?
I would assume callers of this function access *stats. So the dependency is between loading stats and accessing *stats.
Right, but why READ_ONCE() and not smp_load_acquire here?
Christian
On Mon, Oct 7, 2019 at 3:55 PM Christian Brauner christian.brauner@ubuntu.com wrote:
On Mon, Oct 07, 2019 at 03:50:47PM +0200, Dmitry Vyukov wrote:
On Mon, Oct 7, 2019 at 3:18 PM Andrea Parri parri.andrea@gmail.com wrote:
On Mon, Oct 07, 2019 at 01:01:17PM +0200, Christian Brauner wrote:
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0: thread catches fatal signal and whole thread-group gets taken down do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task reads sig->stats holding sighand lock seeing garbage.
You meant "without holding sighand lock" here, right?
cpu1: task calls exit_group() do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task takes sighand lock and assigns new stats to sig->stats.
Fix this by using READ_ONCE() and smp_store_release().
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation") Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner christian.brauner@ubuntu.com Reviewed-by: Dmitry Vyukov dvyukov@google.com Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
/* v1 */ Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.co...
/* v2 */
- Dmitry Vyukov dvyukov@google.com, Marco Elver elver@google.com:
- fix the original double-checked locking using memory barriers
/* v3 */
- Andrea Parri parri.andrea@gmail.com:
- document memory barriers to make checkpatch happy
kernel/taskstats.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 13a0f2e6ebc2..978d7931fb65 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -554,24 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info) static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal;
struct taskstats *stats;
struct taskstats *stats_new, *stats;
if (sig->stats || thread_group_empty(tsk))
goto ret;
/* Pairs with smp_store_release() below. */
stats = READ_ONCE(sig->stats);
This pairing suggests that the READ_ONCE() is heading an address dependency, but I fail to identify it: what is the target memory access of such a (putative) dependency?
I would assume callers of this function access *stats. So the dependency is between loading stats and accessing *stats.
Right, but why READ_ONCE() and not smp_load_acquire here?
Because if all memory accesses we need to order have data dependency between them, READ_ONCE is enough and is cheaper on some archs (e.g. ARM). In our case there is a data dependency between loading of stats and accessing *stats (only Alpha could reorder that, other arches can't load via a pointer before loading the pointer itself (sic!)).
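This point can be made concrete with a self-contained C11 sketch (hypothetical names; memory_order_consume is the standard's spelling of "ordered by the carried data dependency", and current compilers implement it as an acquire load):

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical illustration of dependency ordering: the reader's access
 * to *p depends on the pointer value it just loaded, so on everything
 * except Alpha no read-side barrier is needed. */
struct payload { int data; };

static struct payload slot;
static _Atomic(struct payload *) ptr;

static void *writer(void *unused)
{
	(void)unused;
	slot.data = 42;                        /* initialize the payload...  */
	atomic_store_explicit(&ptr, &slot,     /* ...then publish: release
						* orders the init before the
						* pointer store */
			      memory_order_release);
	return NULL;
}

static void *reader(void *out)
{
	struct payload *p;

	/* Spin until published. The later *p access carries a data
	 * dependency on this load, which is why a plain dependency-ordered
	 * load (consume) suffices instead of a full acquire. */
	do {
		p = atomic_load_explicit(&ptr, memory_order_consume);
	} while (!p);
	*(int *)out = p->data;                 /* guaranteed to observe 42 */
	return NULL;
}
```

The kernel's READ_ONCE() relies on the same dependency, with Alpha handled by an implied barrier there.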
On Mon, Oct 07, 2019 at 04:08:41PM +0200, Dmitry Vyukov wrote:
On Mon, Oct 7, 2019 at 3:55 PM Christian Brauner christian.brauner@ubuntu.com wrote:
On Mon, Oct 07, 2019 at 03:50:47PM +0200, Dmitry Vyukov wrote:
On Mon, Oct 7, 2019 at 3:18 PM Andrea Parri parri.andrea@gmail.com wrote:
On Mon, Oct 07, 2019 at 01:01:17PM +0200, Christian Brauner wrote:
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0: thread catches fatal signal and whole thread-group gets taken down do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task reads sig->stats holding sighand lock seeing garbage.
You meant "without holding sighand lock" here, right?
cpu1: task calls exit_group() do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task takes sighand lock and assigns new stats to sig->stats.
Fix this by using READ_ONCE() and smp_store_release().
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation") Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner christian.brauner@ubuntu.com Reviewed-by: Dmitry Vyukov dvyukov@google.com Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
/* v1 */ Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.co...
/* v2 */
- Dmitry Vyukov dvyukov@google.com, Marco Elver elver@google.com:
- fix the original double-checked locking using memory barriers
/* v3 */
- Andrea Parri parri.andrea@gmail.com:
- document memory barriers to make checkpatch happy
kernel/taskstats.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 13a0f2e6ebc2..978d7931fb65 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -554,24 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info) static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal;
struct taskstats *stats;
struct taskstats *stats_new, *stats;
if (sig->stats || thread_group_empty(tsk))
goto ret;
/* Pairs with smp_store_release() below. */
stats = READ_ONCE(sig->stats);
This pairing suggests that the READ_ONCE() is heading an address dependency, but I fail to identify it: what is the target memory access of such a (putative) dependency?
I would assume callers of this function access *stats. So the dependency is between loading stats and accessing *stats.
Right, but why READ_ONCE() and not smp_load_acquire here?
Because if all memory accesses we need to order have data dependency between them, READ_ONCE is enough and is cheaper on some archs (e.g. ARM). In our case there is a data dependency between loading of stats and accessing *stats (only Alpha could reorder that, other arches can't load via a pointer before loading the pointer itself (sic!)).
Right, the except-Alpha-clause is well-known...
Christian
static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal;
struct taskstats *stats;
struct taskstats *stats_new, *stats;
if (sig->stats || thread_group_empty(tsk))
goto ret;
/* Pairs with smp_store_release() below. */
stats = READ_ONCE(sig->stats);
This pairing suggests that the READ_ONCE() is heading an address dependency, but I fail to identify it: what is the target memory access of such a (putative) dependency?
I would assume callers of this function access *stats. So the dependency is between loading stats and accessing *stats.
AFAICT, the only caller of the function in 5.4-rc2 is taskstats_exit(), which 'casts' the return value to a boolean (so I really don't see how any address dependency could be carried over/relied upon here).
Andrea
On Mon, Oct 7, 2019 at 4:14 PM Andrea Parri parri.andrea@gmail.com wrote:
static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal;
struct taskstats *stats;
struct taskstats *stats_new, *stats;
if (sig->stats || thread_group_empty(tsk))
goto ret;
/* Pairs with smp_store_release() below. */
stats = READ_ONCE(sig->stats);
This pairing suggests that the READ_ONCE() is heading an address dependency, but I fail to identify it: what is the target memory access of such a (putative) dependency?
I would assume callers of this function access *stats. So the dependency is between loading stats and accessing *stats.
AFAICT, the only caller of the function in 5.4-rc2 is taskstats_exit(), which 'casts' the return value to a boolean (so I really don't see how any address dependency could be carried over/relied upon here).
This does not make sense.
But later taskstats_exit does:
memcpy(stats, tsk->signal->stats, sizeof(*stats));
Perhaps it's supposed to use stats returned by taskstats_tgid_alloc?
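The caller-side point can be sketched in a userspace analog (hypothetical names): the copy must go through the pointer the helper returned, because that is the access the acquire/dependency ordering protects; re-reading the shared location afterwards is a second, unordered load:

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

/* Hypothetical analog of the taskstats_exit()/taskstats_tgid_alloc()
 * split: the helper returns an ordered pointer, and the caller should
 * memcpy() through that return value, not through a fresh read of the
 * shared slot. */
struct stats { long counters[4]; };

static struct stats group_stats;              /* block that gets published */
static _Atomic(struct stats *) shared_stats;  /* plays sig->stats          */

static struct stats *stats_get(void)
{
	/* Pairs with the release store in publish() below. */
	return atomic_load_explicit(&shared_stats, memory_order_acquire);
}

/* Analog of the memcpy() in taskstats_exit(): use the returned pointer,
 * so the copy is ordered after the block's zero-initialization. */
static int snapshot(struct stats *snap)
{
	struct stats *stats = stats_get();

	if (!stats)
		return -1;
	memcpy(snap, stats, sizeof(*snap));
	return 0;
}

static void publish(void)
{
	/* Zero-initialize, then publish with release semantics. */
	memset(&group_stats, 0, sizeof(group_stats));
	atomic_store_explicit(&shared_stats, &group_stats,
			      memory_order_release);
}
```

Before publish() the snapshot fails; afterwards it observes the zeroed block.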
On Mon, Oct 07, 2019 at 04:18:26PM +0200, Dmitry Vyukov wrote:
On Mon, Oct 7, 2019 at 4:14 PM Andrea Parri parri.andrea@gmail.com wrote:
static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal;
struct taskstats *stats;
struct taskstats *stats_new, *stats;
if (sig->stats || thread_group_empty(tsk))
goto ret;
/* Pairs with smp_store_release() below. */
stats = READ_ONCE(sig->stats);
This pairing suggests that the READ_ONCE() is heading an address dependency, but I fail to identify it: what is the target memory access of such a (putative) dependency?
I would assume callers of this function access *stats. So the dependency is between loading stats and accessing *stats.
AFAICT, the only caller of the function in 5.4-rc2 is taskstats_exit(), which 'casts' the return value to a boolean (so I really don't see how any address dependency could be carried over/relied upon here).
This does not make sense.
But later taskstats_exit does:
memcpy(stats, tsk->signal->stats, sizeof(*stats));
Perhaps it's supposed to use stats returned by taskstats_tgid_alloc?
Seems reasonable to me. If so, replacing the READ_ONCE() in question with an smp_load_acquire() might be the solution. Thoughts?
Andrea
On Tue, Oct 08, 2019 at 04:20:35PM +0200, Andrea Parri wrote:
On Mon, Oct 07, 2019 at 04:18:26PM +0200, Dmitry Vyukov wrote:
On Mon, Oct 7, 2019 at 4:14 PM Andrea Parri parri.andrea@gmail.com wrote:
static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal;
struct taskstats *stats;
struct taskstats *stats_new, *stats;
if (sig->stats || thread_group_empty(tsk))
goto ret;
/* Pairs with smp_store_release() below. */
stats = READ_ONCE(sig->stats);
This pairing suggests that the READ_ONCE() is heading an address dependency, but I fail to identify it: what is the target memory access of such a (putative) dependency?
I would assume callers of this function access *stats. So the dependency is between loading stats and accessing *stats.
AFAICT, the only caller of the function in 5.4-rc2 is taskstats_exit(), which 'casts' the return value to a boolean (so I really don't see how any address dependency could be carried over/relied upon here).
This does not make sense.
But later taskstats_exit does:
memcpy(stats, tsk->signal->stats, sizeof(*stats));
Perhaps it's supposed to use stats returned by taskstats_tgid_alloc?
Seems reasonable to me. If so, replacing the READ_ONCE() in question with an smp_load_acquire() might be the solution. Thoughts?
I've done that already in my tree yesterday. I can resend for another review if you'd prefer.
Christian
On Tue, Oct 08, 2019 at 04:24:14PM +0200, Christian Brauner wrote:
On Tue, Oct 08, 2019 at 04:20:35PM +0200, Andrea Parri wrote:
On Mon, Oct 07, 2019 at 04:18:26PM +0200, Dmitry Vyukov wrote:
On Mon, Oct 7, 2019 at 4:14 PM Andrea Parri parri.andrea@gmail.com wrote:
> static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> {
> 	struct signal_struct *sig = tsk->signal;
> -	struct taskstats *stats;
> +	struct taskstats *stats_new, *stats;
>
> -	if (sig->stats || thread_group_empty(tsk))
> -		goto ret;
> +	/* Pairs with smp_store_release() below. */
> +	stats = READ_ONCE(sig->stats);
This pairing suggests that the READ_ONCE() is heading an address dependency, but I fail to identify it: what is the target memory access of such a (putative) dependency?
I would assume callers of this function access *stats. So the dependency is between loading stats and accessing *stats.
AFAICT, the only caller of the function in 5.4-rc2 is taskstats_exit(), which 'casts' the return value to a boolean (so I really don't see how any address dependency could be carried over/relied upon here).
This does not make sense.
But later taskstats_exit does:
memcpy(stats, tsk->signal->stats, sizeof(*stats));
Perhaps it's supposed to use stats returned by taskstats_tgid_alloc?
Seems reasonable to me. If so, replacing the READ_ONCE() in question with an smp_load_acquire() might be the solution. Thoughts?
I've done that already in my tree yesterday. I can resend for another review if you'd prefer.
Oh nice! No need to resend of course. ;D FWIW, I can check it if you let me know the particular branch/commit (guessing that's somewhere in git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git, yes?).
Thanks, Andrea
On Tue, Oct 08, 2019 at 05:26:59PM +0200, Andrea Parri wrote:
On Tue, Oct 08, 2019 at 04:24:14PM +0200, Christian Brauner wrote:
On Tue, Oct 08, 2019 at 04:20:35PM +0200, Andrea Parri wrote:
On Mon, Oct 07, 2019 at 04:18:26PM +0200, Dmitry Vyukov wrote:
On Mon, Oct 7, 2019 at 4:14 PM Andrea Parri parri.andrea@gmail.com wrote:
> > static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
> > {
> > 	struct signal_struct *sig = tsk->signal;
> > -	struct taskstats *stats;
> > +	struct taskstats *stats_new, *stats;
> >
> > -	if (sig->stats || thread_group_empty(tsk))
> > -		goto ret;
> > +	/* Pairs with smp_store_release() below. */
> > +	stats = READ_ONCE(sig->stats);
>
> This pairing suggests that the READ_ONCE() is heading an address dependency, but I fail to identify it: what is the target memory access of such a (putative) dependency?
I would assume callers of this function access *stats. So the dependency is between loading stats and accessing *stats.
AFAICT, the only caller of the function in 5.4-rc2 is taskstats_exit(), which 'casts' the return value to a boolean (so I really don't see how any address dependency could be carried over/relied upon here).
This does not make sense.
But later taskstats_exit does:
memcpy(stats, tsk->signal->stats, sizeof(*stats));
Perhaps it's supposed to use stats returned by taskstats_tgid_alloc?
Seems reasonable to me. If so, replacing the READ_ONCE() in question with an smp_load_acquire() might be the solution. Thoughts?
I've done that already in my tree yesterday. I can resend for another review if you'd prefer.
Oh nice! No need to resend of course. ;D FWIW, I can check it if you let me know the particular branch/commit (guessing that's somewhere in git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git, yes?).
Oh ups, yeah of course :) https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=tas...
Thanks! Christian
Oh ups, yeah of course :) https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=tas...
You forgot to update the commit msg. It looks good to me modulo that.
Thanks, Andrea
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:

cpu0:
	thread catches fatal signal and whole thread-group gets taken down
	do_exit()
	do_group_exit()
	taskstats_exit()
	taskstats_tgid_alloc()
	The task reads sig->stats without holding sighand lock seeing garbage.

cpu1:
	task calls exit_group()
	do_exit()
	do_group_exit()
	taskstats_exit()
	taskstats_tgid_alloc()
	The task takes sighand lock and assigns new stats to sig->stats.
Fix this by using smp_load_acquire() and smp_store_release().
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com
Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
---
/* v1 */
Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.co...

/* v2 */
Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
- Dmitry Vyukov <dvyukov@google.com>, Marco Elver <elver@google.com>:
  - fix the original double-checked locking using memory barriers

/* v3 */
Link: https://lore.kernel.org/r/20191007110117.1096-1-christian.brauner@ubuntu.com...
- Andrea Parri <parri.andrea@gmail.com>:
  - document memory barriers to make checkpatch happy

/* v4 */
- Andrea Parri <parri.andrea@gmail.com>:
  - use smp_load_acquire(), not READ_ONCE()
  - update commit message
---
 kernel/taskstats.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)
diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index 13a0f2e6ebc2..e6b45315607a 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -554,24 +554,30 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
 static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
 {
 	struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;

-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	/* Pairs with smp_store_release() below. */
+	stats = smp_load_acquire(sig->stats);
+	if (stats || thread_group_empty(tsk))
+		return stats;

 	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);

 	spin_lock_irq(&tsk->sighand->siglock);
 	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+		/*
+		 * Pairs with smp_store_release() above and order the
+		 * kmem_cache_zalloc().
+		 */
+		smp_store_release(&sig->stats, stats_new);
+		stats_new = NULL;
 	}
 	spin_unlock_irq(&tsk->sighand->siglock);

-	if (stats)
-		kmem_cache_free(taskstats_cache, stats);
-ret:
+	if (stats_new)
+		kmem_cache_free(taskstats_cache, stats_new);
+
 	return sig->stats;
 }
On Wed, Oct 09, 2019 at 01:31:34PM +0200, Christian Brauner wrote:
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0: thread catches fatal signal and whole thread-group gets taken down do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task reads sig->stats without holding sighand lock seeing garbage.
cpu1: task calls exit_group() do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task takes sighand lock and assigns new stats to sig->stats.
Fix this by using smp_load_acquire() and smp_store_release().
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation") Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner christian.brauner@ubuntu.com Reviewed-by: Dmitry Vyukov dvyukov@google.com
_sigh_, let me resend since i fcked this one up.
Christian
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0: thread catches fatal signal and whole thread-group gets taken down do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task reads sig->stats without holding sighand lock, seeing garbage.
cpu1: task calls exit_group() do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task takes sighand lock and assigns new stats to sig->stats.
Fix this by using smp_load_acquire() and smp_store_release().
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation") Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner christian.brauner@ubuntu.com Reviewed-by: Dmitry Vyukov dvyukov@google.com --- /* v1 */ Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.co...
/* v2 */ Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com - Dmitry Vyukov dvyukov@google.com, Marco Elver elver@google.com: - fix the original double-checked locking using memory barriers
/* v3 */ Link: https://lore.kernel.org/r/20191007110117.1096-1-christian.brauner@ubuntu.com - Andrea Parri parri.andrea@gmail.com: - document memory barriers to make checkpatch happy
/* v4 */ Link: https://lore.kernel.org/r/20191009113134.5171-1-christian.brauner@ubuntu.com - Andrea Parri parri.andrea@gmail.com: - use smp_load_acquire(), not READ_ONCE() - update commit message
/* v5 */ - Andrea Parri parri.andrea@gmail.com: - fix typo in smp_load_acquire() --- kernel/taskstats.c | 24 +++++++++++++++--------- 1 file changed, 15 insertions(+), 9 deletions(-)
diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 13a0f2e6ebc2..6e18fdc4f7c8 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -554,24 +554,30 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info) static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal; - struct taskstats *stats; + struct taskstats *stats_new, *stats;
- if (sig->stats || thread_group_empty(tsk)) - goto ret; + /* Pairs with smp_store_release() below. */ + stats = smp_load_acquire(&sig->stats); + if (stats || thread_group_empty(tsk)) + return stats;
/* No problem if kmem_cache_zalloc() fails */ - stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL); + stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
spin_lock_irq(&tsk->sighand->siglock); if (!sig->stats) { - sig->stats = stats; - stats = NULL; + /* + * Pairs with smp_load_acquire() above and orders the + * kmem_cache_zalloc(). + */ + smp_store_release(&sig->stats, stats_new); + stats_new = NULL; } spin_unlock_irq(&tsk->sighand->siglock);
- if (stats) - kmem_cache_free(taskstats_cache, stats); -ret: + if (stats_new) + kmem_cache_free(taskstats_cache, stats_new); + return sig->stats; }
On Wed, Oct 09, 2019 at 01:48:09PM +0200, Christian Brauner wrote:
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0: thread catches fatal signal and whole thread-group gets taken down do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task reads sig->stats without holding sighand lock, seeing garbage.
cpu1: task calls exit_group() do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task takes sighand lock and assigns new stats to sig->stats.
Fix this by using smp_load_acquire() and smp_store_release().
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation") Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner christian.brauner@ubuntu.com Reviewed-by: Dmitry Vyukov dvyukov@google.com
Reviewed-by: Andrea Parri parri.andrea@gmail.com
Thanks, Andrea
/* v1 */ Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.co...
/* v2 */ Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
- Dmitry Vyukov dvyukov@google.com, Marco Elver elver@google.com:
- fix the original double-checked locking using memory barriers
/* v3 */ Link: https://lore.kernel.org/r/20191007110117.1096-1-christian.brauner@ubuntu.com
- Andrea Parri parri.andrea@gmail.com:
- document memory barriers to make checkpatch happy
/* v4 */ Link: https://lore.kernel.org/r/20191009113134.5171-1-christian.brauner@ubuntu.com
- Andrea Parri parri.andrea@gmail.com:
- use smp_load_acquire(), not READ_ONCE()
- update commit message
/* v5 */
- Andrea Parri parri.andrea@gmail.com:
- fix typo in smp_load_acquire()
kernel/taskstats.c | 24 +++++++++++++++--------- 1 file changed, 15 insertions(+), 9 deletions(-)
diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 13a0f2e6ebc2..6e18fdc4f7c8 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -554,24 +554,30 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info) static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;

-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	/* Pairs with smp_store_release() below. */
+	stats = smp_load_acquire(&sig->stats);
+	if (stats || thread_group_empty(tsk))
+		return stats;

 	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);

 	spin_lock_irq(&tsk->sighand->siglock);
 	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+		/*
+		 * Pairs with smp_load_acquire() above and orders the
+		 * kmem_cache_zalloc().
+		 */
+		smp_store_release(&sig->stats, stats_new);
+		stats_new = NULL;
 	}
 	spin_unlock_irq(&tsk->sighand->siglock);

-	if (stats)
-		kmem_cache_free(taskstats_cache, stats);
-ret:
+	if (stats_new)
+		kmem_cache_free(taskstats_cache, stats_new);
+
 	return sig->stats;
 }
--
2.23.0
On Wed, Oct 09, 2019 at 01:48:09PM +0200, Christian Brauner wrote:
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0: thread catches fatal signal and whole thread-group gets taken down do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task reads sig->stats without holding sighand lock, seeing garbage.
cpu1: task calls exit_group() do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task takes sighand lock and assigns new stats to sig->stats.
Fix this by using smp_load_acquire() and smp_store_release().
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation") Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner christian.brauner@ubuntu.com Reviewed-by: Dmitry Vyukov dvyukov@google.com
Applied to: https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=tas...
Merged into: https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/log/?h=fix...
Targeting v5.4-rc3.
Thanks! Christian
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0: thread catches fatal signal and whole thread-group gets taken down do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task reads sig->stats without holding sighand lock.
cpu1: task calls exit_group() do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task takes sighand lock and assigns new stats to sig->stats.
The first approach used smp_load_acquire() and smp_store_release(). However, after having discussed this it seems that the data dependency for kmem_cache_alloc() would be fixed by WRITE_ONCE(). Furthermore, the smp_load_acquire() would only manage to order the stats check before the thread_group_empty() check. So it seems just using READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this up for discussion at least.
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation") Cc: Will Deacon will@kernel.org Cc: Marco Elver elver@google.com Cc: Andrea Parri parri.andrea@gmail.com Cc: Dmitry Vyukov dvyukov@google.com Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner christian.brauner@ubuntu.com --- /* v1 */ Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.co...
/* v2 */ Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com - Dmitry Vyukov dvyukov@google.com, Marco Elver elver@google.com: - fix the original double-checked locking using memory barriers
/* v3 */ Link: https://lore.kernel.org/r/20191007110117.1096-1-christian.brauner@ubuntu.com - Andrea Parri parri.andrea@gmail.com: - document memory barriers to make checkpatch happy
/* v4 */ Link: https://lore.kernel.org/r/20191009113134.5171-1-christian.brauner@ubuntu.com - Andrea Parri parri.andrea@gmail.com: - use smp_load_acquire(), not READ_ONCE() - update commit message
/* v5 */ Link: https://lore.kernel.org/r/20191009114809.8643-1-christian.brauner@ubuntu.com - Andrea Parri parri.andrea@gmail.com: - fix typo in smp_load_acquire()
/* v6 */ - Christian Brauner christian.brauner@ubuntu.com: - bring up READ_ONCE()/WRITE_ONCE() approach for discussion --- kernel/taskstats.c | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 13a0f2e6ebc2..111bb4139aa2 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -554,25 +554,29 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info) static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal; - struct taskstats *stats; + struct taskstats *stats_new, *stats;
- if (sig->stats || thread_group_empty(tsk)) - goto ret; + /* Pairs with WRITE_ONCE() below. */ + stats = READ_ONCE(sig->stats); + if (stats || thread_group_empty(tsk)) + return stats;
/* No problem if kmem_cache_zalloc() fails */ - stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL); + stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
spin_lock_irq(&tsk->sighand->siglock); - if (!sig->stats) { - sig->stats = stats; - stats = NULL; + if (!stats) { + stats = stats_new; + /* Pairs with READ_ONCE() above. */ + WRITE_ONCE(sig->stats, stats_new); + stats_new = NULL; } spin_unlock_irq(&tsk->sighand->siglock);
- if (stats) - kmem_cache_free(taskstats_cache, stats); -ret: - return sig->stats; + if (stats_new) + kmem_cache_free(taskstats_cache, stats_new); + + return stats; }
/* Send pid data out on exit */
On 21/10/2019 13.33, Christian Brauner wrote:
The first approach used smp_load_acquire() and smp_store_release(). However, after having discussed this it seems that the data dependency for kmem_cache_alloc() would be fixed by WRITE_ONCE(). Furthermore, the smp_load_acquire() would only manage to order the stats check before the thread_group_empty() check. So it seems just using READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this up for discussion at least.
/* v6 */
- Christian Brauner christian.brauner@ubuntu.com:
- bring up READ_ONCE()/WRITE_ONCE() approach for discussion
kernel/taskstats.c | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 13a0f2e6ebc2..111bb4139aa2 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -554,25 +554,29 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info) static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;

-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	/* Pairs with WRITE_ONCE() below. */
+	stats = READ_ONCE(sig->stats);
+	if (stats || thread_group_empty(tsk))
+		return stats;

 	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);

 	spin_lock_irq(&tsk->sighand->siglock);
-	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+	if (!stats) {
+		stats = stats_new;
+		/* Pairs with READ_ONCE() above. */
+		WRITE_ONCE(sig->stats, stats_new);
+		stats_new = NULL;
No idea about the memory ordering issues, but don't you need to load/check sig->stats again? Otherwise it seems that two threads might both see !sig->stats, both allocate a stats_new, and both unconditionally in turn assign their stats_new to sig->stats. Then the first assignment ends up becoming a memory leak (and any writes through that pointer done by the caller end up in /dev/null...)
Rasmus
On Mon, Oct 21, 2019 at 02:19:01PM +0200, Rasmus Villemoes wrote:
On 21/10/2019 13.33, Christian Brauner wrote:
The first approach used smp_load_acquire() and smp_store_release(). However, after having discussed this it seems that the data dependency for kmem_cache_alloc() would be fixed by WRITE_ONCE(). Furthermore, the smp_load_acquire() would only manage to order the stats check before the thread_group_empty() check. So it seems just using READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this up for discussion at least.
/* v6 */
- Christian Brauner christian.brauner@ubuntu.com:
- bring up READ_ONCE()/WRITE_ONCE() approach for discussion
kernel/taskstats.c | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 13a0f2e6ebc2..111bb4139aa2 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -554,25 +554,29 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info) static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;

-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	/* Pairs with WRITE_ONCE() below. */
+	stats = READ_ONCE(sig->stats);
+	if (stats || thread_group_empty(tsk))
+		return stats;

 	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);

 	spin_lock_irq(&tsk->sighand->siglock);
-	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+	if (!stats) {
+		stats = stats_new;
+		/* Pairs with READ_ONCE() above. */
+		WRITE_ONCE(sig->stats, stats_new);
+		stats_new = NULL;
No idea about the memory ordering issues, but don't you need to load/check sig->stats again? Otherwise it seems that two threads might both see !sig->stats, both allocate a stats_new, and both unconditionally in turn assign their stats_new to sig->stats. Then the first assignment ends up becoming a memory leak (and any writes through that pointer done by the caller end up in /dev/null...)
Trigger hand too fast. I guess you're thinking sm like:
diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 13a0f2e6ebc2..c4e1ed11e785 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -554,25 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info) static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal; - struct taskstats *stats; + struct taskstats *stats_new, *stats;
- if (sig->stats || thread_group_empty(tsk)) - goto ret; + stats = READ_ONCE(sig->stats); + if (stats || thread_group_empty(tsk)) + return stats;
- /* No problem if kmem_cache_zalloc() fails */ - stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL); + stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
spin_lock_irq(&tsk->sighand->siglock); - if (!sig->stats) { - sig->stats = stats; - stats = NULL; + stats = READ_ONCE(sig->stats); + if (!stats) { + stats = stats_new; + WRITE_ONCE(sig->stats, stats_new); + stats_new = NULL; } spin_unlock_irq(&tsk->sighand->siglock);
- if (stats) - kmem_cache_free(taskstats_cache, stats); -ret: - return sig->stats; + if (stats_new) + kmem_cache_free(taskstats_cache, stats_new); + + return stats; }
On Mon, Oct 21, 2019 at 03:04:18PM +0200, Christian Brauner wrote:
On Mon, Oct 21, 2019 at 02:19:01PM +0200, Rasmus Villemoes wrote:
On 21/10/2019 13.33, Christian Brauner wrote:
The first approach used smp_load_acquire() and smp_store_release(). However, after having discussed this it seems that the data dependency for kmem_cache_alloc() would be fixed by WRITE_ONCE(). Furthermore, the smp_load_acquire() would only manage to order the stats check before the thread_group_empty() check. So it seems just using READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this up for discussion at least.
/* v6 */
- Christian Brauner christian.brauner@ubuntu.com:
- bring up READ_ONCE()/WRITE_ONCE() approach for discussion
kernel/taskstats.c | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 13a0f2e6ebc2..111bb4139aa2 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -554,25 +554,29 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info) static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;

-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	/* Pairs with WRITE_ONCE() below. */
+	stats = READ_ONCE(sig->stats);
+	if (stats || thread_group_empty(tsk))
+		return stats;

 	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);

 	spin_lock_irq(&tsk->sighand->siglock);
-	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+	if (!stats) {
+		stats = stats_new;
+		/* Pairs with READ_ONCE() above. */
+		WRITE_ONCE(sig->stats, stats_new);
+		stats_new = NULL;
No idea about the memory ordering issues, but don't you need to load/check sig->stats again? Otherwise it seems that two threads might both see !sig->stats, both allocate a stats_new, and both unconditionally in turn assign their stats_new to sig->stats. Then the first assignment ends up becoming a memory leak (and any writes through that pointer done by the caller end up in /dev/null...)
Trigger hand too fast. I guess you're thinking sm like:
diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 13a0f2e6ebc2..c4e1ed11e785 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -554,25 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info) static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;

-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	stats = READ_ONCE(sig->stats);
This probably wants to be an acquire, since both the memcpy() later on in taskstats_exit() and the accesses in {b,x}acct_add_tsk() appear to read from the taskstats structure without the sighand->siglock held and therefore may miss zeroed allocation from the zalloc() below, I think.
+	if (stats || thread_group_empty(tsk))
+		return stats;

-	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);

 	spin_lock_irq(&tsk->sighand->siglock);
-	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+	stats = READ_ONCE(sig->stats);
You hold the spinlock here, so I don't think you need the READ_ONCE().
+	if (!stats) {
+		stats = stats_new;
+		WRITE_ONCE(sig->stats, stats_new);
You probably want a release here to publish the zeroes from the zalloc() (back to my first comment). With those changes:
Reviewed-by: Will Deacon will@kernel.org
However, this caused me to look at do_group_exit() and we appear to have racy accesses on sig->flags there thanks to signal_group_exit(). I worry that might run quite deep, and can probably be looked at separately.
Will
On Fri, Nov 29, 2019 at 05:56:05PM +0000, Will Deacon wrote:
On Mon, Oct 21, 2019 at 03:04:18PM +0200, Christian Brauner wrote:
On Mon, Oct 21, 2019 at 02:19:01PM +0200, Rasmus Villemoes wrote:
On 21/10/2019 13.33, Christian Brauner wrote:
The first approach used smp_load_acquire() and smp_store_release(). However, after having discussed this it seems that the data dependency for kmem_cache_alloc() would be fixed by WRITE_ONCE(). Furthermore, the smp_load_acquire() would only manage to order the stats check before the thread_group_empty() check. So it seems just using READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this up for discussion at least.
/* v6 */
- Christian Brauner christian.brauner@ubuntu.com:
- bring up READ_ONCE()/WRITE_ONCE() approach for discussion
kernel/taskstats.c | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 13a0f2e6ebc2..111bb4139aa2 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -554,25 +554,29 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info) static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;

-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	/* Pairs with WRITE_ONCE() below. */
+	stats = READ_ONCE(sig->stats);
+	if (stats || thread_group_empty(tsk))
+		return stats;

 	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);

 	spin_lock_irq(&tsk->sighand->siglock);
-	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+	if (!stats) {
+		stats = stats_new;
+		/* Pairs with READ_ONCE() above. */
+		WRITE_ONCE(sig->stats, stats_new);
+		stats_new = NULL;
No idea about the memory ordering issues, but don't you need to load/check sig->stats again? Otherwise it seems that two threads might both see !sig->stats, both allocate a stats_new, and both unconditionally in turn assign their stats_new to sig->stats. Then the first assignment ends up becoming a memory leak (and any writes through that pointer done by the caller end up in /dev/null...)
Trigger hand too fast. I guess you're thinking sm like:
diff --git a/kernel/taskstats.c b/kernel/taskstats.c index 13a0f2e6ebc2..c4e1ed11e785 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -554,25 +554,27 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info) static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk) { struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;

-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	stats = READ_ONCE(sig->stats);
This probably wants to be an acquire, since both the memcpy() later on in taskstats_exit() and the accesses in {b,x}acct_add_tsk() appear to read from the taskstats structure without the sighand->siglock held and therefore may miss zeroed allocation from the zalloc() below, I think.
+	if (stats || thread_group_empty(tsk))
+		return stats;

-	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);

 	spin_lock_irq(&tsk->sighand->siglock);
-	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+	stats = READ_ONCE(sig->stats);
You hold the spinlock here, so I don't think you need the READ_ONCE().
+	if (!stats) {
+		stats = stats_new;
+		WRITE_ONCE(sig->stats, stats_new);
You probably want a release here to publish the zeroes from the zalloc() (back to my first comment). With those changes:
Reviewed-by: Will Deacon will@kernel.org
Thanks, this is basically what we had in v5. I'll rework and send this after the merge window closes.
However, this caused me to look at do_group_exit() and we appear to have racy accesses on sig->flags there thanks to signal_group_exit(). I worry that might run quite deep, and can probably be looked at separately.
Yeah, we should look into this but separate from this patch.
Thanks for taking a look at this! Much appreciated! Christian
On Mon, Oct 21, 2019 at 01:33:27PM +0200, Christian Brauner wrote:
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0: thread catches fatal signal and whole thread-group gets taken down do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task reads sig->stats without holding sighand lock.
cpu1: task calls exit_group() do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task takes sighand lock and assigns new stats to sig->stats.
The first approach used smp_load_acquire() and smp_store_release(). However, after having discussed this it seems that the data dependency for kmem_cache_alloc() would be fixed by WRITE_ONCE(). Furthermore, the smp_load_acquire() would only manage to order the stats check before the thread_group_empty() check. So it seems just using READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this up for discussion at least.
Mmh, the RELEASE was intended to order the memory initialization in kmem_cache_zalloc() with the later ->stats pointer assignment; AFAICT, there is no data dependency between such memory accesses.
Correspondingly, the ACQUIRE was intended to order the ->stats pointer load with later, _independent dereferences of the same pointer; the latter are, e.g., in taskstats_exit() (but not thread_group_empty()).
Looking again, I see that fill_tgid_exit()'s dereferences of ->stats are protected by ->siglock: maybe you meant to rely on such a critical section pairing with the critical section in taskstats_tgid_alloc()?
That memcpy(-, tsk->signal->stats, -) at the end of taskstats_exit() also bugs me: could these dereferences of ->stats happen concurrently with other stores to the same memory locations?
Thanks, Andrea
On Wed, Oct 23, 2019 at 2:16 PM Andrea Parri parri.andrea@gmail.com wrote:
On Mon, Oct 21, 2019 at 01:33:27PM +0200, Christian Brauner wrote:
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0: thread catches fatal signal and whole thread-group gets taken down do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task reads sig->stats without holding sighand lock.
cpu1: task calls exit_group() do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task takes sighand lock and assigns new stats to sig->stats.
The first approach used smp_load_acquire() and smp_store_release(). However, after having discussed this it seems that the data dependency for kmem_cache_alloc() would be fixed by WRITE_ONCE(). Furthermore, the smp_load_acquire() would only manage to order the stats check before the thread_group_empty() check. So it seems just using READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this up for discussion at least.
Mmh, the RELEASE was intended to order the memory initialization in kmem_cache_zalloc() with the later ->stats pointer assignment; AFAICT, there is no data dependency between such memory accesses.
I agree. This needs smp_store_release. The latest version that I looked at contained: smp_store_release(&sig->stats, stats_new);
Correspondingly, the ACQUIRE was intended to order the ->stats pointer load with later, _independent dereferences of the same pointer; the latter are, e.g., in taskstats_exit() (but not thread_group_empty()).
How these later loads can be completely independent of the pointer value? They need to obtain the pointer value from somewhere. And this can only be done by loaded it. And if a thread loads a pointer and then dereferences that pointer, that's a data/address dependency and we assume this is now covered by READ_ONCE. Or these later loads of the pointer can also race with the store? If so, I think they also need to use READ_ONCE (rather than turn this earlier pointer load into acquire).
Looking again, I see that fill_tgid_exit()'s dereferences of ->stats are protected by ->siglock: maybe you meant to rely on such a critical section pairing with the critical section in taskstats_tgid_alloc()?
That memcpy(-, tsk->signal->stats, -) at the end of taskstats_exit() also bugs me: could these dereferences of ->stats happen concurrently with other stores to the same memory locations?
Thanks, Andrea
On Wed, Oct 23, 2019 at 02:39:55PM +0200, Dmitry Vyukov wrote:
On Wed, Oct 23, 2019 at 2:16 PM Andrea Parri parri.andrea@gmail.com wrote:
On Mon, Oct 21, 2019 at 01:33:27PM +0200, Christian Brauner wrote:
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0: thread catches fatal signal and whole thread-group gets taken down do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task reads sig->stats without holding sighand lock.
cpu1: task calls exit_group() do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task takes sighand lock and assigns new stats to sig->stats.
The first approach used smp_load_acquire() and smp_store_release(). However, after having discussed this it seems that the data dependency for kmem_cache_alloc() would be fixed by WRITE_ONCE(). Furthermore, the smp_load_acquire() would only manage to order the stats check before the thread_group_empty() check. So it seems just using READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this up for discussion at least.
Mmh, the RELEASE was intended to order the memory initialization in kmem_cache_zalloc() with the later ->stats pointer assignment; AFAICT, there is no data dependency between such memory accesses.
I agree. This needs smp_store_release. The latest version that I looked at contained: smp_store_release(&sig->stats, stats_new);
This is what really makes me wonder. Can the compiler really re-order the kmem_cache_zalloc() call with the assignment? If that's really the case then shouldn't all allocation functions have compiler barriers in them? This then seems like a very generic problem.
Correspondingly, the ACQUIRE was intended to order the ->stats pointer load with later, _independent dereferences of the same pointer; the latter are, e.g., in taskstats_exit() (but not thread_group_empty()).
How these later loads can be completely independent of the pointer value? They need to obtain the pointer value from somewhere. And this can only be done by loaded it. And if a thread loads a pointer and then dereferences that pointer, that's a data/address dependency and we assume this is now covered by READ_ONCE. Or these later loads of the pointer can also race with the store? If
To clarify, later loads as in taskstats_exit() and thread_group_empty(), not the later load in the double-checked locking case.
so, I think they also need to use READ_ONCE (rather than turn this earlier pointer load into acquire).
Using READ_ONCE() in the alloc, taskstat_exit(), and thread_group_empty() case.
Christian
On Wed, Oct 23, 2019 at 3:11 PM Christian Brauner christian.brauner@ubuntu.com wrote:
On Wed, Oct 23, 2019 at 02:39:55PM +0200, Dmitry Vyukov wrote:
On Wed, Oct 23, 2019 at 2:16 PM Andrea Parri parri.andrea@gmail.com wrote:
On Mon, Oct 21, 2019 at 01:33:27PM +0200, Christian Brauner wrote:
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0: thread catches fatal signal and whole thread-group gets taken down do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task reads sig->stats without holding sighand lock.
cpu1: task calls exit_group() do_exit() do_group_exit() taskstats_exit() taskstats_tgid_alloc() The task takes sighand lock and assigns new stats to sig->stats.
The first approach used smp_load_acquire() and smp_store_release(). However, after having discussed this it seems that the data dependency for kmem_cache_alloc() would be fixed by WRITE_ONCE(). Furthermore, the smp_load_acquire() would only manage to order the stats check before the thread_group_empty() check. So it seems just using READ_ONCE() and WRITE_ONCE() will do the job and I wanted to bring this up for discussion at least.
Mmh, the RELEASE was intended to order the memory initialization in kmem_cache_zalloc() with the later ->stats pointer assignment; AFAICT, there is no data dependency between such memory accesses.
I agree. This needs smp_store_release. The latest version that I looked at contained: smp_store_release(&sig->stats, stats_new);
This is what really makes me wonder. Can the compiler really re-order the kmem_cache_zalloc() call with the assignment?
Yes. Not sure about compiler, but hardware definitely can. And generally one does not care if it's compiler or hardware.
If that's really the case then shouldn't all allocation functions have compiler barriers in them? This then seems like a very generic problem.
No. One puts memory barriers into synchronization primitives. This equally affects memset's, memcpy's and in fact all normal stores. Adding a memory barrier to every normal store is not the solution to this. The memory barrier is done before publication of the memory. And we already have smp_store_release for this. So if one doesn't publish objects with a plain store (which breaks all possible rules anyways) and uses a proper primitive, there is no problem.
Correspondingly, the ACQUIRE was intended to order the ->stats pointer load with later, independent dereferences of the same pointer; the latter are, e.g., in taskstats_exit() (but not thread_group_empty()).
How can these later loads be completely independent of the pointer value? They need to obtain the pointer value from somewhere. And this can only be done by loading it. And if a thread loads a pointer and then dereferences that pointer, that's a data/address dependency, and we assume this is now covered by READ_ONCE. Or can these later loads of the pointer also race with the store? If
To clarify, later loads as in taskstats_exit() and thread_group_empty(), not the later load in the double-checked locking case.
so, I think they also need to use READ_ONCE (rather than turn this earlier pointer load into acquire).
Using READ_ONCE() in the alloc, taskstats_exit(), and thread_group_empty() case.
Christian
How can these later loads be completely independent of the pointer value? They need to obtain the pointer value from somewhere. And this can only be done by loading it. And if a thread loads a pointer and then dereferences that pointer, that's a data/address dependency, and we assume this is now covered by READ_ONCE.
The "dependency" I was considering here is a dependency between the load of sig->stats in taskstats_tgid_alloc() and the (program-order) later loads of *(sig->stats) in taskstats_exit(). Roughly speaking, such a dependency should correspond to a dependency chain at the asm or registers level from the first load to the later loads; e.g., in:
Thread [register r0 contains the address of sig->stats]
A: LOAD r1,[r0]   // LOAD_ACQUIRE sig->stats
   ...
B: LOAD r2,[r0]   // LOAD *(sig->stats)
C: LOAD r3,[r2]

there would be no such dependency from A to C. Compare, e.g., with:

Thread [register r0 contains the address of sig->stats]

A: LOAD r1,[r0]   // LOAD_ACQUIRE sig->stats
   ...
C: LOAD r3,[r1]   // LOAD *(sig->stats)
AFAICT, there's no guarantee that the compilers will generate such a dependency from the code under discussion.
Or these later loads of the pointer can also race with the store? If so, I think they also need to use READ_ONCE (rather than turn this earlier pointer load into acquire).
AFAICT, _if the LOAD_ACQUIRE reads from the mentioned STORE_RELEASE, then the former must induce enough synchronization to eliminate data races (as well as any undesired re-ordering).
TBH, I am not familiar enough with the underlying logic of this code to say whether that "if .. reads from .." pre-condition holds by the time those *(sig->stats) execute.
Thanks, Andrea
On Thu, Oct 24, 2019 at 1:32 PM Andrea Parri parri.andrea@gmail.com wrote:
How can these later loads be completely independent of the pointer value? They need to obtain the pointer value from somewhere. And this can only be done by loading it. And if a thread loads a pointer and then dereferences that pointer, that's a data/address dependency, and we assume this is now covered by READ_ONCE.
The "dependency" I was considering here is a dependency between the load of sig->stats in taskstats_tgid_alloc() and the (program-order) later loads of *(sig->stats) in taskstats_exit(). Roughly speaking, such a dependency should correspond to a dependency chain at the asm or registers level from the first load to the later loads; e.g., in:
Thread [register r0 contains the address of sig->stats]
A: LOAD r1,[r0]   // LOAD_ACQUIRE sig->stats
   ...
B: LOAD r2,[r0]   // LOAD *(sig->stats)
C: LOAD r3,[r2]

there would be no such dependency from A to C. Compare, e.g., with:

Thread [register r0 contains the address of sig->stats]

A: LOAD r1,[r0]   // LOAD_ACQUIRE sig->stats
   ...
C: LOAD r3,[r1]   // LOAD *(sig->stats)
AFAICT, there's no guarantee that the compilers will generate such a dependency from the code under discussion.
Fixing this by making A ACQUIRE looks like somewhat weird code pattern to me (though correct). B is what loads the address used to read indirect data, so B ought to be ACQUIRE (or LOAD-DEPENDS which we get from READ_ONCE).
What you are suggesting is:
addr = ptr.load(memory_order_acquire);
if (addr) {
        addr = ptr.load(memory_order_relaxed);
        data = *addr;
}

whereas the canonical/non-convoluted form of this pattern is:

addr = ptr.load(memory_order_consume);
if (addr)
        data = *addr;
Moreover, the second load of ptr is not even atomic in our case, so is it subject to another data race?
On Thu, Oct 24, 2019 at 01:51:20PM +0200, Dmitry Vyukov wrote:
On Thu, Oct 24, 2019 at 1:32 PM Andrea Parri parri.andrea@gmail.com wrote:
How can these later loads be completely independent of the pointer value? They need to obtain the pointer value from somewhere. And this can only be done by loading it. And if a thread loads a pointer and then dereferences that pointer, that's a data/address dependency, and we assume this is now covered by READ_ONCE.
The "dependency" I was considering here is a dependency between the load of sig->stats in taskstats_tgid_alloc() and the (program-order) later loads of *(sig->stats) in taskstats_exit(). Roughly speaking, such a dependency should correspond to a dependency chain at the asm or registers level from the first load to the later loads; e.g., in:
Thread [register r0 contains the address of sig->stats]
A: LOAD r1,[r0]   // LOAD_ACQUIRE sig->stats
   ...
B: LOAD r2,[r0]   // LOAD *(sig->stats)
C: LOAD r3,[r2]

there would be no such dependency from A to C. Compare, e.g., with:

Thread [register r0 contains the address of sig->stats]

A: LOAD r1,[r0]   // LOAD_ACQUIRE sig->stats
   ...
C: LOAD r3,[r1]   // LOAD *(sig->stats)
AFAICT, there's no guarantee that the compilers will generate such a dependency from the code under discussion.
Fixing this by making A ACQUIRE looks like somewhat weird code pattern to me (though correct). B is what loads the address used to read indirect data, so B ought to be ACQUIRE (or LOAD-DEPENDS which we get from READ_ONCE).
What you are suggesting is:
addr = ptr.load(memory_order_acquire);
if (addr) {
        addr = ptr.load(memory_order_relaxed);
        data = *addr;
}

whereas the canonical/non-convoluted form of this pattern is:

addr = ptr.load(memory_order_consume);
if (addr)
        data = *addr;

No, I'd rather be suggesting:

addr = ptr.load(memory_order_acquire);
if (addr)
        data = *addr;

since I'd not expect any form of encouragement to rely on "consume" or on "READ_ONCE() + true-address-dependency" from myself. ;-)

IAC, v6 looks more like:

addr = ptr.load(memory_order_consume);
if (!!addr)
        *ptr = 1;
data = *ptr;
to me (hence my comments/questions ...).
Thanks, Andrea
On Thu, Oct 24, 2019 at 3:05 PM Andrea Parri parri.andrea@gmail.com wrote:
On Thu, Oct 24, 2019 at 01:51:20PM +0200, Dmitry Vyukov wrote:
On Thu, Oct 24, 2019 at 1:32 PM Andrea Parri parri.andrea@gmail.com wrote:
How can these later loads be completely independent of the pointer value? They need to obtain the pointer value from somewhere. And this can only be done by loading it. And if a thread loads a pointer and then dereferences that pointer, that's a data/address dependency, and we assume this is now covered by READ_ONCE.
The "dependency" I was considering here is a dependency between the load of sig->stats in taskstats_tgid_alloc() and the (program-order) later loads of *(sig->stats) in taskstats_exit(). Roughly speaking, such a dependency should correspond to a dependency chain at the asm or registers level from the first load to the later loads; e.g., in:
Thread [register r0 contains the address of sig->stats]
A: LOAD r1,[r0]   // LOAD_ACQUIRE sig->stats
   ...
B: LOAD r2,[r0]   // LOAD *(sig->stats)
C: LOAD r3,[r2]

there would be no such dependency from A to C. Compare, e.g., with:

Thread [register r0 contains the address of sig->stats]

A: LOAD r1,[r0]   // LOAD_ACQUIRE sig->stats
   ...
C: LOAD r3,[r1]   // LOAD *(sig->stats)
AFAICT, there's no guarantee that the compilers will generate such a dependency from the code under discussion.
Fixing this by making A ACQUIRE looks like somewhat weird code pattern to me (though correct). B is what loads the address used to read indirect data, so B ought to be ACQUIRE (or LOAD-DEPENDS which we get from READ_ONCE).
What you are suggesting is:
addr = ptr.load(memory_order_acquire);
if (addr) {
        addr = ptr.load(memory_order_relaxed);
        data = *addr;
}

whereas the canonical/non-convoluted form of this pattern is:

addr = ptr.load(memory_order_consume);
if (addr)
        data = *addr;

No, I'd rather be suggesting:

addr = ptr.load(memory_order_acquire);
if (addr)
        data = *addr;

since I'd not expect any form of encouragement to rely on "consume" or on "READ_ONCE() + true-address-dependency" from myself. ;-)

But why? I think kernel contains lots of such cases and it seems to be officially documented by the LKMM: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tool... address dependencies and ppo Am I missing something? Why do we need acquire with address dependency?

IAC, v6 looks more like:

addr = ptr.load(memory_order_consume);
if (!!addr)
        *ptr = 1;
data = *ptr;
to me (hence my comments/questions ...).
Thanks, Andrea
On Thu, Oct 24, 2019 at 03:13:48PM +0200, Dmitry Vyukov wrote:
On Thu, Oct 24, 2019 at 3:05 PM Andrea Parri parri.andrea@gmail.com wrote:
On Thu, Oct 24, 2019 at 01:51:20PM +0200, Dmitry Vyukov wrote:
On Thu, Oct 24, 2019 at 1:32 PM Andrea Parri parri.andrea@gmail.com wrote:
How can these later loads be completely independent of the pointer value? They need to obtain the pointer value from somewhere. And this can only be done by loading it. And if a thread loads a pointer and then dereferences that pointer, that's a data/address dependency, and we assume this is now covered by READ_ONCE.
The "dependency" I was considering here is a dependency between the load of sig->stats in taskstats_tgid_alloc() and the (program-order) later loads of *(sig->stats) in taskstats_exit(). Roughly speaking, such a dependency should correspond to a dependency chain at the asm or registers level from the first load to the later loads; e.g., in:
Thread [register r0 contains the address of sig->stats]
A: LOAD r1,[r0]   // LOAD_ACQUIRE sig->stats
   ...
B: LOAD r2,[r0]   // LOAD *(sig->stats)
C: LOAD r3,[r2]

there would be no such dependency from A to C. Compare, e.g., with:

Thread [register r0 contains the address of sig->stats]

A: LOAD r1,[r0]   // LOAD_ACQUIRE sig->stats
   ...
C: LOAD r3,[r1]   // LOAD *(sig->stats)
AFAICT, there's no guarantee that the compilers will generate such a dependency from the code under discussion.
Fixing this by making A ACQUIRE looks like somewhat weird code pattern to me (though correct). B is what loads the address used to read indirect data, so B ought to be ACQUIRE (or LOAD-DEPENDS which we get from READ_ONCE).
What you are suggesting is:
addr = ptr.load(memory_order_acquire);
if (addr) {
        addr = ptr.load(memory_order_relaxed);
        data = *addr;
}

whereas the canonical/non-convoluted form of this pattern is:

addr = ptr.load(memory_order_consume);
if (addr)
        data = *addr;

No, I'd rather be suggesting:

addr = ptr.load(memory_order_acquire);
if (addr)
        data = *addr;
since I'd not expect any form of encouragement to rely on "consume" or on "READ_ONCE() + true-address-dependency" from myself. ;-)
But why? I think kernel contains lots of such cases and it seems to be officially documented by the LKMM: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tool... address dependencies and ppo
You mean this section: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tool... and specifically: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tool... ?
On Thu, Oct 24, 2019 at 3:21 PM Christian Brauner christian.brauner@ubuntu.com wrote:
On Thu, Oct 24, 2019 at 03:13:48PM +0200, Dmitry Vyukov wrote:
On Thu, Oct 24, 2019 at 3:05 PM Andrea Parri parri.andrea@gmail.com wrote:
On Thu, Oct 24, 2019 at 01:51:20PM +0200, Dmitry Vyukov wrote:
On Thu, Oct 24, 2019 at 1:32 PM Andrea Parri parri.andrea@gmail.com wrote:
How can these later loads be completely independent of the pointer value? They need to obtain the pointer value from somewhere. And this can only be done by loading it. And if a thread loads a pointer and then dereferences that pointer, that's a data/address dependency, and we assume this is now covered by READ_ONCE.
The "dependency" I was considering here is a dependency between the load of sig->stats in taskstats_tgid_alloc() and the (program-order) later loads of *(sig->stats) in taskstats_exit(). Roughly speaking, such a dependency should correspond to a dependency chain at the asm or registers level from the first load to the later loads; e.g., in:
Thread [register r0 contains the address of sig->stats]
A: LOAD r1,[r0]   // LOAD_ACQUIRE sig->stats
   ...
B: LOAD r2,[r0]   // LOAD *(sig->stats)
C: LOAD r3,[r2]

there would be no such dependency from A to C. Compare, e.g., with:

Thread [register r0 contains the address of sig->stats]

A: LOAD r1,[r0]   // LOAD_ACQUIRE sig->stats
   ...
C: LOAD r3,[r1]   // LOAD *(sig->stats)
AFAICT, there's no guarantee that the compilers will generate such a dependency from the code under discussion.
Fixing this by making A ACQUIRE looks like somewhat weird code pattern to me (though correct). B is what loads the address used to read indirect data, so B ought to be ACQUIRE (or LOAD-DEPENDS which we get from READ_ONCE).
What you are suggesting is:
addr = ptr.load(memory_order_acquire);
if (addr) {
        addr = ptr.load(memory_order_relaxed);
        data = *addr;
}

whereas the canonical/non-convoluted form of this pattern is:

addr = ptr.load(memory_order_consume);
if (addr)
        data = *addr;

No, I'd rather be suggesting:

addr = ptr.load(memory_order_acquire);
if (addr)
        data = *addr;
since I'd not expect any form of encouragement to rely on "consume" or on "READ_ONCE() + true-address-dependency" from myself. ;-)
But why? I think kernel contains lots of such cases and it seems to be officially documented by the LKMM: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tool... address dependencies and ppo
You mean this section: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tool... and specifically: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tool... ?
Yes, and also this: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tool...
But why? I think kernel contains lots of such cases and it seems to be officially documented by the LKMM: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tool... address dependencies and ppo
Well, that same documentation also alerts about some of the pitfalls developers can incur while relying on dependencies. I'm sure you're more than aware of some of the debate surrounding these issues.
Andrea
On Thu, Oct 24, 2019 at 3:43 PM Andrea Parri parri.andrea@gmail.com wrote:
But why? I think kernel contains lots of such cases and it seems to be officially documented by the LKMM: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tool... address dependencies and ppo
Well, that same documentation also alerts about some of the pitfalls developers can incur while relying on dependencies. I'm sure you're more than aware of some of the debate surrounding these issues.
I thought that LKMM is finally supposed to stop all these centi-threads around subtle details of ordering. And now we finally have it. And it says that using address-dependencies is legal. And you are one of the authors. And now you are arguing here that we better not use it :) Can we have some black/white yes/no for code correctness reflected in LKMM please :) If we are banning address dependencies, don't we need to fix all of rcu uses?
On Thu, Oct 24, 2019 at 03:58:40PM +0200, Dmitry Vyukov wrote:
On Thu, Oct 24, 2019 at 3:43 PM Andrea Parri parri.andrea@gmail.com wrote:
But why? I think kernel contains lots of such cases and it seems to be officially documented by the LKMM: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tool... address dependencies and ppo
Well, that same documentation also alerts about some of the pitfalls developers can incur while relying on dependencies. I'm sure you're more than aware of some of the debate surrounding these issues.
I thought that LKMM is finally supposed to stop all these centi-threads around subtle details of ordering. And now we finally have it. And it says that using address-dependencies is legal. And you are one of the authors. And now you are arguing here that we better not use it :) Can we have some black/white yes/no for code correctness reflected in LKMM please :) If we are banning address dependencies, don't we need to fix all of rcu uses?
Current limitations of the LKMM are listed in tools/memory-model/README (and I myself discussed a number of them at LPC recently); the relevant point here seems to be:
1. Compiler optimizations are not accurately modeled. Of course, the use of READ_ONCE() and WRITE_ONCE() limits the compiler's ability to optimize, but under some circumstances it is possible for the compiler to undermine the memory model. [...]
Note that this limitation in turn limits LKMM's ability to accurately model address, control, and data dependencies.
A less elegant, but hopefully more effective, way to phrase such point is maybe "feel free to rely on dependencies, but then do not blame the LKMM authors please". ;-)
Thanks, Andrea
On Thu, Oct 24, 2019 at 4:40 PM Andrea Parri parri.andrea@gmail.com wrote:
On Thu, Oct 24, 2019 at 03:58:40PM +0200, Dmitry Vyukov wrote:
On Thu, Oct 24, 2019 at 3:43 PM Andrea Parri parri.andrea@gmail.com wrote:
But why? I think kernel contains lots of such cases and it seems to be officially documented by the LKMM: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tool... address dependencies and ppo
Well, that same documentation also alerts about some of the pitfalls developers can incur while relying on dependencies. I'm sure you're more than aware of some of the debate surrounding these issues.
I thought that LKMM is finally supposed to stop all these centi-threads around subtle details of ordering. And now we finally have it. And it says that using address-dependencies is legal. And you are one of the authors. And now you are arguing here that we better not use it :) Can we have some black/white yes/no for code correctness reflected in LKMM please :) If we are banning address dependencies, don't we need to fix all of rcu uses?
Current limitations of the LKMM are listed in tools/memory-model/README (and I myself discussed a number of them at LPC recently); the relevant point here seems to be:
Compiler optimizations are not accurately modeled. Of course, the use of READ_ONCE() and WRITE_ONCE() limits the compiler's ability to optimize, but under some circumstances it is possible for the compiler to undermine the memory model. [...] Note that this limitation in turn limits LKMM's ability to accurately model address, control, and data dependencies.
A less elegant, but hopefully more effective, way to phrase such point is maybe "feel free to rely on dependencies, but then do not blame the LKMM authors please". ;-)
We are not going to blame LKMM authors :)
Acquire will introduce an actual hardware barrier on arm/power/etc. Maybe it does not matter here. But I feel that if we start replacing all load-depends/rcu uses with acquire, it will be a noticeable overhead. So what do we do in the context of the whole kernel?
On Mon, Oct 21, 2019 at 01:33:27PM +0200, Christian Brauner wrote:
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:
cpu0:
thread catches fatal signal and whole thread-group gets taken down
 do_exit()
 do_group_exit()
Nit: I don't think this is the signal-handling path.
 taskstats_exit()
 taskstats_tgid_alloc()
The task reads sig->stats without holding the sighand lock.
cpu1:
task calls exit_group()
 do_exit()
 do_group_exit()
Nit: These ^^ seem to be the wrong way round.
Will
On Wed, 9 Oct 2019 at 13:31, Christian Brauner christian.brauner@ubuntu.com wrote:
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:

cpu0:
thread catches fatal signal and whole thread-group gets taken down
 do_exit()
 do_group_exit()
 taskstats_exit()
 taskstats_tgid_alloc()
The task reads sig->stats without holding the sighand lock, seeing garbage.

cpu1:
task calls exit_group()
 do_exit()
 do_group_exit()
 taskstats_exit()
 taskstats_tgid_alloc()
The task takes the sighand lock and assigns new stats to sig->stats.
Fix this by using smp_load_acquire() and smp_store_release().
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation") Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner christian.brauner@ubuntu.com Reviewed-by: Dmitry Vyukov dvyukov@google.com
/* v1 */ Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.co...
/* v2 */ Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
- Dmitry Vyukov dvyukov@google.com, Marco Elver elver@google.com:
- fix the original double-checked locking using memory barriers
/* v3 */ Link: https://lore.kernel.org/r/20191007110117.1096-1-christian.brauner@ubuntu.com...
- Andrea Parri parri.andrea@gmail.com:
- document memory barriers to make checkpatch happy
/* v4 */
- Andrea Parri parri.andrea@gmail.com:
- use smp_load_acquire(), not READ_ONCE()
- update commit message
Acked-by: Marco Elver elver@google.com
Note that this now looks almost like what I suggested, except the return at the end of the function is accessing sig->stats again. In this case, it seems it's fine assuming sig->stats cannot be written elsewhere. Just wanted to point it out to make sure it's considered.
Thanks!
 kernel/taskstats.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index 13a0f2e6ebc2..e6b45315607a 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -554,24 +554,30 @@ static int taskstats_user_cmd(struct sk_buff *skb, struct genl_info *info)
 static struct taskstats *taskstats_tgid_alloc(struct task_struct *tsk)
 {
 	struct signal_struct *sig = tsk->signal;
-	struct taskstats *stats;
+	struct taskstats *stats_new, *stats;
 
-	if (sig->stats || thread_group_empty(tsk))
-		goto ret;
+	/* Pairs with smp_store_release() below. */
+	stats = smp_load_acquire(sig->stats);
+	if (stats || thread_group_empty(tsk))
+		return stats;
 
 	/* No problem if kmem_cache_zalloc() fails */
-	stats = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
+	stats_new = kmem_cache_zalloc(taskstats_cache, GFP_KERNEL);
 
 	spin_lock_irq(&tsk->sighand->siglock);
 	if (!sig->stats) {
-		sig->stats = stats;
-		stats = NULL;
+		/*
+		 * Pairs with smp_store_release() above and order the
+		 * kmem_cache_zalloc().
+		 */
+		smp_store_release(&sig->stats, stats_new);
+		stats_new = NULL;
 	}
 	spin_unlock_irq(&tsk->sighand->siglock);
 
-	if (stats)
-		kmem_cache_free(taskstats_cache, stats);
-ret:
+	if (stats_new)
+		kmem_cache_free(taskstats_cache, stats_new);
+
 	return sig->stats;
 }
-- 2.23.0
On Wed, Oct 09, 2019 at 01:48:27PM +0200, Marco Elver wrote:
On Wed, 9 Oct 2019 at 13:31, Christian Brauner christian.brauner@ubuntu.com wrote:
When assigning and testing taskstats in taskstats_exit() there's a race when writing and reading sig->stats when a thread-group with more than one thread exits:

cpu0:
thread catches fatal signal and whole thread-group gets taken down
 do_exit()
 do_group_exit()
 taskstats_exit()
 taskstats_tgid_alloc()
The task reads sig->stats without holding the sighand lock, seeing garbage.

cpu1:
task calls exit_group()
 do_exit()
 do_group_exit()
 taskstats_exit()
 taskstats_tgid_alloc()
The task takes the sighand lock and assigns new stats to sig->stats.
Fix this by using smp_load_acquire() and smp_store_release().
Reported-by: syzbot+c5d03165a1bd1dead0c1@syzkaller.appspotmail.com Fixes: 34ec12349c8a ("taskstats: cleanup ->signal->stats allocation") Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner christian.brauner@ubuntu.com Reviewed-by: Dmitry Vyukov dvyukov@google.com
/* v1 */ Link: https://lore.kernel.org/r/20191005112806.13960-1-christian.brauner@ubuntu.co...
/* v2 */ Link: https://lore.kernel.org/r/20191006235216.7483-1-christian.brauner@ubuntu.com
- Dmitry Vyukov dvyukov@google.com, Marco Elver elver@google.com:
- fix the original double-checked locking using memory barriers
/* v3 */ Link: https://lore.kernel.org/r/20191007110117.1096-1-christian.brauner@ubuntu.com...
- Andrea Parri parri.andrea@gmail.com:
- document memory barriers to make checkpatch happy
/* v4 */
- Andrea Parri parri.andrea@gmail.com:
- use smp_load_acquire(), not READ_ONCE()
- update commit message
Acked-by: Marco Elver elver@google.com
Note that this now looks almost like what I suggested, except the
Right, I think we all just needed to get our heads clear about what exactly is happening here. This codepath is not a very prominent one. :)
return at the end of the function is accessing sig->stats again. In this case, it seems it's fine assuming sig->stats cannot be written elsewhere. Just wanted to point it out to make sure it's considered.
Yes, I considered that but thanks for mentioning it.
Note that this patch has a bug. It should be smp_load_acquire(&sig->stats) and not smp_load_acquire(sig->stats). I accidentally didn't automatically recompile the patchset after the last change I made. Andrea thankfully caught this.
Thanks! Christian