On 12/1/22 08:44, Will Deacon wrote:
On Sun, Nov 27, 2022 at 08:44:41PM -0500, Waiman Long wrote:
Since commit 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on asymmetric systems"), the setting and clearing of user_cpus_ptr are done under pi_lock on the arm64 architecture. However, dup_user_cpus_ptr() accesses user_cpus_ptr without any lock protection. When racing with the clearing of user_cpus_ptr in __set_cpus_allowed_ptr_locked(), it can lead to use-after-free and double-free in arm64 kernels.
Commit 8f9ea86fdf99 ("sched: Always preserve the user requested cpumask") fixes this problem as user_cpus_ptr, once set, will never be cleared in a task's lifetime. However, this bug was re-introduced in commit 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()") which allows the clearing of user_cpus_ptr in do_set_cpus_allowed(). This time, it will affect all arches.
Fix this bug by always clearing the user_cpus_ptr of the newly cloned/forked task before the copying process starts and checking the user_cpus_ptr state of the source task under pi_lock.
Note to stable, this patch won't be applicable to stable releases. Just copy the new dup_user_cpus_ptr() function over.
Fixes: 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on asymmetric systems")
Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
CC: stable@vger.kernel.org
Reported-by: David Wang 王标 <wangbiao3@xiaomi.com>
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/sched/core.c | 32 ++++++++++++++++++++++++++++----
 1 file changed, 28 insertions(+), 4 deletions(-)
As per my comments on the previous version of this patch:
https://lore.kernel.org/lkml/20221201133602.GB28489@willie-the-truck/T/#t
I think there are other issues to fix when racing affinity changes with fork() too.
It is certainly possible that there are other bugs hiding somewhere :-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8df51b08bb38..f2b75faaf71a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2624,19 +2624,43 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
 int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
 		      int node)
 {
+	cpumask_t *user_mask;
 	unsigned long flags;
+	/*
+	 * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
+	 * may differ by now due to racing.
+	 */
+	dst->user_cpus_ptr = NULL;
+	/*
+	 * This check is racy and losing the race is a valid situation.
+	 * It is not worth the extra overhead of taking the pi_lock on
+	 * every fork/clone.
+	 */
 	if (!src->user_cpus_ptr)
 		return 0;
data_race() ?
A race is certainly possible, but clearing user_cpus_ptr beforehand mitigates any risk.
-	dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
-	if (!dst->user_cpus_ptr)
+	user_mask = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
+	if (!user_mask)
 		return -ENOMEM;
-	/* Use pi_lock to protect content of user_cpus_ptr */
+	/*
+	 * Use pi_lock to protect content of user_cpus_ptr
+	 *
+	 * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
+	 * do_set_cpus_allowed().
+	 */
 	raw_spin_lock_irqsave(&src->pi_lock, flags);
-	cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
+	if (src->user_cpus_ptr) {
+		swap(dst->user_cpus_ptr, user_mask);
Isn't 'dst->user_cpus_ptr' always NULL here? Why do we need the swap() instead of just assigning the thing directly?
True. We still need to clear user_mask, so I used swap() instead of two assignment statements. I am fine with either approach.
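To make the two options concrete, here is a minimal userspace sketch, not the kernel code itself. It assumes a pthread mutex standing in for pi_lock and a plain byte buffer standing in for the cpumask; struct task, MASK_BYTES, dup_mask_swap() and dup_mask_assign() are hypothetical names used only for illustration. Both variants clear the destination pointer first, allocate outside the lock, recheck the source under the lock, and leave the temporary NULL once ownership has moved to the destination.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define MASK_BYTES 16	/* hypothetical stand-in for cpumask_size() */

struct task {				/* hypothetical stand-in for task_struct */
	pthread_mutex_t lock;		/* plays the role of pi_lock */
	unsigned char *user_mask;	/* plays the role of user_cpus_ptr */
};

/* Variant A: exchange the two pointers, as swap() does in the patch. */
static int dup_mask_swap(struct task *dst, struct task *src)
{
	unsigned char *new_mask, *tmp;

	dst->user_mask = NULL;		/* never inherit the parent's pointer */
	if (!src->user_mask)		/* unlocked peek; rechecked under the lock */
		return 0;

	new_mask = malloc(MASK_BYTES);	/* allocate before taking the lock */
	if (!new_mask)
		return -1;

	pthread_mutex_lock(&src->lock);
	if (src->user_mask) {
		tmp = dst->user_mask;	/* open-coded swap(dst->user_mask, new_mask) */
		dst->user_mask = new_mask;
		new_mask = tmp;		/* NULL, because dst was cleared above */
		memcpy(dst->user_mask, src->user_mask, MASK_BYTES);
	}
	pthread_mutex_unlock(&src->lock);

	free(new_mask);			/* frees only if src was cleared meanwhile */
	return 0;
}

/* Variant B: two plain assignments; same result when dst starts out NULL. */
static int dup_mask_assign(struct task *dst, struct task *src)
{
	unsigned char *new_mask;

	dst->user_mask = NULL;
	if (!src->user_mask)
		return 0;

	new_mask = malloc(MASK_BYTES);
	if (!new_mask)
		return -1;

	pthread_mutex_lock(&src->lock);
	if (src->user_mask) {
		dst->user_mask = new_mask;	/* hand ownership to dst */
		new_mask = NULL;		/* so the free() below is a no-op */
		memcpy(dst->user_mask, src->user_mask, MASK_BYTES);
	}
	pthread_mutex_unlock(&src->lock);

	free(new_mask);
	return 0;
}
```

Since the destination pointer is always NULL at the point of the exchange, the two variants behave identically; the swap form just folds "assign" and "clear the temporary" into one statement.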
Cheers, Longman