Hi all,
This is another resend of several task->mm fixes for bugs I found during an LMK code audit. Architectures were traversing the tasklist in an unsafe manner, and there are a few cases of unsafe access to task->mm in general.
There were no objections to the previous resend, and the final words were along the lines of "the patches are fine".
In v3:
- Dropped the controversial 'Make find_lock_task_mm() sparse-aware' patch;
- Reworded the arm and sh commit messages, per Oleg Nesterov's suggestions;
- Added an optimization trick to clear_tasks_mm_cpumask(): take only the RCU read lock, no need for the whole tasklist_lock. Suggested by Peter Zijlstra.

In v2:
- Introduced a small helper in kernel/cpu.c: most arches duplicate the same [buggy] code snippet, so it is better to fix it once and move the logic into a common function.
Thanks,
Many architectures clear tasks' mm_cpumask like this:
	read_lock(&tasklist_lock);
	for_each_process(p) {
		if (p->mm)
			cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
	}
	read_unlock(&tasklist_lock);
Depending on the context, the code above may have several problems, such as:
1. Working with task->mm without getting the mm or grabbing the task lock is dangerous, as ->mm might disappear (exit_mm() assigns NULL under task_lock(), so tasklist_lock is not enough; see the sketch right after this list).
2. Checking for process->mm is not enough, because the process' main thread may exit or detach its mm via use_mm(), while other threads may still have a valid mm.
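For reference, a minimal sketch of the relevant part of exit_mm() (abridged from kernel/exit.c of this era; the elisions and comments are mine) shows why task_lock(), not tasklist_lock, is the lock that matters:

	static void exit_mm(struct task_struct *tsk)
	{
		struct mm_struct *mm = tsk->mm;
		...
		task_lock(tsk);
		tsk->mm = NULL;		/* a mere tasklist_lock holder can race with this */
		...
		task_unlock(tsk);
		...
		mmput(mm);		/* may sleep */
	}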
This patch implements a small helper function that does things correctly, i.e.:
1. We take the task's lock while we handle its mm (we can't use the get_task_mm()/mmput() pair, as mmput() might sleep);
2. To catch the exited-main-thread case, we use find_lock_task_mm(), which walks up all threads and returns an appropriate task (with the task lock held); see the sketch right below.
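For reference, a sketch of find_lock_task_mm() as it lives in mm/oom_kill.c around this time (the comments are mine):

	struct task_struct *find_lock_task_mm(struct task_struct *p)
	{
		struct task_struct *t = p;

		do {
			task_lock(t);
			if (likely(t->mm))
				return t;	/* the task lock stays held */
			task_unlock(t);
		} while_each_thread(p, t);

		return NULL;	/* no thread has a valid mm */
	}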
Also, per Peter Zijlstra's idea, the new helper does not grab tasklist_lock; instead, it takes the RCU read lock. We can do this because the function is called after the cpu is taken down and marked offline, so no new tasks will get this cpu set in their mm mask.
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
---
 include/linux/cpu.h |    1 +
 kernel/cpu.c        |   26 ++++++++++++++++++++++++++
 2 files changed, 27 insertions(+)
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index ee28844..d2ca49f 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -179,6 +179,7 @@ extern void put_online_cpus(void);
 #define hotcpu_notifier(fn, pri)	cpu_notifier(fn, pri)
 #define register_hotcpu_notifier(nb)	register_cpu_notifier(nb)
 #define unregister_hotcpu_notifier(nb)	unregister_cpu_notifier(nb)
+void clear_tasks_mm_cpumask(int cpu);
 int cpu_down(unsigned int cpu);
 
 #ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 2060c6e..ecdf499 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -10,6 +10,8 @@
 #include <linux/sched.h>
 #include <linux/unistd.h>
 #include <linux/cpu.h>
+#include <linux/oom.h>
+#include <linux/rcupdate.h>
 #include <linux/export.h>
 #include <linux/kthread.h>
 #include <linux/stop_machine.h>
@@ -171,6 +173,30 @@ void __ref unregister_cpu_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_cpu_notifier);
 
+void clear_tasks_mm_cpumask(int cpu)
+{
+	struct task_struct *p;
+
+	/*
+	 * This function is called after the cpu is taken down and marked
+	 * offline, so its not like new tasks will ever get this cpu set in
+	 * their mm mask. -- Peter Zijlstra
+	 * Thus, we may use rcu_read_lock() here, instead of grabbing
+	 * full-fledged tasklist_lock.
+	 */
+	rcu_read_lock();
+	for_each_process(p) {
+		struct task_struct *t;
+
+		t = find_lock_task_mm(p);
+		if (!t)
+			continue;
+		cpumask_clear_cpu(cpu, mm_cpumask(t->mm));
+		task_unlock(t);
+	}
+	rcu_read_unlock();
+}
+
 static inline void check_for_tasks(int cpu)
 {
 	struct task_struct *p;
On Mon, 23 Apr 2012 00:07:36 -0700 Anton Vorontsov <anton.vorontsov@linaro.org> wrote:
> Many architectures clear tasks' mm_cpumask like this:
>
> 	read_lock(&tasklist_lock);
> 	for_each_process(p) {
> 		if (p->mm)
> 			cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
> 	}
> 	read_unlock(&tasklist_lock);
>
> Depending on the context, the code above may have several problems, such as:
>
> 1. Working with task->mm without getting the mm or grabbing the task lock is dangerous, as ->mm might disappear (exit_mm() assigns NULL under task_lock(), so tasklist_lock is not enough).
>
> 2. Checking for process->mm is not enough, because the process' main thread may exit or detach its mm via use_mm(), while other threads may still have a valid mm.
>
> This patch implements a small helper function that does things correctly, i.e.:
>
> 1. We take the task's lock while we handle its mm (we can't use the get_task_mm()/mmput() pair, as mmput() might sleep);
>
> 2. To catch the exited-main-thread case, we use find_lock_task_mm(), which walks up all threads and returns an appropriate task (with the task lock held).
>
> Also, per Peter Zijlstra's idea, the new helper does not grab tasklist_lock; instead, it takes the RCU read lock. We can do this because the function is called after the cpu is taken down and marked offline, so no new tasks will get this cpu set in their mm mask.
Seems reasonable.
> --- a/include/linux/cpu.h
> +++ b/include/linux/cpu.h
> @@ -179,6 +179,7 @@ extern void put_online_cpus(void);
>  #define hotcpu_notifier(fn, pri)	cpu_notifier(fn, pri)
>  #define register_hotcpu_notifier(nb)	register_cpu_notifier(nb)
>  #define unregister_hotcpu_notifier(nb)	unregister_cpu_notifier(nb)
> +void clear_tasks_mm_cpumask(int cpu);
>  int cpu_down(unsigned int cpu);
>  
>  #ifdef CONFIG_ARCH_CPU_PROBE_RELEASE
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 2060c6e..ecdf499 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -10,6 +10,8 @@
>  #include <linux/sched.h>
>  #include <linux/unistd.h>
>  #include <linux/cpu.h>
> +#include <linux/oom.h>
> +#include <linux/rcupdate.h>
>  #include <linux/export.h>
>  #include <linux/kthread.h>
>  #include <linux/stop_machine.h>
> @@ -171,6 +173,30 @@ void __ref unregister_cpu_notifier(struct notifier_block *nb)
>  }
>  EXPORT_SYMBOL(unregister_cpu_notifier);
>  
> +void clear_tasks_mm_cpumask(int cpu)
The operation of this function was presumably obvious to you at the time you wrote it, but that isn't true of other people at later times.
Please document it?
> +{
> +	struct task_struct *p;
> +
> +	/*
> +	 * This function is called after the cpu is taken down and marked
> +	 * offline,
hm, well. Who said that this function will only ever be called after that CPU was taken down? There is nothing in the function name nor in the (absent) documentation which enforces this precondition.
If someone tries to use this function for a different purpose, or copies-and-modifies it for a different purpose, we just shot them in the foot.
They'd be pretty dumb to do that without reading the local comment, but still...
> +	 * ... so its not like new tasks will ever get this cpu set in
> +	 * their mm mask. -- Peter Zijlstra
> +	 * Thus, we may use rcu_read_lock() here, instead of grabbing
> +	 * full-fledged tasklist_lock.
> +	 */
> +	rcu_read_lock();
> +	for_each_process(p) {
> +		struct task_struct *t;
> +
> +		t = find_lock_task_mm(p);
> +		if (!t)
> +			continue;
> +		cpumask_clear_cpu(cpu, mm_cpumask(t->mm));
> +		task_unlock(t);
> +	}
> +	rcu_read_unlock();
> +}
It is good that this code exists under CONFIG_HOTPLUG_CPU. Did you check that everything works correctly with CONFIG_HOTPLUG_CPU=n?
On Thu, 2012-04-26 at 16:59 -0700, Andrew Morton wrote:
> > +void clear_tasks_mm_cpumask(int cpu)
> The operation of this function was presumably obvious to you at the time you wrote it, but that isn't true of other people at later times.
>
> Please document it?
> > +{
> > +	struct task_struct *p;
> > +
> > +	/*
> > +	 * This function is called after the cpu is taken down and marked
> > +	 * offline,
> hm, well. Who said that this function will only ever be called after that CPU was taken down? There is nothing in the function name nor in the (absent) documentation which enforces this precondition.
>
> If someone tries to use this function for a different purpose, or copies-and-modifies it for a different purpose, we just shot them in the foot.
>
> They'd be pretty dumb to do that without reading the local comment, but still...
Methinks something simple like:
	WARN_ON(cpu_online(cpu));
Ought to cure that worry, no? :-)
> > +	 * ... so its not like new tasks will ever get this cpu set in
> > +	 * their mm mask. -- Peter Zijlstra
> > +	 * Thus, we may use rcu_read_lock() here, instead of grabbing
> > +	 * full-fledged tasklist_lock.
> > +	 */
> > +	rcu_read_lock();
> > +	for_each_process(p) {
> > +		struct task_struct *t;
> > +
> > +		t = find_lock_task_mm(p);
> > +		if (!t)
> > +			continue;
> > +		cpumask_clear_cpu(cpu, mm_cpumask(t->mm));
> > +		task_unlock(t);
> > +	}
> > +	rcu_read_unlock();
> > +}
This patch adds more comments on clear_tasks_mm_cpumask, plus adds a runtime check: the function is only suitable for offlined CPUs, and if called inappropriately, the kernel should scream aloud.
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
---
On Tue, May 01, 2012 at 12:45:33PM +0200, Peter Zijlstra wrote:
> On Thu, 2012-04-26 at 16:59 -0700, Andrew Morton wrote:
> > > +void clear_tasks_mm_cpumask(int cpu)
> >
> > The operation of this function was presumably obvious to you at the time you wrote it, but that isn't true of other people at later times.
> >
> > Please document it?
[...]
> > If someone tries to use this function for a different purpose, or copies-and-modifies it for a different purpose, we just shot them in the foot.
> >
> > They'd be pretty dumb to do that without reading the local comment, but still...
>
> Methinks something simple like:
>
> 	WARN_ON(cpu_online(cpu));
>
> Ought to cure that worry, no? :-)
Yeah, this is all good ideas, thanks.
How about the following patch?
 kernel/cpu.c |   18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index ecdf499..1bfa26f 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -13,6 +13,7 @@
 #include <linux/oom.h>
 #include <linux/rcupdate.h>
 #include <linux/export.h>
+#include <linux/bug.h>
 #include <linux/kthread.h>
 #include <linux/stop_machine.h>
 #include <linux/mutex.h>
@@ -173,6 +174,18 @@ void __ref unregister_cpu_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_cpu_notifier);
 
+/**
+ * clear_tasks_mm_cpumask - Safely clear tasks' mm_cpumask for a CPU
+ * @cpu: a CPU id
+ *
+ * This function walks up all processes, finds a valid mm struct for
+ * each one and then clears a corresponding bit in mm's cpumask. While
+ * this all sounds trivial, there are various non-obvious corner cases,
+ * which this function tries to solve in a safe manner.
+ *
+ * Also note that the function uses a somewhat relaxed locking scheme,
+ * so it may be called only for an already offlined CPU.
+ */
 void clear_tasks_mm_cpumask(int cpu)
 {
 	struct task_struct *p;
@@ -184,10 +197,15 @@ void clear_tasks_mm_cpumask(int cpu)
 	 * Thus, we may use rcu_read_lock() here, instead of grabbing
 	 * full-fledged tasklist_lock.
 	 */
+	WARN_ON(cpu_online(cpu));
 	rcu_read_lock();
 	for_each_process(p) {
 		struct task_struct *t;
 
+		/*
+		 * Main thread might exit, but other threads may still have
+		 * a valid mm. Find one.
+		 */
 		t = find_lock_task_mm(p);
 		if (!t)
 			continue;
On Thu, Apr 26, 2012 at 04:59:11PM -0700, Andrew Morton wrote:
[...]
> > +	 * ... so its not like new tasks will ever get this cpu set in
> > +	 * their mm mask. -- Peter Zijlstra
> > +	 * Thus, we may use rcu_read_lock() here, instead of grabbing
> > +	 * full-fledged tasklist_lock.
> > +	 */
> > +	rcu_read_lock();
> > +	for_each_process(p) {
> > +		struct task_struct *t;
> > +
> > +		t = find_lock_task_mm(p);
> > +		if (!t)
> > +			continue;
> > +		cpumask_clear_cpu(cpu, mm_cpumask(t->mm));
> > +		task_unlock(t);
> > +	}
> > +	rcu_read_unlock();
> > +}
>
> It is good that this code exists under CONFIG_HOTPLUG_CPU. Did you check that everything works correctly with CONFIG_HOTPLUG_CPU=n?
Yeah, only the code under CONFIG_HOTPLUG_CPU calls the function, so it should be all fine.
Thanks!
Checking for process->mm is not enough, because the process' main thread may exit or detach its mm via use_mm(), while other threads may still have a valid mm.
To fix this we would need to use find_lock_task_mm(), which walks up all threads and returns an appropriate task (with the task lock held).
clear_tasks_mm_cpumask() has this issue fixed, so let's use it.
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
---
 arch/arm/kernel/smp.c |    8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index addbbe8..e824ee4 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -130,7 +130,6 @@ static void percpu_timer_stop(void);
 int __cpu_disable(void)
 {
 	unsigned int cpu = smp_processor_id();
-	struct task_struct *p;
 	int ret;
 
 	ret = platform_cpu_disable(cpu);
@@ -160,12 +159,7 @@ int __cpu_disable(void)
 	flush_cache_all();
 	local_flush_tlb_all();
 
-	read_lock(&tasklist_lock);
-	for_each_process(p) {
-		if (p->mm)
-			cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
-	}
-	read_unlock(&tasklist_lock);
+	clear_tasks_mm_cpumask(cpu);
 
 	return 0;
 }
Current CPU hotplug code has some task->mm handling issues:
1. Working with task->mm without getting the mm or grabbing the task lock is dangerous, as ->mm might disappear (exit_mm() assigns NULL under task_lock(), so tasklist_lock is not enough).

We can't use the get_task_mm()/mmput() pair, as mmput() might sleep, so we must take the task lock while we handle its mm.

2. Checking for process->mm is not enough, because the process' main thread may exit or detach its mm via use_mm(), while other threads may still have a valid mm.

To fix this we would need to use find_lock_task_mm(), which walks up all threads and returns an appropriate task (with the task lock held).
clear_tasks_mm_cpumask() has all the issues fixed, so let's use it.
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
---
 arch/powerpc/mm/mmu_context_nohash.c |   11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/mm/mmu_context_nohash.c b/arch/powerpc/mm/mmu_context_nohash.c
index 5b63bd3..e779642 100644
--- a/arch/powerpc/mm/mmu_context_nohash.c
+++ b/arch/powerpc/mm/mmu_context_nohash.c
@@ -333,9 +333,7 @@ static int __cpuinit mmu_context_cpu_notify(struct notifier_block *self,
 					    unsigned long action, void *hcpu)
 {
 	unsigned int cpu = (unsigned int)(long)hcpu;
-#ifdef CONFIG_HOTPLUG_CPU
-	struct task_struct *p;
-#endif
+
 	/* We don't touch CPU 0 map, it's allocated at aboot and kept
 	 * around forever
 	 */
@@ -358,12 +356,7 @@ static int __cpuinit mmu_context_cpu_notify(struct notifier_block *self,
 		stale_map[cpu] = NULL;
 
 		/* We also clear the cpu_vm_mask bits of CPUs going away */
-		read_lock(&tasklist_lock);
-		for_each_process(p) {
-			if (p->mm)
-				cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
-		}
-		read_unlock(&tasklist_lock);
+		clear_tasks_mm_cpumask(cpu);
 		break;
 #endif /* CONFIG_HOTPLUG_CPU */
 	}
Checking for process->mm is not enough, because the process' main thread may exit or detach its mm via use_mm(), while other threads may still have a valid mm.

To fix this we would need to use find_lock_task_mm(), which walks up all threads and returns an appropriate task (with the task lock held).
clear_tasks_mm_cpumask() has the issue fixed, so let's use it.
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
---
 arch/sh/kernel/smp.c |    7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/arch/sh/kernel/smp.c b/arch/sh/kernel/smp.c
index eaebdf6..4664f76 100644
--- a/arch/sh/kernel/smp.c
+++ b/arch/sh/kernel/smp.c
@@ -123,7 +123,6 @@ void native_play_dead(void)
 int __cpu_disable(void)
 {
 	unsigned int cpu = smp_processor_id();
-	struct task_struct *p;
 	int ret;
 
 	ret = mp_ops->cpu_disable(cpu);
@@ -153,11 +152,7 @@ int __cpu_disable(void)
 	flush_cache_all();
 	local_flush_tlb_all();
 
-	read_lock(&tasklist_lock);
-	for_each_process(p)
-		if (p->mm)
-			cpumask_clear_cpu(cpu, mm_cpumask(p->mm));
-	read_unlock(&tasklist_lock);
+	clear_tasks_mm_cpumask(cpu);
 
 	return 0;
 }
The patch fixes two problems:
1. Working with task->mm without getting the mm or grabbing the task lock is dangerous, as ->mm might disappear (exit_mm() assigns NULL under task_lock(), so tasklist_lock is not enough).

We can't use the get_task_mm()/mmput() pair, as mmput() might sleep, so we have to take the task lock while we handle its mm.

2. Checking for process->mm is not enough, because the process' main thread may exit or detach its mm via use_mm(), while other threads may still have a valid mm.
To catch this we use find_lock_task_mm(), which walks up all threads and returns an appropriate task (with task lock held).
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
---
 arch/blackfin/kernel/trace.c |   26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/arch/blackfin/kernel/trace.c b/arch/blackfin/kernel/trace.c
index 44bbf2f..d08f0e3 100644
--- a/arch/blackfin/kernel/trace.c
+++ b/arch/blackfin/kernel/trace.c
@@ -10,6 +10,8 @@
 #include <linux/hardirq.h>
 #include <linux/thread_info.h>
 #include <linux/mm.h>
+#include <linux/oom.h>
+#include <linux/sched.h>
 #include <linux/uaccess.h>
 #include <linux/module.h>
 #include <linux/kallsyms.h>
@@ -28,7 +30,6 @@ void decode_address(char *buf, unsigned long address)
 	struct task_struct *p;
 	struct mm_struct *mm;
 	unsigned long flags, offset;
-	unsigned char in_atomic = (bfin_read_IPEND() & 0x10) || in_atomic();
 	struct rb_node *n;
 
 #ifdef CONFIG_KALLSYMS
@@ -114,15 +115,15 @@ void decode_address(char *buf, unsigned long address)
 	 */
 	write_lock_irqsave(&tasklist_lock, flags);
 	for_each_process(p) {
-		mm = (in_atomic ? p->mm : get_task_mm(p));
-		if (!mm)
-			continue;
+		struct task_struct *t;
 
-		if (!down_read_trylock(&mm->mmap_sem)) {
-			if (!in_atomic)
-				mmput(mm);
+		t = find_lock_task_mm(p);
+		if (!t)
 			continue;
-		}
+
+		mm = t->mm;
+		if (!down_read_trylock(&mm->mmap_sem))
+			goto __continue;
 
 		for (n = rb_first(&mm->mm_rb); n; n = rb_next(n)) {
 			struct vm_area_struct *vma;
@@ -131,7 +132,7 @@ void decode_address(char *buf, unsigned long address)
 
 			if (address >= vma->vm_start && address < vma->vm_end) {
 				char _tmpbuf[256];
-				char *name = p->comm;
+				char *name = t->comm;
 				struct file *file = vma->vm_file;
 
 				if (file) {
@@ -164,8 +165,7 @@ void decode_address(char *buf, unsigned long address)
 					name, vma->vm_start, vma->vm_end);
 
 				up_read(&mm->mmap_sem);
-				if (!in_atomic)
-					mmput(mm);
+				task_unlock(t);
 
 				if (buf[0] == '\0')
 					sprintf(buf, "[ %s ] dynamic memory", name);
@@ -175,8 +175,8 @@ void decode_address(char *buf, unsigned long address)
 		}
 
 		up_read(&mm->mmap_sem);
-		if (!in_atomic)
-			mmput(mm);
+__continue:
+		task_unlock(t);
 	}
 
 	/*
On Monday 23 April 2012 03:09:01 Anton Vorontsov wrote:
> 1. Working with task->mm without getting the mm or grabbing the task lock is dangerous, as ->mm might disappear (exit_mm() assigns NULL under task_lock(), so tasklist_lock is not enough).
that isn't a problem for this code as it specifically checks if it's in an atomic section. if it is, then task->mm can't go away on us.
> We can't use the get_task_mm()/mmput() pair, as mmput() might sleep, so we have to take the task lock while we handle its mm.
if we're not in an atomic section, then sleeping is fine.
> 2. Checking for process->mm is not enough, because the process' main thread may exit or detach its mm via use_mm(), while other threads may still have a valid mm.
i don't think it matters for this code (per the reasons above).
> To catch this we use find_lock_task_mm(), which walks up all threads and returns an appropriate task (with task lock held).
certainly fine for the non-atomic code path. i guess we'll notice in crashes if it causes a problem in atomic code paths as well.
-mike
Oleg Nesterov found an interesting deadlock possibility:
sysrq_showregs_othercpus() does smp_call_function(showacpu), and showacpu() calls show_stack()->decode_address(). Now suppose that the IPI interrupts a task that is holding read_lock(&tasklist_lock): decode_address() then tries to take write_lock_irqsave(&tasklist_lock) on the same CPU and spins forever, because the writer waits for a reader that can never run again.
To fix this, blackfin should not grab the write_ variant of the tasklist lock; the read_ one is enough, and it is safe to take from the IPI handler since rwlock_t readers can nest with the interrupted reader. See the timeline sketch below.
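For illustration, a rough sketch of the deadlock with the write_ variant (the interleaving is my reconstruction of the scenario described above; CPU numbers are hypothetical):

	CPU0: task A takes read_lock(&tasklist_lock)
	CPU1: sysrq_showregs_othercpus() -> smp_call_function(showacpu)
	CPU0: IPI arrives, interrupting task A:
	        showacpu() -> show_stack() -> decode_address()
	          -> write_lock_irqsave(&tasklist_lock, flags)
	        /* spins forever: the writer waits for task A to drop the
	         * read lock, but task A cannot resume until this IPI
	         * handler returns */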
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
---
 arch/blackfin/kernel/trace.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/blackfin/kernel/trace.c b/arch/blackfin/kernel/trace.c
index d08f0e3..f7f7a18 100644
--- a/arch/blackfin/kernel/trace.c
+++ b/arch/blackfin/kernel/trace.c
@@ -29,7 +29,7 @@ void decode_address(char *buf, unsigned long address)
 {
 	struct task_struct *p;
 	struct mm_struct *mm;
-	unsigned long flags, offset;
+	unsigned long offset;
 	struct rb_node *n;
 
 #ifdef CONFIG_KALLSYMS
@@ -113,7 +113,7 @@ void decode_address(char *buf, unsigned long address)
 	 * mappings of all our processes and see if we can't be a whee
	 * bit more specific
 	 */
-	write_lock_irqsave(&tasklist_lock, flags);
+	read_lock(&tasklist_lock);
 	for_each_process(p) {
 		struct task_struct *t;
 
@@ -186,7 +186,7 @@ __continue:
 		sprintf(buf, "/* kernel dynamic memory */");
 
 done:
-	write_unlock_irqrestore(&tasklist_lock, flags);
+	read_unlock(&tasklist_lock);
 }
 
 #define EXPAND_LEN ((1 << CONFIG_DEBUG_BFIN_HWTRACE_EXPAND_LEN) * 256 - 1)
Traversing the tasks requires holding tasklist_lock, otherwise it is unsafe.
p.s. However, I'm not sure that calling os_kill_ptraced_process() in atomic context is correct. It seems to work, but please take a closer look.
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
---
 arch/um/kernel/reboot.c |    3 +++
 1 file changed, 3 insertions(+)
diff --git a/arch/um/kernel/reboot.c b/arch/um/kernel/reboot.c
index 4d93dff..66d754c 100644
--- a/arch/um/kernel/reboot.c
+++ b/arch/um/kernel/reboot.c
@@ -4,6 +4,7 @@
  */
 
 #include "linux/sched.h"
+#include "linux/spinlock.h"
 #include "linux/slab.h"
 #include "kern_util.h"
 #include "os.h"
@@ -22,6 +23,7 @@ static void kill_off_processes(void)
 		struct task_struct *p;
 		int pid;
 
+		read_lock(&tasklist_lock);
 		for_each_process(p) {
 			if (p->mm == NULL)
 				continue;
@@ -29,6 +31,7 @@ static void kill_off_processes(void)
 			pid = p->mm->context.id.u.pid;
 			os_kill_ptraced_process(pid, 1);
 		}
+		read_unlock(&tasklist_lock);
 	}
 }
On 23.04.2012 09:09, Anton Vorontsov wrote:
> Traversing the tasks requires holding tasklist_lock, otherwise it is unsafe.
>
> p.s. However, I'm not sure that calling os_kill_ptraced_process() in atomic context is correct. It seems to work, but please take a closer look.
>
> Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
You forgot my Ack and I've already explained why os_kill_ptraced_process() is fine.
Thanks,
//richard
On Mon, Apr 23, 2012 at 04:57:54PM +0200, Richard Weinberger wrote:
> On 23.04.2012 09:09, Anton Vorontsov wrote:
> > Traversing the tasks requires holding tasklist_lock, otherwise it is unsafe.
> >
> > p.s. However, I'm not sure that calling os_kill_ptraced_process() in atomic context is correct. It seems to work, but please take a closer look.
> >
> > Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
>
> You forgot my Ack and I've already explained why os_kill_ptraced_process() is fine.
Ouch, sorry!
Checking for task->mm is dangerous, as ->mm might disappear (exit_mm() assigns NULL under task_lock(), so tasklist_lock is not enough).

We can't use the get_task_mm()/mmput() pair, as mmput() might sleep, so let's take the task lock while we work with its mm.

Note that we should also use find_lock_task_mm() to check all of the process' threads for a valid mm, but for uml we'll do that in a separate patch.
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
---
 arch/um/kernel/reboot.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/um/kernel/reboot.c b/arch/um/kernel/reboot.c
index 66d754c..1411f4e 100644
--- a/arch/um/kernel/reboot.c
+++ b/arch/um/kernel/reboot.c
@@ -25,10 +25,13 @@ static void kill_off_processes(void)
 
 		read_lock(&tasklist_lock);
 		for_each_process(p) {
-			if (p->mm == NULL)
+			task_lock(p);
+			if (!p->mm) {
+				task_unlock(p);
 				continue;
-
+			}
 			pid = p->mm->context.id.u.pid;
+			task_unlock(p);
 			os_kill_ptraced_process(pid, 1);
 		}
 		read_unlock(&tasklist_lock);
kill_off_processes() might miss a valid process because checking for process->mm is not enough: the process' main thread may exit or detach its mm via use_mm(), while other threads may still have a valid mm.
To catch this we use find_lock_task_mm(), which walks up all threads and returns an appropriate task (with task lock held).
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
---
 arch/um/kernel/reboot.c |   13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/arch/um/kernel/reboot.c b/arch/um/kernel/reboot.c
index 1411f4e..3d15243 100644
--- a/arch/um/kernel/reboot.c
+++ b/arch/um/kernel/reboot.c
@@ -6,6 +6,7 @@
 #include "linux/sched.h"
 #include "linux/spinlock.h"
 #include "linux/slab.h"
+#include "linux/oom.h"
 #include "kern_util.h"
 #include "os.h"
 #include "skas.h"
@@ -25,13 +26,13 @@ static void kill_off_processes(void)
 
 		read_lock(&tasklist_lock);
 		for_each_process(p) {
-			task_lock(p);
-			if (!p->mm) {
-				task_unlock(p);
+			struct task_struct *t;
+
+			t = find_lock_task_mm(p);
+			if (!t)
 				continue;
-			}
-			pid = p->mm->context.id.u.pid;
-			task_unlock(p);
+			pid = t->mm->context.id.u.pid;
+			task_unlock(t);
 			os_kill_ptraced_process(pid, 1);
 		}
 		read_unlock(&tasklist_lock);