On Thu, Feb 02, 2012 at 01:54:41PM +0100, Oleg Nesterov wrote:
On 02/01, Anton Vorontsov wrote:
@@ -132,7 +133,7 @@ static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc) } selected_oom_adj = min_adj;
- read_lock(&tasklist_lock);
- rcu_read_lock();
This has the same problem, force_sig() becomes unsafe.
Ouch, I think I finally got it. So, lock_task_sighand() is trying to gracefully grab the lock, checking if the sighand is not NULL (which means, per __exit_signal(), that the task is halfway into the grave).
Well, it seems that such a behaviour of force_sig() is not quite obvious, and there are other offenders out there. E.g. in sysrq code I don't see anything that prevent the same race.
static void send_sig_all(int sig) { struct task_struct *p;
for_each_process(p) { if (p->mm && !is_global_init(p)) /* Not swapper, init nor kernel thread */ force_sig(sig, p); } }
Would the following fix work for the sysrq?
- - - - From: Anton Vorontsov anton.vorontsov@linaro.org Subject: [PATCH] sysrq: Fix unsafe operations on tasks
sysrq should grab the tasklist lock, otherwise calling force_sig() is not safe, as it might race with exiting task, which ->sighand might be set to NULL already.
Signed-off-by: Anton Vorontsov anton.vorontsov@linaro.org --- drivers/tty/sysrq.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c index 7867b7c..a1bcad7 100644 --- a/drivers/tty/sysrq.c +++ b/drivers/tty/sysrq.c @@ -322,11 +322,13 @@ static void send_sig_all(int sig) { struct task_struct *p;
+ read_lock(&tasklist_lock); for_each_process(p) { if (p->mm && !is_global_init(p)) /* Not swapper, init nor kernel thread */ force_sig(sig, p); } + read_unlock(&tasklist_lock); }
static void sysrq_handle_term(int key) - - - -
But for LMK I will use send_signal(), as LMK is special.
Plus, while I'm at it, might want to review a bit closer other offenders, and fix them as well.
Why do you need force_? Do you really want to kill /sbin/init (or sub-namespace init) ?
Nope.
We could change force_sig_info() to use lock_task_sighand(), but I'd like to avoid this. Imho, this interface should be cleanuped, and it should be used for synchronous signals only.
OK. Then we should fix the users?
With or without this patch, sig == NULL is not possible but !mm is not right, there could be other other threads with mm != NULL.
I'm not sure I completely follow. In the current LMK code, we check for !mm because otherwise the potential victim is useless for us (i.e. killing it will not free much memory anyway).
Thanks!