Re: [PATCH Resend] cpufreq: Set cpufreq_cpu_data to NULL before putting kobject

2 Feb 2015

Viresh,
On 2015/1/31 8:32, Viresh Kumar wrote:
...
In __cpufreq_remove_dev_finish(), per-cpu 'cpufreq_cpu_data' needs to be cleared
before calling kobject_put(&policy->kobj) *and* under the lock. Otherwise if
someone else calls cpufreq_cpu_get() in parallel with it, they can obtain a
non-NULL policy from it *after* kobject_put(&policy->kobj) was executed.
Consider this case:
Thread A				Thread B
cpufreq_cpu_get()
   read_lock_irqsave()
   read-per-cpu cpufreq_cpu_data
   				per_cpu(&cpufreq_cpu_data, cpu) = NULL
   				kobject_put(&policy->kobj);
   kobject_get(&policy->kobj);
This results in the warning below:
------------[ cut here ]------------
  WARNING: CPU: 0 PID: 4 at include/linux/kref.h:47
  kobject_get+0x41/0x50()
  Modules linked in: acpi_cpufreq(+) nfsd auth_rpcgss nfs_acl
  lockd grace sunrpc xfs libcrc32c sd_mod ixgbe igb mdio ahci hwmon
  ...
  Call Trace:
   [<ffffffff81661b14>] dump_stack+0x46/0x58
   [<ffffffff81072b61>] warn_slowpath_common+0x81/0xa0
   [<ffffffff81072c7a>] warn_slowpath_null+0x1a/0x20
   [<ffffffff812e16d1>] kobject_get+0x41/0x50
   [<ffffffff815262a5>] cpufreq_cpu_get+0x75/0xc0
   [<ffffffff81527c3e>] cpufreq_update_policy+0x2e/0x1f0
   [<ffffffff810b8cb2>] ? up+0x32/0x50
   [<ffffffff81381aa9>] ? acpi_ns_get_node+0xcb/0xf2
   [<ffffffff81381efd>] ? acpi_evaluate_object+0x22c/0x252
   [<ffffffff813824f6>] ? acpi_get_handle+0x95/0xc0
   [<ffffffff81360967>] ? acpi_has_method+0x25/0x40
   [<ffffffff81391e08>] acpi_processor_ppc_has_changed+0x77/0x82
   [<ffffffff81089566>] ? move_linked_works+0x66/0x90
   [<ffffffff8138e8ed>] acpi_processor_notify+0x58/0xe7
   [<ffffffff8137410c>] acpi_ev_notify_dispatch+0x44/0x5c
   [<ffffffff8135f293>] acpi_os_execute_deferred+0x15/0x22
   [<ffffffff8108c910>] process_one_work+0x160/0x410
   [<ffffffff8108d05b>] worker_thread+0x11b/0x520
   [<ffffffff8108cf40>] ? rescuer_thread+0x380/0x380
   [<ffffffff81092421>] kthread+0xe1/0x100
   [<ffffffff81092340>] ? kthread_create_on_node+0x1b0/0x1b0
   [<ffffffff81669ebc>] ret_from_fork+0x7c/0xb0
   [<ffffffff81092340>] ? kthread_create_on_node+0x1b0/0x1b0
  ---[ end trace 89e66eb9795efdf7 ]---
And here is the actual race (+ the race mentioned above):
Thread A: Workqueue: kacpi_notify
acpi_processor_notify()
    acpi_processor_ppc_has_changed()
          cpufreq_update_policy()
            cpufreq_cpu_get()
              kobject_get()
Thread B: xenbus_thread()
xenbus_thread()
    msg->u.watch.handle->callback()
      handle_vcpu_hotplug_event()
        vcpu_hotplug()
          cpu_down()
            __cpu_notify(CPU_POST_DEAD..)
              cpufreq_cpu_callback()
                __cpufreq_remove_dev_finish()
                  cpufreq_policy_put_kobj()
                    kobject_put()
cpufreq_cpu_get() gets the policy from per-cpu variable cpufreq_cpu_data under
cpufreq_driver_lock, and once it gets a valid policy it expects it to not be
freed until cpufreq_cpu_put() is called.
But a race occurs when another thread puts the kobject before the first thread
has taken its reference, whether it clears cpufreq_cpu_data before or after the
put. The first thread then reads a valid-looking policy pointer, and by the time
it calls kobject_get() on it, the second thread has already done kobject_put().
Fix this by setting cpufreq_cpu_data to NULL before putting the kobject, and by
doing so under the lock.
Reported-by: Ethan Zhao <ethan.zhao@oracle.com>
Reported-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

 drivers/cpufreq/cpufreq.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 4473eba1d6b0..e3bf702b5588 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1409,9 +1409,10 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
 	unsigned long flags;
 	struct cpufreq_policy *policy;
 
-	read_lock_irqsave(&cpufreq_driver_lock, flags);
+	write_lock_irqsave(&cpufreq_driver_lock, flags);
 	policy = per_cpu(cpufreq_cpu_data, cpu);
-	read_unlock_irqrestore(&cpufreq_driver_lock, flags);
+	per_cpu(cpufreq_cpu_data, cpu) = NULL;
+	write_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
 	if (!policy) {
 		pr_debug("%s: No cpu_data found\n", __func__);
@@ -1466,7 +1467,6 @@ static int __cpufreq_remove_dev_finish(struct device *dev,
 		}
 	}
 
-	per_cpu(cpufreq_cpu_data, cpu) = NULL;
 	return 0;
 }
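
For illustration only, the ordering the patch above relies on can be modelled
in userspace C. This is a minimal sketch, not kernel code: struct policy,
policy_publish(), policy_get(), policy_put() and policy_remove() are
hypothetical names, a plain C11 spinlock stands in for cpufreq_driver_lock
(the reader/writer distinction is dropped), and an atomic counter stands in
for the kobject reference count. The point it tries to show is that the lookup
plus get on one side, and the clearing of the pointer on the other, happen
under the same lock, and that the put only happens after the pointer is gone.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct cpufreq_policy and its kobject refcount. */
struct policy {
	atomic_int refcount;
};

/* Stand-in for cpufreq_driver_lock (assumption: a simple spinlock). */
static atomic_flag driver_lock = ATOMIC_FLAG_INIT;
/* Stand-in for per_cpu(cpufreq_cpu_data, cpu), for a single CPU. */
static struct policy *cpu_data;

static void lock(void)
{
	while (atomic_flag_test_and_set_explicit(&driver_lock,
						 memory_order_acquire))
		; /* spin until the flag is released */
}

static void unlock(void)
{
	atomic_flag_clear_explicit(&driver_lock, memory_order_release);
}

/* Drop one reference; free the object when the last one goes away. */
static void policy_put(struct policy *p)
{
	if (atomic_fetch_sub(&p->refcount, 1) == 1)
		free(p);
}

/* Publish a policy; the published pointer owns the initial reference. */
static struct policy *policy_publish(void)
{
	struct policy *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	atomic_init(&p->refcount, 1);
	lock();
	cpu_data = p;
	unlock();
	return p;
}

/* Reader side, modelled on cpufreq_cpu_get(): look up and get under the lock. */
static struct policy *policy_get(void)
{
	struct policy *p;

	lock();
	p = cpu_data;
	if (p) {
		int old = atomic_fetch_add(&p->refcount, 1);

		/* A get on an object whose count already hit zero is the bug. */
		assert(old >= 1);
	}
	unlock();
	return p;
}

/* Remover side: clear the pointer under the lock, and only then put. */
static int policy_remove(void)
{
	struct policy *p;

	lock();
	p = cpu_data;
	cpu_data = NULL;
	unlock();

	if (!p)
		return -1; /* nothing was published */
	policy_put(p); /* drops the initial reference */
	return 0;
}
```

With this ordering a reader either sees NULL, or takes its reference while the
initial reference is still held, so a get can never follow the final put. A
caller that obtained a pointer from policy_get() drops it with policy_put()
when done.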

This patch does not seem to prevent every bad case, e.g.:
Thread A: Workqueue: kacpi_notify        Thread B:
acpi_processor_notify()
    acpi_processor_ppc_has_changed()
          cpufreq_update_policy()
            cpufreq_cpu_get()
            (begins dereferencing policy)
            ...                          __cpufreq_remove_dev_finish()
                                           cpufreq_policy_free(policy);
Perhaps moving policy->rwsem outside the policy structure would avoid it
completely.
Until then, you could stop the PPC thread from proceeding, as my patch does, as
a temporary workaround.
Thanks,
Ethan
...

    
