Hello,
This series implements reentrancy for the common clk implementation of the clk.h api. Making reentrant calls into the clock framework is both necessary and desirable for many use cases such as enabling off-chip clocks via i2c. The first patch in the series implements this.
A neat side effect of reentrancy is that it is possible for platforms using voltage regulators controlled via i2c to register rate-change notifier handlers to scale voltage as a function of clock rate. This is an effective way to implement dynamic voltage & frequency scaling. Patch #2 implements a helper function for registering such a notifier handler.
The third patch in the series demonstrates dvfs on OMAP platforms by modifying the cpufreq-omap driver; it migrates the voltage scaling logic out of the cpufreq driver's .target callback and registers callbacks via the helper introduced in patch #2.
Patches four and five are purely test coverage. And what better way to test than to muck with fragile PLL programming code? These patches test out a lot of the aforementioned reentrancy in the OMAP3+ DPLL code. They are not for merging, but as a demonstration of what is now possible.
Finally, I know that Documentation/clk.txt needs an update for these changes but I wanted this on the list before I flew out to LCA 2013. I'll provide that update during or after the conference.
Two previous (and considerably more insane) attempts at this, v1: http://article.gmane.org/gmane.linux.kernel/1327866 v2: http://marc.info/?l=linux-kernel&m=134507429302463&w=2
Mike Turquette (5): clk: allow reentrant calls into the clk framework clk: notifier handler for dynamic voltage scaling cpufreq: omap: scale regulator from clk notifier HACK: set_parent callback for OMAP4 non-core DPLLs HACK: omap: opp: add fake 400MHz OPP to bypass MPU
arch/arm/mach-omap2/cclock44xx_data.c | 1 + arch/arm/mach-omap2/clock.h | 1 + arch/arm/mach-omap2/dpll3xxx.c | 107 ++++++++++---- arch/arm/mach-omap2/opp4xxx_data.c | 18 +++ drivers/clk/Makefile | 1 + drivers/clk/clk.c | 254 ++++++++++++++++++++++++--------- drivers/clk/dvfs.c | 125 ++++++++++++++++ drivers/cpufreq/omap-cpufreq.c | 82 +++-------- include/linux/clk.h | 27 +++- 9 files changed, 459 insertions(+), 157 deletions(-) create mode 100644 drivers/clk/dvfs.c
Reentrancy into the clock framework from the clk.h api is highly desirable. This feature is necessary for clocks that are prepared and unprepared via i2c_transfer (which includes many PMICs and discrete audio chips) and it is also necessary for performing dynamic voltage & frequency scaling via clock rate-change notifiers.
This patch implements reentrancy by adding a global atomic_t which tracks the context of the current caller. Context in this case is the return value from get_current(). The clk.h api implementations are modified to first see if the relevant global lock is already held and if so compare the global context (set by whoever is holding the lock) against their own context (via a call to get_current()). If the two match then this function is a nested call from the one already holding the lock and we procede. If the context does not match then procede to call mutex_lock and busy-wait for the existing task to complete.
Thus this patch set does not increase concurrency for unrelated calls into the clock framework. Instead it simply allows reentrancy by the single task which is currently holding the global clock framework lock.
Thanks to Rajagoapl Venkat for the original idea to use get_current() and to David Brown for the suggestion to replace my previous rwlock scheme with atomic operations during code review at ELC 2013.
Signed-off-by: Mike Turquette mturquette@linaro.org Cc: Rajagopal Venkat rajagopal.venkat@linaro.org Cc: David Brown davidb@codeaurora.org --- drivers/clk/clk.c | 254 ++++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 185 insertions(+), 69 deletions(-)
diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c index fabbfe1..b7d6a0a 100644 --- a/drivers/clk/clk.c +++ b/drivers/clk/clk.c @@ -19,9 +19,11 @@ #include <linux/of.h> #include <linux/device.h> #include <linux/init.h> +#include <linux/sched.h>
static DEFINE_SPINLOCK(enable_lock); static DEFINE_MUTEX(prepare_lock); +static atomic_t context;
static HLIST_HEAD(clk_root_list); static HLIST_HEAD(clk_orphan_list); @@ -433,27 +435,6 @@ unsigned int __clk_get_prepare_count(struct clk *clk) return !clk ? 0 : clk->prepare_count; }
-unsigned long __clk_get_rate(struct clk *clk) -{ - unsigned long ret; - - if (!clk) { - ret = 0; - goto out; - } - - ret = clk->rate; - - if (clk->flags & CLK_IS_ROOT) - goto out; - - if (!clk->parent) - ret = 0; - -out: - return ret; -} - unsigned long __clk_get_flags(struct clk *clk) { return !clk ? 0 : clk->flags; @@ -524,6 +505,35 @@ struct clk *__clk_lookup(const char *name) return NULL; }
+/*** locking & reentrancy ***/ + +static void clk_fwk_lock(void) +{ + /* hold the framework-wide lock, context == NULL */ + mutex_lock(&prepare_lock); + + /* set context for any reentrant calls */ + atomic_set(&context, (int) get_current()); +} + +static void clk_fwk_unlock(void) +{ + /* clear the context */ + atomic_set(&context, 0); + + /* release the framework-wide lock, context == NULL */ + mutex_unlock(&prepare_lock); +} + +static bool clk_is_reentrant(void) +{ + if (mutex_is_locked(&prepare_lock)) + if ((void *) atomic_read(&context) == get_current()) + return true; + + return false; +} + /*** clk api ***/
void __clk_unprepare(struct clk *clk) @@ -558,9 +568,15 @@ void __clk_unprepare(struct clk *clk) */ void clk_unprepare(struct clk *clk) { - mutex_lock(&prepare_lock); + /* re-enter if call is from the same context */ + if (clk_is_reentrant()) { + __clk_unprepare(clk); + return; + } + + clk_fwk_lock(); __clk_unprepare(clk); - mutex_unlock(&prepare_lock); + clk_fwk_unlock(); } EXPORT_SYMBOL_GPL(clk_unprepare);
@@ -606,10 +622,16 @@ int clk_prepare(struct clk *clk) { int ret;
- mutex_lock(&prepare_lock); - ret = __clk_prepare(clk); - mutex_unlock(&prepare_lock); + /* re-enter if call is from the same context */ + if (clk_is_reentrant()) { + ret = __clk_prepare(clk); + goto out; + }
+ clk_fwk_lock(); + ret = __clk_prepare(clk); + clk_fwk_unlock(); +out: return ret; } EXPORT_SYMBOL_GPL(clk_prepare); @@ -650,8 +672,27 @@ void clk_disable(struct clk *clk) { unsigned long flags;
+ /* must check both the global spinlock and the global mutex */ + if (spin_is_locked(&enable_lock) || mutex_is_locked(&prepare_lock)) { + if ((void *) atomic_read(&context) == get_current()) { + __clk_disable(clk); + return; + } + } + + /* hold the framework-wide lock, context == NULL */ spin_lock_irqsave(&enable_lock, flags); + + /* set context for any reentrant calls */ + atomic_set(&context, (int) get_current()); + + /* disable the clock(s) */ __clk_disable(clk); + + /* clear the context */ + atomic_set(&context, 0); + + /* release the framework-wide lock, context == NULL */ spin_unlock_irqrestore(&enable_lock, flags); } EXPORT_SYMBOL_GPL(clk_disable); @@ -703,10 +744,29 @@ int clk_enable(struct clk *clk) unsigned long flags; int ret;
+ /* this call re-enters if it is from the same context */ + if (spin_is_locked(&enable_lock) || mutex_is_locked(&prepare_lock)) { + if ((void *) atomic_read(&context) == get_current()) { + ret = __clk_enable(clk); + goto out; + } + } + + /* hold the framework-wide lock, context == NULL */ spin_lock_irqsave(&enable_lock, flags); + + /* set context for any reentrant calls */ + atomic_set(&context, (int) get_current()); + + /* enable the clock(s) */ ret = __clk_enable(clk); - spin_unlock_irqrestore(&enable_lock, flags);
+ /* clear the context */ + atomic_set(&context, 0); + + /* release the framework-wide lock, context == NULL */ + spin_unlock_irqrestore(&enable_lock, flags); +out: return ret; } EXPORT_SYMBOL_GPL(clk_enable); @@ -750,10 +810,17 @@ long clk_round_rate(struct clk *clk, unsigned long rate) { unsigned long ret;
- mutex_lock(&prepare_lock); + /* this call re-enters if it is from the same context */ + if (clk_is_reentrant()) { + ret = __clk_round_rate(clk, rate); + goto out; + } + + clk_fwk_lock(); ret = __clk_round_rate(clk, rate); - mutex_unlock(&prepare_lock); + clk_fwk_unlock();
+out: return ret; } EXPORT_SYMBOL_GPL(clk_round_rate); @@ -836,6 +903,30 @@ static void __clk_recalc_rates(struct clk *clk, unsigned long msg) __clk_recalc_rates(child, msg); }
+unsigned long __clk_get_rate(struct clk *clk) +{ + unsigned long ret; + + if (!clk) { + ret = 0; + goto out; + } + + if (clk->flags & CLK_GET_RATE_NOCACHE) + __clk_recalc_rates(clk, 0); + + ret = clk->rate; + + if (clk->flags & CLK_IS_ROOT) + goto out; + + if (!clk->parent) + ret = 0; + +out: + return ret; +} + /** * clk_get_rate - return the rate of clk * @clk: the clk whose rate is being returned @@ -848,14 +939,22 @@ unsigned long clk_get_rate(struct clk *clk) { unsigned long rate;
- mutex_lock(&prepare_lock); + /* + * FIXME - any locking here seems heavy weight + * can clk->rate be replaced with an atomic_t? + * same logic can likely be applied to prepare_count & enable_count + */
- if (clk && (clk->flags & CLK_GET_RATE_NOCACHE)) - __clk_recalc_rates(clk, 0); + if (clk_is_reentrant()) { + rate = __clk_get_rate(clk); + goto out; + }
+ clk_fwk_lock(); rate = __clk_get_rate(clk); - mutex_unlock(&prepare_lock); + clk_fwk_unlock();
+out: return rate; } EXPORT_SYMBOL_GPL(clk_get_rate); @@ -1036,6 +1135,39 @@ static void clk_change_rate(struct clk *clk) clk_change_rate(child); }
+int __clk_set_rate(struct clk *clk, unsigned long rate) +{ + int ret = 0; + struct clk *top, *fail_clk; + + /* bail early if nothing to do */ + if (rate == clk->rate) + return 0; + + if ((clk->flags & CLK_SET_RATE_GATE) && clk->prepare_count) { + return -EBUSY; + } + + /* calculate new rates and get the topmost changed clock */ + top = clk_calc_new_rates(clk, rate); + if (!top) + return -EINVAL; + + /* notify that we are about to change rates */ + fail_clk = clk_propagate_rate_change(top, PRE_RATE_CHANGE); + if (fail_clk) { + pr_warn("%s: failed to set %s rate\n", __func__, + fail_clk->name); + clk_propagate_rate_change(top, ABORT_RATE_CHANGE); + return -EBUSY; + } + + /* change the rates */ + clk_change_rate(top); + + return ret; +} + /** * clk_set_rate - specify a new rate for clk * @clk: the clk whose rate is being changed @@ -1059,44 +1191,18 @@ static void clk_change_rate(struct clk *clk) */ int clk_set_rate(struct clk *clk, unsigned long rate) { - struct clk *top, *fail_clk; int ret = 0;
- /* prevent racing with updates to the clock topology */ - mutex_lock(&prepare_lock); - - /* bail early if nothing to do */ - if (rate == clk->rate) - goto out; - - if ((clk->flags & CLK_SET_RATE_GATE) && clk->prepare_count) { - ret = -EBUSY; - goto out; - } - - /* calculate new rates and get the topmost changed clock */ - top = clk_calc_new_rates(clk, rate); - if (!top) { - ret = -EINVAL; - goto out; - } - - /* notify that we are about to change rates */ - fail_clk = clk_propagate_rate_change(top, PRE_RATE_CHANGE); - if (fail_clk) { - pr_warn("%s: failed to set %s rate\n", __func__, - fail_clk->name); - clk_propagate_rate_change(top, ABORT_RATE_CHANGE); - ret = -EBUSY; + if (clk_is_reentrant()) { + ret = __clk_set_rate(clk, rate); goto out; }
- /* change the rates */ - clk_change_rate(top); + clk_fwk_lock(); + ret = __clk_set_rate(clk, rate); + clk_fwk_unlock();
out: - mutex_unlock(&prepare_lock); - return ret; } EXPORT_SYMBOL_GPL(clk_set_rate); @@ -1111,10 +1217,16 @@ struct clk *clk_get_parent(struct clk *clk) { struct clk *parent;
- mutex_lock(&prepare_lock); + if (clk_is_reentrant()) { + parent = __clk_get_parent(clk); + goto out; + } + + clk_fwk_lock(); parent = __clk_get_parent(clk); - mutex_unlock(&prepare_lock); + clk_fwk_unlock();
+out: return parent; } EXPORT_SYMBOL_GPL(clk_get_parent); @@ -1293,6 +1405,7 @@ out: int clk_set_parent(struct clk *clk, struct clk *parent) { int ret = 0; + bool reenter;
if (!clk || !clk->ops) return -EINVAL; @@ -1300,8 +1413,10 @@ int clk_set_parent(struct clk *clk, struct clk *parent) if (!clk->ops->set_parent) return -ENOSYS;
- /* prevent racing with updates to the clock topology */ - mutex_lock(&prepare_lock); + reenter = clk_is_reentrant(); + + if (!reenter) + clk_fwk_lock();
if (clk->parent == parent) goto out; @@ -1330,7 +1445,8 @@ int clk_set_parent(struct clk *clk, struct clk *parent) __clk_reparent(clk, parent);
out: - mutex_unlock(&prepare_lock); + if (!reenter) + clk_fwk_unlock();
return ret; }
On 28 February 2013 05:49, Mike Turquette mturquette@linaro.org wrote:
Reentrancy into the clock framework from the clk.h api is highly desirable. This feature is necessary for clocks that are prepared and unprepared via i2c_transfer (which includes many PMICs and discrete audio chips) and it is also necessary for performing dynamic voltage & frequency scaling via clock rate-change notifiers.
This patch implements reentrancy by adding a global atomic_t which tracks the context of the current caller. Context in this case is the return value from get_current(). The clk.h api implementations are modified to first see if the relevant global lock is already held and if so compare the global context (set by whoever is holding the lock) against their own context (via a call to get_current()). If the two match then this function is a nested call from the one already holding the lock and we procede. If the context does not match then procede to call mutex_lock and busy-wait for the existing task to complete.
Thus this patch set does not increase concurrency for unrelated calls into the clock framework. Instead it simply allows reentrancy by the single task which is currently holding the global clock framework lock.
Thanks to Rajagoapl Venkat for the original idea to use get_current() and to David Brown for the suggestion to replace my previous rwlock scheme with atomic operations during code review at ELC 2013.
Signed-off-by: Mike Turquette mturquette@linaro.org Cc: Rajagopal Venkat rajagopal.venkat@linaro.org Cc: David Brown davidb@codeaurora.org
drivers/clk/clk.c | 254 ++++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 185 insertions(+), 69 deletions(-)
diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c index fabbfe1..b7d6a0a 100644 --- a/drivers/clk/clk.c +++ b/drivers/clk/clk.c @@ -19,9 +19,11 @@ #include <linux/of.h> #include <linux/device.h> #include <linux/init.h> +#include <linux/sched.h>
static DEFINE_SPINLOCK(enable_lock); static DEFINE_MUTEX(prepare_lock); +static atomic_t context;
static HLIST_HEAD(clk_root_list); static HLIST_HEAD(clk_orphan_list); @@ -433,27 +435,6 @@ unsigned int __clk_get_prepare_count(struct clk *clk) return !clk ? 0 : clk->prepare_count; }
-unsigned long __clk_get_rate(struct clk *clk) -{
unsigned long ret;
if (!clk) {
ret = 0;
goto out;
}
ret = clk->rate;
if (clk->flags & CLK_IS_ROOT)
goto out;
if (!clk->parent)
ret = 0;
-out:
return ret;
-}
unsigned long __clk_get_flags(struct clk *clk) { return !clk ? 0 : clk->flags; @@ -524,6 +505,35 @@ struct clk *__clk_lookup(const char *name) return NULL; }
+/*** locking & reentrancy ***/
+static void clk_fwk_lock(void) +{
/* hold the framework-wide lock, context == NULL */
mutex_lock(&prepare_lock);
/* set context for any reentrant calls */
atomic_set(&context, (int) get_current());
+}
+static void clk_fwk_unlock(void) +{
/* clear the context */
atomic_set(&context, 0);
/* release the framework-wide lock, context == NULL */
mutex_unlock(&prepare_lock);
+}
+static bool clk_is_reentrant(void) +{
if (mutex_is_locked(&prepare_lock))
if ((void *) atomic_read(&context) == get_current())
return true;
return false;
+}
/*** clk api ***/
void __clk_unprepare(struct clk *clk) @@ -558,9 +568,15 @@ void __clk_unprepare(struct clk *clk) */ void clk_unprepare(struct clk *clk) {
mutex_lock(&prepare_lock);
/* re-enter if call is from the same context */
if (clk_is_reentrant()) {
__clk_unprepare(clk);
return;
}
clk_fwk_lock(); __clk_unprepare(clk);
mutex_unlock(&prepare_lock);
clk_fwk_unlock();
} EXPORT_SYMBOL_GPL(clk_unprepare);
@@ -606,10 +622,16 @@ int clk_prepare(struct clk *clk) { int ret;
mutex_lock(&prepare_lock);
ret = __clk_prepare(clk);
mutex_unlock(&prepare_lock);
/* re-enter if call is from the same context */
if (clk_is_reentrant()) {
ret = __clk_prepare(clk);
goto out;
}
clk_fwk_lock();
ret = __clk_prepare(clk);
clk_fwk_unlock();
+out: return ret; }
This above code seems fine to me. The slowpath functions using the prepare lock would be reentrant with this change, which is really great.
EXPORT_SYMBOL_GPL(clk_prepare); @@ -650,8 +672,27 @@ void clk_disable(struct clk *clk) { unsigned long flags;
/* must check both the global spinlock and the global mutex */
if (spin_is_locked(&enable_lock) || mutex_is_locked(&prepare_lock)) {
if ((void *) atomic_read(&context) == get_current()) {
__clk_disable(clk);
return;
}
}
/* hold the framework-wide lock, context == NULL */ spin_lock_irqsave(&enable_lock, flags);
/* set context for any reentrant calls */
atomic_set(&context, (int) get_current());
/* disable the clock(s) */ __clk_disable(clk);
/* clear the context */
atomic_set(&context, 0);
/* release the framework-wide lock, context == NULL */ spin_unlock_irqrestore(&enable_lock, flags);
} EXPORT_SYMBOL_GPL(clk_disable); @@ -703,10 +744,29 @@ int clk_enable(struct clk *clk) unsigned long flags; int ret;
/* this call re-enters if it is from the same context */
if (spin_is_locked(&enable_lock) || mutex_is_locked(&prepare_lock)) {
if ((void *) atomic_read(&context) == get_current()) {
ret = __clk_enable(clk);
goto out;
}
}
I beleive the clk_enable|disable code will be racy. What do you think about this scenario:
1. Thread 1, calls clk_prepare -> clk is not reentrant -> mutex_lock -> set_context to thread1. 2. Thread 2, calls clk_enable -> above "if" will mean that get_current returns thread 1 context and then clk_enable continues -> spin_lock_irqsave -> set_context to thread 2. 3. Thread 1 continues and triggers a reentancy for clk_prepare -> clk is not reentrant (since thread 2 has set a new context) -> mutex_lock and we will hang forever.
Do you think above scenario could happen?
I think the solution would be to invent another "static atomic_t context;" which is used only for fast path functions (clk_enable|disable). That should do the trick I think.
/* hold the framework-wide lock, context == NULL */ spin_lock_irqsave(&enable_lock, flags);
/* set context for any reentrant calls */
atomic_set(&context, (int) get_current());
/* enable the clock(s) */ ret = __clk_enable(clk);
spin_unlock_irqrestore(&enable_lock, flags);
/* clear the context */
atomic_set(&context, 0);
/* release the framework-wide lock, context == NULL */
spin_unlock_irqrestore(&enable_lock, flags);
+out: return ret; } EXPORT_SYMBOL_GPL(clk_enable); @@ -750,10 +810,17 @@ long clk_round_rate(struct clk *clk, unsigned long rate) { unsigned long ret;
mutex_lock(&prepare_lock);
/* this call re-enters if it is from the same context */
if (clk_is_reentrant()) {
ret = __clk_round_rate(clk, rate);
goto out;
}
clk_fwk_lock(); ret = __clk_round_rate(clk, rate);
mutex_unlock(&prepare_lock);
clk_fwk_unlock();
+out: return ret; } EXPORT_SYMBOL_GPL(clk_round_rate); @@ -836,6 +903,30 @@ static void __clk_recalc_rates(struct clk *clk, unsigned long msg) __clk_recalc_rates(child, msg); }
+unsigned long __clk_get_rate(struct clk *clk) +{
unsigned long ret;
if (!clk) {
ret = 0;
goto out;
}
if (clk->flags & CLK_GET_RATE_NOCACHE)
__clk_recalc_rates(clk, 0);
ret = clk->rate;
if (clk->flags & CLK_IS_ROOT)
goto out;
if (!clk->parent)
ret = 0;
+out:
return ret;
+}
/**
- clk_get_rate - return the rate of clk
- @clk: the clk whose rate is being returned
@@ -848,14 +939,22 @@ unsigned long clk_get_rate(struct clk *clk) { unsigned long rate;
mutex_lock(&prepare_lock);
/*
* FIXME - any locking here seems heavy weight
* can clk->rate be replaced with an atomic_t?
* same logic can likely be applied to prepare_count & enable_count
*/
if (clk && (clk->flags & CLK_GET_RATE_NOCACHE))
__clk_recalc_rates(clk, 0);
if (clk_is_reentrant()) {
rate = __clk_get_rate(clk);
goto out;
}
clk_fwk_lock(); rate = __clk_get_rate(clk);
mutex_unlock(&prepare_lock);
clk_fwk_unlock();
+out: return rate; } EXPORT_SYMBOL_GPL(clk_get_rate); @@ -1036,6 +1135,39 @@ static void clk_change_rate(struct clk *clk) clk_change_rate(child); }
+int __clk_set_rate(struct clk *clk, unsigned long rate) +{
int ret = 0;
struct clk *top, *fail_clk;
/* bail early if nothing to do */
if (rate == clk->rate)
return 0;
if ((clk->flags & CLK_SET_RATE_GATE) && clk->prepare_count) {
return -EBUSY;
}
/* calculate new rates and get the topmost changed clock */
top = clk_calc_new_rates(clk, rate);
if (!top)
return -EINVAL;
/* notify that we are about to change rates */
fail_clk = clk_propagate_rate_change(top, PRE_RATE_CHANGE);
if (fail_clk) {
pr_warn("%s: failed to set %s rate\n", __func__,
fail_clk->name);
clk_propagate_rate_change(top, ABORT_RATE_CHANGE);
return -EBUSY;
}
/* change the rates */
clk_change_rate(top);
return ret;
+}
/**
- clk_set_rate - specify a new rate for clk
- @clk: the clk whose rate is being changed
@@ -1059,44 +1191,18 @@ static void clk_change_rate(struct clk *clk) */ int clk_set_rate(struct clk *clk, unsigned long rate) {
struct clk *top, *fail_clk; int ret = 0;
/* prevent racing with updates to the clock topology */
mutex_lock(&prepare_lock);
/* bail early if nothing to do */
if (rate == clk->rate)
goto out;
if ((clk->flags & CLK_SET_RATE_GATE) && clk->prepare_count) {
ret = -EBUSY;
goto out;
}
/* calculate new rates and get the topmost changed clock */
top = clk_calc_new_rates(clk, rate);
if (!top) {
ret = -EINVAL;
goto out;
}
/* notify that we are about to change rates */
fail_clk = clk_propagate_rate_change(top, PRE_RATE_CHANGE);
if (fail_clk) {
pr_warn("%s: failed to set %s rate\n", __func__,
fail_clk->name);
clk_propagate_rate_change(top, ABORT_RATE_CHANGE);
ret = -EBUSY;
if (clk_is_reentrant()) {
ret = __clk_set_rate(clk, rate); goto out; }
/* change the rates */
clk_change_rate(top);
clk_fwk_lock();
ret = __clk_set_rate(clk, rate);
clk_fwk_unlock();
out:
mutex_unlock(&prepare_lock);
return ret;
} EXPORT_SYMBOL_GPL(clk_set_rate); @@ -1111,10 +1217,16 @@ struct clk *clk_get_parent(struct clk *clk) { struct clk *parent;
mutex_lock(&prepare_lock);
if (clk_is_reentrant()) {
parent = __clk_get_parent(clk);
goto out;
}
clk_fwk_lock(); parent = __clk_get_parent(clk);
mutex_unlock(&prepare_lock);
clk_fwk_unlock();
+out: return parent; } EXPORT_SYMBOL_GPL(clk_get_parent); @@ -1293,6 +1405,7 @@ out: int clk_set_parent(struct clk *clk, struct clk *parent) { int ret = 0;
bool reenter; if (!clk || !clk->ops) return -EINVAL;
@@ -1300,8 +1413,10 @@ int clk_set_parent(struct clk *clk, struct clk *parent) if (!clk->ops->set_parent) return -ENOSYS;
/* prevent racing with updates to the clock topology */
mutex_lock(&prepare_lock);
reenter = clk_is_reentrant();
if (!reenter)
clk_fwk_lock(); if (clk->parent == parent) goto out;
@@ -1330,7 +1445,8 @@ int clk_set_parent(struct clk *clk, struct clk *parent) __clk_reparent(clk, parent);
out:
mutex_unlock(&prepare_lock);
if (!reenter)
clk_fwk_unlock(); return ret;
}
1.7.10.4
-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Kind regards Ulf Hansson
Quoting Ulf Hansson (2013-02-28 01:54:34)
On 28 February 2013 05:49, Mike Turquette mturquette@linaro.org wrote:
@@ -703,10 +744,29 @@ int clk_enable(struct clk *clk) unsigned long flags; int ret;
/* this call re-enters if it is from the same context */
if (spin_is_locked(&enable_lock) || mutex_is_locked(&prepare_lock)) {
if ((void *) atomic_read(&context) == get_current()) {
ret = __clk_enable(clk);
goto out;
}
}
I beleive the clk_enable|disable code will be racy. What do you think about this scenario:
- Thread 1, calls clk_prepare -> clk is not reentrant -> mutex_lock
-> set_context to thread1. 2. Thread 2, calls clk_enable -> above "if" will mean that get_current returns thread 1 context and then clk_enable continues -> spin_lock_irqsave -> set_context to thread 2. 3. Thread 1 continues and triggers a reentancy for clk_prepare -> clk is not reentrant (since thread 2 has set a new context) -> mutex_lock and we will hang forever.
Do you think above scenario could happen?
I think the solution would be to invent another "static atomic_t context;" which is used only for fast path functions (clk_enable|disable). That should do the trick I think.
Ulf,
You are correct. In fact I have a branch that has two separate context pointers, one for mutex-protected functions and one for spinlock-protected functions. Somehow I managed to discard that change before settling on the final version that was published.
I'll add the change back in.
Thanks for the review, Mike
On Mon, Mar 18, 2013 at 01:15:51PM -0700, Mike Turquette wrote:
Quoting Ulf Hansson (2013-02-28 01:54:34)
On 28 February 2013 05:49, Mike Turquette mturquette@linaro.org wrote:
@@ -703,10 +744,29 @@ int clk_enable(struct clk *clk) unsigned long flags; int ret;
/* this call re-enters if it is from the same context */
if (spin_is_locked(&enable_lock) || mutex_is_locked(&prepare_lock)) {
if ((void *) atomic_read(&context) == get_current()) {
ret = __clk_enable(clk);
goto out;
}
}
I beleive the clk_enable|disable code will be racy. What do you think about this scenario:
- Thread 1, calls clk_prepare -> clk is not reentrant -> mutex_lock
-> set_context to thread1. 2. Thread 2, calls clk_enable -> above "if" will mean that get_current returns thread 1 context and then clk_enable continues -> spin_lock_irqsave -> set_context to thread 2. 3. Thread 1 continues and triggers a reentancy for clk_prepare -> clk is not reentrant (since thread 2 has set a new context) -> mutex_lock and we will hang forever.
Do you think above scenario could happen?
I think the solution would be to invent another "static atomic_t context;" which is used only for fast path functions (clk_enable|disable). That should do the trick I think.
Ulf,
You are correct. In fact I have a branch that has two separate context pointers, one for mutex-protected functions and one for spinlock-protected functions. Somehow I managed to discard that change before settling on the final version that was published.
Err.
Do not forget one very important point.
Any clock which has clk_enable() called on it must have had clk_prepare() already called _and_ completed. A second clk_prepare() call on the same clock should be a no-op other than to increase the prepare reference count on it.
If you do anything else, you are going to get into sticky problems.
Quoting Russell King - ARM Linux (2013-03-18 14:00:11)
On Mon, Mar 18, 2013 at 01:15:51PM -0700, Mike Turquette wrote:
Quoting Ulf Hansson (2013-02-28 01:54:34)
On 28 February 2013 05:49, Mike Turquette mturquette@linaro.org wrote:
@@ -703,10 +744,29 @@ int clk_enable(struct clk *clk) unsigned long flags; int ret;
/* this call re-enters if it is from the same context */
if (spin_is_locked(&enable_lock) || mutex_is_locked(&prepare_lock)) {
if ((void *) atomic_read(&context) == get_current()) {
ret = __clk_enable(clk);
goto out;
}
}
I beleive the clk_enable|disable code will be racy. What do you think about this scenario:
- Thread 1, calls clk_prepare -> clk is not reentrant -> mutex_lock
-> set_context to thread1. 2. Thread 2, calls clk_enable -> above "if" will mean that get_current returns thread 1 context and then clk_enable continues -> spin_lock_irqsave -> set_context to thread 2. 3. Thread 1 continues and triggers a reentancy for clk_prepare -> clk is not reentrant (since thread 2 has set a new context) -> mutex_lock and we will hang forever.
Do you think above scenario could happen?
I think the solution would be to invent another "static atomic_t context;" which is used only for fast path functions (clk_enable|disable). That should do the trick I think.
Ulf,
You are correct. In fact I have a branch that has two separate context pointers, one for mutex-protected functions and one for spinlock-protected functions. Somehow I managed to discard that change before settling on the final version that was published.
Err.
Do not forget one very important point.
Any clock which has clk_enable() called on it must have had clk_prepare() already called _and_ completed. A second clk_prepare() call on the same clock should be a no-op other than to increase the prepare reference count on it.
If you do anything else, you are going to get into sticky problems.
Correct usage of the api is of course still necessary. The reentrancy patch doesn't change api usage by drivers and does not violate the sequencing of clk_prepare/clk_enable and clk_disable/clk_unprepare. In Ulf's example thread 2 should have already called clk_prepare before calling clk_enable.
Ulf has correctly pointed out a bug in the locking/context logic due to having two distinct lock's for fast/slow operations. It will be fixed in the next verison.
Thanks, Mike
On Thu, 2013-02-28 at 12:49 +0800, Mike Turquette wrote:
Reentrancy into the clock framework from the clk.h api is highly desirable. This feature is necessary for clocks that are prepared and unprepared via i2c_transfer (which includes many PMICs and discrete audio chips) and it is also necessary for performing dynamic voltage & frequency scaling via clock rate-change notifiers.
This patch implements reentrancy by adding a global atomic_t which tracks the context of the current caller. Context in this case is the return value from get_current(). The clk.h api implementations are modified to first see if the relevant global lock is already held and if so compare the global context (set by whoever is holding the lock) against their own context (via a call to get_current()). If the two match then this function is a nested call from the one already holding the lock and we procede. If the context does not match then procede to call mutex_lock and busy-wait for the existing task to complete.
Thus this patch set does not increase concurrency for unrelated calls into the clock framework. Instead it simply allows reentrancy by the single task which is currently holding the global clock framework lock.
Thanks to Rajagoapl Venkat for the original idea to use get_current() and to David Brown for the suggestion to replace my previous rwlock scheme with atomic operations during code review at ELC 2013.
Signed-off-by: Mike Turquette mturquette@linaro.org Cc: Rajagopal Venkat rajagopal.venkat@linaro.org Cc: David Brown davidb@codeaurora.org
Hi Mike,
Will this single patch be accepted? I guess you might not merge the whole series but I think this one is useful, is it possible that you can send out this single patch (or just merge this one) as an improvement of CCF? Or you think otherwise?
Thanks, Bill
Quoting Bill Huang (2013-03-26 20:33:31)
On Thu, 2013-02-28 at 12:49 +0800, Mike Turquette wrote:
Reentrancy into the clock framework from the clk.h api is highly desirable. This feature is necessary for clocks that are prepared and unprepared via i2c_transfer (which includes many PMICs and discrete audio chips) and it is also necessary for performing dynamic voltage & frequency scaling via clock rate-change notifiers.
This patch implements reentrancy by adding a global atomic_t which tracks the context of the current caller. Context in this case is the return value from get_current(). The clk.h api implementations are modified to first see if the relevant global lock is already held and if so compare the global context (set by whoever is holding the lock) against their own context (via a call to get_current()). If the two match then this function is a nested call from the one already holding the lock and we procede. If the context does not match then procede to call mutex_lock and busy-wait for the existing task to complete.
Thus this patch set does not increase concurrency for unrelated calls into the clock framework. Instead it simply allows reentrancy by the single task which is currently holding the global clock framework lock.
Thanks to Rajagoapl Venkat for the original idea to use get_current() and to David Brown for the suggestion to replace my previous rwlock scheme with atomic operations during code review at ELC 2013.
Signed-off-by: Mike Turquette mturquette@linaro.org Cc: Rajagopal Venkat rajagopal.venkat@linaro.org Cc: David Brown davidb@codeaurora.org
Hi Mike,
Will this single patch be accepted? I guess you might not merge the whole series but I think this one is useful, is it possible that you can send out this single patch (or just merge this one) as an improvement of CCF? Or you think otherwise?
Bill,
Yes, I plan to merge this single patch for 3.10 and have posted a new version fixing the issue pointed out by Ulf. Please leave any review comments you have.
Thanks, Mike
Thanks, Bill
Dynamic voltage and frequency scaling (dvfs) is a common power saving technique in many of today's modern processors. This patch introduces a common clk rate-change notifier handler which scales voltage appropriately whenever clk_set_rate is called on an affected clock.
There are three prerequisites to using this feature:
1) the affected clocks must be using the common clk framework 2) voltage must be scaled using the regulator framework 3) clock frequency and regulator voltage values must be paired via the OPP library
If a platform or device meets these requirements then using the notifier handler is straightforward. A struct device is used as the basis for performing initial look-ups for clocks via clk_get and regulators via regulator_get. This means that notifiers are subscribed on a per-device basis and multiple devices can have notifiers subscribed to the same clock. Put another way, the voltage chosen for a rail during a call to clk_set_rate is a function of the device, not the clock.
Signed-off-by: Mike Turquette mturquette@linaro.org --- drivers/clk/Makefile | 1 + drivers/clk/dvfs.c | 125 ++++++++++++++++++++++++++++++++++++++++++++++++++ include/linux/clk.h | 27 ++++++++++- 3 files changed, 152 insertions(+), 1 deletion(-) create mode 100644 drivers/clk/dvfs.c
diff --git a/drivers/clk/Makefile b/drivers/clk/Makefile index e73b1d6..e720b7c 100644 --- a/drivers/clk/Makefile +++ b/drivers/clk/Makefile @@ -7,6 +7,7 @@ obj-$(CONFIG_COMMON_CLK) += clk-fixed-factor.o obj-$(CONFIG_COMMON_CLK) += clk-fixed-rate.o obj-$(CONFIG_COMMON_CLK) += clk-gate.o obj-$(CONFIG_COMMON_CLK) += clk-mux.o +obj-$(CONFIG_COMMON_CLK) += dvfs.o
# SoCs specific obj-$(CONFIG_ARCH_BCM2835) += clk-bcm2835.o diff --git a/drivers/clk/dvfs.c b/drivers/clk/dvfs.c new file mode 100644 index 0000000..d916d0b --- /dev/null +++ b/drivers/clk/dvfs.c @@ -0,0 +1,125 @@ +/* + * Copyright (C) 2011-2012 Linaro Ltd mturquette@linaro.org + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * Helper functions for dynamic voltage & frequency transitions using + * the OPP library. + */ + +#include <linux/clk.h> +#include <linux/regulator/consumer.h> +#include <linux/opp.h> +#include <linux/device.h> +#include <linux/slab.h> +#include <linux/module.h> + +/* + * XXX clk, regulator & tolerance should be stored in the OPP table? + */ +struct dvfs_info { + struct device *dev; + struct clk *clk; + struct regulator *reg; + int tol; + struct notifier_block nb; +}; + +#define to_dvfs_info(_nb) container_of(_nb, struct dvfs_info, nb) + +static int dvfs_clk_notifier_handler(struct notifier_block *nb, + unsigned long flags, void *data) +{ + struct clk_notifier_data *cnd = data; + struct dvfs_info *di = to_dvfs_info(nb); + int ret, volt_new, volt_old; + struct opp *opp; + + volt_old = regulator_get_voltage(di->reg); + rcu_read_lock(); + opp = opp_find_freq_floor(di->dev, &cnd->new_rate); + volt_new = opp_get_voltage(opp); + rcu_read_unlock(); + + /* scaling up? scale voltage before frequency */ + if (flags & PRE_RATE_CHANGE && cnd->new_rate > cnd->old_rate) { + dev_dbg(di->dev, "%s: %d mV --> %d mV\n", + __func__, volt_old, volt_new); + + ret = regulator_set_voltage_tol(di->reg, volt_new, di->tol); + + if (ret) { + dev_warn(di->dev, "%s: unable to scale voltage up.\n", + __func__); + return notifier_from_errno(ret); + } + } + + /* scaling down? scale voltage after frequency */ + if (flags & POST_RATE_CHANGE && cnd->new_rate < cnd->old_rate) { + dev_dbg(di->dev, "%s: %d mV --> %d mV\n", + __func__, volt_old, volt_new); + + ret = regulator_set_voltage_tol(di->reg, volt_new, di->tol); + + if (ret) { + dev_warn(di->dev, "%s: unable to scale voltage down.\n", + __func__); + return notifier_from_errno(ret); + } + } + + return NOTIFY_OK; +} + +struct dvfs_info *dvfs_clk_notifier_register(struct dvfs_info_init *dii) +{ + struct dvfs_info *di; + int ret = 0; + + if (!dii) + return ERR_PTR(-EINVAL); + + di = kzalloc(sizeof(struct dvfs_info), GFP_KERNEL); + if (!di) + return ERR_PTR(-ENOMEM); + + di->dev = dii->dev; + di->clk = clk_get(di->dev, dii->con_id); + if (IS_ERR(di->clk)) { + ret = -ENOMEM; + goto err; + } + + di->reg = regulator_get(di->dev, dii->reg_id); + if (IS_ERR(di->reg)) { + ret = -ENOMEM; + goto err; + } + + di->tol = dii->tol; + di->nb.notifier_call = dvfs_clk_notifier_handler; + + ret = clk_notifier_register(di->clk, &di->nb); + + if (ret) + goto err; + + return di; + +err: + kfree(di); + return ERR_PTR(ret); +} +EXPORT_SYMBOL_GPL(dvfs_clk_notifier_register); + +void dvfs_clk_notifier_unregister(struct dvfs_info *di) +{ + clk_notifier_unregister(di->clk, &di->nb); + clk_put(di->clk); + regulator_put(di->reg); + kfree(di); +} +EXPORT_SYMBOL_GPL(dvfs_clk_notifier_unregister); diff --git a/include/linux/clk.h b/include/linux/clk.h index b3ac22d..28d952f 100644 --- a/include/linux/clk.h +++ b/include/linux/clk.h @@ -78,9 +78,34 @@ struct clk_notifier_data { unsigned long new_rate; };
-int clk_notifier_register(struct clk *clk, struct notifier_block *nb); +/** + * struct dvfs_info_init - data needs to initialize struct dvfs_info + * @dev: device related to this frequency-voltage pair + * @con_id: string name of clock connection + * @reg_id: string name of regulator + * @tol: voltage tolerance for this device + * + * Provides the data needed to register a common dvfs sequence in a clk + * notifier handler. The clk and regulator lookups are stored in a + * private struct and the notifier handler is registered with the clk + * framework with a call to dvfs_clk_notifier_register. + * + * FIXME stuffing @tol here is a hack. It belongs in the opp table. + * Maybe clk & regulator will also live in the opp table some day. + */ +struct dvfs_info_init { + struct device *dev; + const char *con_id; + const char *reg_id; + int tol; +}; + +struct dvfs_info;
+int clk_notifier_register(struct clk *clk, struct notifier_block *nb); int clk_notifier_unregister(struct clk *clk, struct notifier_block *nb); +struct dvfs_info *dvfs_clk_notifier_register(struct dvfs_info_init *dii); +void dvfs_clk_notifier_unregister(struct dvfs_info *di);
#endif
Mike Turquette <mturquette@...> writes:
Dynamic voltage and frequency scaling (dvfs) is a common power saving technique in many of today's modern processors. This patch introduces a common clk rate-change notifier handler which scales voltage appropriately whenever clk_set_rate is called on an affected clock.
For DVFS to be useful, we might need to be notified on "clk_enable" and "clk_disable" since there are cases that driver will disable or enable clock directly but not calling clk_set_rate. Do you agree that notifier in common clock need to be improved?
There are three prerequisites to using this feature:
- the affected clocks must be using the common clk framework
- voltage must be scaled using the regulator framework
- clock frequency and regulator voltage values must be paired via the
OPP library
On Thu, 2013-02-28 at 12:49 +0800, Mike Turquette wrote:
Dynamic voltage and frequency scaling (dvfs) is a common power saving technique in many of today's modern processors. This patch introduces a common clk rate-change notifier handler which scales voltage appropriately whenever clk_set_rate is called on an affected clock.
I really think clk_enable and clk_disable should also be triggering notifier call and DVFS should act accordingly since there are cases drivers won't set clock rate but instead disable its clock directly, do you agree?
There are three prerequisites to using this feature:
- the affected clocks must be using the common clk framework
- voltage must be scaled using the regulator framework
- clock frequency and regulator voltage values must be paired via the
OPP library
Just a note, Tegra Core won't meet prerequisite #3 since each regulator voltage values is associated with clocks driving those many sub-HW blocks in it.
Quoting Bill Huang (2013-03-01 01:41:31)
On Thu, 2013-02-28 at 12:49 +0800, Mike Turquette wrote:
Dynamic voltage and frequency scaling (dvfs) is a common power saving technique in many of today's modern processors. This patch introduces a common clk rate-change notifier handler which scales voltage appropriately whenever clk_set_rate is called on an affected clock.
I really think clk_enable and clk_disable should also be triggering notifier call and DVFS should act accordingly since there are cases drivers won't set clock rate but instead disable its clock directly, do you agree?
Hi Bill,
I'll think about this. Perhaps a better solution would be to adapt these drivers to runtime PM. Then a call to runtime_pm_put() would result in a call to clk_disable(...) and regulator_set_voltage(...).
There is no performance-based equivalent to runtime PM, which is one reason why clk_set_rate is a likely entry point into dvfs. But for operations that have nice api's like runtime PM it would be better to use those interfaces and not overload the clk.h api unnecessarily.
There are three prerequisites to using this feature:
- the affected clocks must be using the common clk framework
- voltage must be scaled using the regulator framework
- clock frequency and regulator voltage values must be paired via the
OPP library
Just a note, Tegra Core won't meet prerequisite #3 since each regulator voltage values is associated with clocks driving those many sub-HW blocks in it.
This patch isn't the one and only way to perform dvfs. It is just a helper function for registering notifier handlers for systems that meet the above three requirements. Even if you do not use the OPP library there is no reason why you could not register your own rate-change notifier handler to implement dvfs using whatever lookup-table you use today.
And patches are welcome to extend the usefulness of this helper. I'd like as many people to benefit from this mechanism as possible.
At some point we should think hard about DT bindings for these operating points...
Regards, Mike
Quoting Mike Turquette (2013-03-01 10:22:34)
Quoting Bill Huang (2013-03-01 01:41:31)
On Thu, 2013-02-28 at 12:49 +0800, Mike Turquette wrote:
Dynamic voltage and frequency scaling (dvfs) is a common power saving technique in many of today's modern processors. This patch introduces a common clk rate-change notifier handler which scales voltage appropriately whenever clk_set_rate is called on an affected clock.
I really think clk_enable and clk_disable should also be triggering notifier call and DVFS should act accordingly since there are cases drivers won't set clock rate but instead disable its clock directly, do you agree?
Hi Bill,
I'll think about this. Perhaps a better solution would be to adapt these drivers to runtime PM. Then a call to runtime_pm_put() would result in a call to clk_disable(...) and regulator_set_voltage(...).
There is no performance-based equivalent to runtime PM, which is one reason why clk_set_rate is a likely entry point into dvfs. But for operations that have nice api's like runtime PM it would be better to use those interfaces and not overload the clk.h api unnecessarily.
Bill,
I wasn't thinking at all when I wrote this. Trying to rush to the airport I guess...
clk_enable() and clk_disable() must not sleep and all operations are done under a spinlock. So this rules out most use of notifiers. It is expected for some drivers to very aggressively enable/disable clocks in interrupt handlers so scaling voltage as a function of clk_{en|dis}able calls is also likely out of the question.
Some platforms have chosen to implement voltage scaling in their .prepare callbacks. I personally do not like this and still prefer drivers be adapted to runtime pm and let those callbacks handle voltage scaling along with clock handling.
Regards, Mike
There are three prerequisites to using this feature:
- the affected clocks must be using the common clk framework
- voltage must be scaled using the regulator framework
- clock frequency and regulator voltage values must be paired via the
OPP library
Just a note, Tegra Core won't meet prerequisite #3 since each regulator voltage values is associated with clocks driving those many sub-HW blocks in it.
This patch isn't the one and only way to perform dvfs. It is just a helper function for registering notifier handlers for systems that meet the above three requirements. Even if you do not use the OPP library there is no reason why you could not register your own rate-change notifier handler to implement dvfs using whatever lookup-table you use today.
And patches are welcome to extend the usefulness of this helper. I'd like as many people to benefit from this mechanism as possible.
At some point we should think hard about DT bindings for these operating points...
Regards, Mike
On Sat, 2013-03-02 at 04:48 +0800, Mike Turquette wrote:
Quoting Mike Turquette (2013-03-01 10:22:34)
Quoting Bill Huang (2013-03-01 01:41:31)
On Thu, 2013-02-28 at 12:49 +0800, Mike Turquette wrote:
Dynamic voltage and frequency scaling (dvfs) is a common power saving technique in many of today's modern processors. This patch introduces a common clk rate-change notifier handler which scales voltage appropriately whenever clk_set_rate is called on an affected clock.
I really think clk_enable and clk_disable should also be triggering notifier call and DVFS should act accordingly since there are cases drivers won't set clock rate but instead disable its clock directly, do you agree?
Hi Bill,
I'll think about this. Perhaps a better solution would be to adapt these drivers to runtime PM. Then a call to runtime_pm_put() would result in a call to clk_disable(...) and regulator_set_voltage(...).
There is no performance-based equivalent to runtime PM, which is one reason why clk_set_rate is a likely entry point into dvfs. But for operations that have nice api's like runtime PM it would be better to use those interfaces and not overload the clk.h api unnecessarily.
Bill,
I wasn't thinking at all when I wrote this. Trying to rush to the airport I guess...
clk_enable() and clk_disable() must not sleep and all operations are done under a spinlock. So this rules out most use of notifiers. It is expected for some drivers to very aggressively enable/disable clocks in interrupt handlers so scaling voltage as a function of clk_{en|dis}able calls is also likely out of the question.
Yeah for those existing drivers to call enable/disable clocks in interrupt have ruled out this, I didn't think through when I was asking.
Some platforms have chosen to implement voltage scaling in their .prepare callbacks. I personally do not like this and still prefer drivers be adapted to runtime pm and let those callbacks handle voltage scaling along with clock handling.
I think different SoC have different mechanisms or constraints on doing their DVFS, such as Tegra VDD_CORE rail, it supplies power to many devices and as a consequence each device do not have their own power rail to control, instead a central driver to handle/control this power rail is needed (to set voltage at the maximum of the requested voltage from all its sub-devices), so I'm wondering even if every drivers are doing DVFS through runtime pm, we're still having hole on knowing whether or not clocks of the interested devices are enabled/disabled at runtime, I'm not familiar with runtime pm and hence do not know if there is a mechanism to handle this, I'll study a bit. Thanks.
Regards, Mike
There are three prerequisites to using this feature:
- the affected clocks must be using the common clk framework
- voltage must be scaled using the regulator framework
- clock frequency and regulator voltage values must be paired via the
OPP library
Just a note, Tegra Core won't meet prerequisite #3 since each regulator voltage values is associated with clocks driving those many sub-HW blocks in it.
This patch isn't the one and only way to perform dvfs. It is just a helper function for registering notifier handlers for systems that meet the above three requirements. Even if you do not use the OPP library there is no reason why you could not register your own rate-change notifier handler to implement dvfs using whatever lookup-table you use today.
And patches are welcome to extend the usefulness of this helper. I'd like as many people to benefit from this mechanism as possible.
The extension is not so easy for us though since OPP library is assuming each device has a 1-1 mapping on its operating frequency and voltage.
At some point we should think hard about DT bindings for these operating points...
Regards, Mike
On Fri, Mar 01, 2013 at 06:55:54PM -0800, Bill Huang wrote:
On Sat, 2013-03-02 at 04:48 +0800, Mike Turquette wrote:
Quoting Mike Turquette (2013-03-01 10:22:34)
Quoting Bill Huang (2013-03-01 01:41:31)
On Thu, 2013-02-28 at 12:49 +0800, Mike Turquette wrote:
Dynamic voltage and frequency scaling (dvfs) is a common power saving technique in many of today's modern processors. This patch introduces a common clk rate-change notifier handler which scales voltage appropriately whenever clk_set_rate is called on an affected clock.
I really think clk_enable and clk_disable should also be triggering notifier call and DVFS should act accordingly since there are cases drivers won't set clock rate but instead disable its clock directly, do you agree?
Hi Bill,
I'll think about this. Perhaps a better solution would be to adapt these drivers to runtime PM. Then a call to runtime_pm_put() would result in a call to clk_disable(...) and regulator_set_voltage(...).
There is no performance-based equivalent to runtime PM, which is one reason why clk_set_rate is a likely entry point into dvfs. But for operations that have nice api's like runtime PM it would be better to use those interfaces and not overload the clk.h api unnecessarily.
Bill,
I wasn't thinking at all when I wrote this. Trying to rush to the airport I guess...
clk_enable() and clk_disable() must not sleep and all operations are done under a spinlock. So this rules out most use of notifiers. It is expected for some drivers to very aggressively enable/disable clocks in interrupt handlers so scaling voltage as a function of clk_{en|dis}able calls is also likely out of the question.
Yeah for those existing drivers to call enable/disable clocks in interrupt have ruled out this, I didn't think through when I was asking.
Some platforms have chosen to implement voltage scaling in their .prepare callbacks. I personally do not like this and still prefer drivers be adapted to runtime pm and let those callbacks handle voltage scaling along with clock handling.
Voltage scaling in clock notifiers seems similar. Clock and regulater embedded code into each other will cause things complicated.
I think different SoC have different mechanisms or constraints on doing their DVFS, such as Tegra VDD_CORE rail, it supplies power to many devices and as a consequence each device do not have their own power rail to control, instead a central driver to handle/control this power rail is needed (to set voltage at the maximum of the requested voltage from all its sub-devices), so I'm wondering even if every drivers are doing DVFS through runtime pm, we're still having hole on knowing whether or not clocks of the interested devices are enabled/disabled at runtime, I'm not familiar with runtime pm and hence do not know if there is a mechanism to handle this, I'll study a bit. Thanks.
Regards, Mike
There are three prerequisites to using this feature:
- the affected clocks must be using the common clk framework
- voltage must be scaled using the regulator framework
- clock frequency and regulator voltage values must be paired via the
OPP library
Just a note, Tegra Core won't meet prerequisite #3 since each regulator voltage values is associated with clocks driving those many sub-HW blocks in it.
This patch isn't the one and only way to perform dvfs. It is just a helper function for registering notifier handlers for systems that meet the above three requirements. Even if you do not use the OPP library there is no reason why you could not register your own rate-change notifier handler to implement dvfs using whatever lookup-table you use today.
And patches are welcome to extend the usefulness of this helper. I'd like as many people to benefit from this mechanism as possible.
The extension is not so easy for us though since OPP library is assuming each device has a 1-1 mapping on its operating frequency and voltage.
Is there someone working on OPP clock/regulator sets support?
Thanks Richard
At some point we should think hard about DT bindings for these operating points...
Regards, Mike
linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Quoting Richard Zhao (2013-03-02 00:22:19)
On Fri, Mar 01, 2013 at 06:55:54PM -0800, Bill Huang wrote:
On Sat, 2013-03-02 at 04:48 +0800, Mike Turquette wrote:
Quoting Mike Turquette (2013-03-01 10:22:34)
Quoting Bill Huang (2013-03-01 01:41:31)
On Thu, 2013-02-28 at 12:49 +0800, Mike Turquette wrote:
Dynamic voltage and frequency scaling (dvfs) is a common power saving technique in many of today's modern processors. This patch introduces a common clk rate-change notifier handler which scales voltage appropriately whenever clk_set_rate is called on an affected clock.
I really think clk_enable and clk_disable should also be triggering notifier call and DVFS should act accordingly since there are cases drivers won't set clock rate but instead disable its clock directly, do you agree?
Hi Bill,
I'll think about this. Perhaps a better solution would be to adapt these drivers to runtime PM. Then a call to runtime_pm_put() would result in a call to clk_disable(...) and regulator_set_voltage(...).
There is no performance-based equivalent to runtime PM, which is one reason why clk_set_rate is a likely entry point into dvfs. But for operations that have nice api's like runtime PM it would be better to use those interfaces and not overload the clk.h api unnecessarily.
Bill,
I wasn't thinking at all when I wrote this. Trying to rush to the airport I guess...
clk_enable() and clk_disable() must not sleep and all operations are done under a spinlock. So this rules out most use of notifiers. It is expected for some drivers to very aggressively enable/disable clocks in interrupt handlers so scaling voltage as a function of clk_{en|dis}able calls is also likely out of the question.
Yeah for those existing drivers to call enable/disable clocks in interrupt have ruled out this, I didn't think through when I was asking.
Some platforms have chosen to implement voltage scaling in their .prepare callbacks. I personally do not like this and still prefer drivers be adapted to runtime pm and let those callbacks handle voltage scaling along with clock handling.
Voltage scaling in clock notifiers seems similar. Clock and regulater embedded code into each other will cause things complicated.
Hi Richard,
Sorry, I do not follow the above statement. Can you clarify what you mean?
I think different SoC have different mechanisms or constraints on doing their DVFS, such as Tegra VDD_CORE rail, it supplies power to many devices and as a consequence each device do not have their own power rail to control, instead a central driver to handle/control this power rail is needed (to set voltage at the maximum of the requested voltage from all its sub-devices), so I'm wondering even if every drivers are doing DVFS through runtime pm, we're still having hole on knowing whether or not clocks of the interested devices are enabled/disabled at runtime, I'm not familiar with runtime pm and hence do not know if there is a mechanism to handle this, I'll study a bit. Thanks.
Regards, Mike
There are three prerequisites to using this feature:
- the affected clocks must be using the common clk framework
- voltage must be scaled using the regulator framework
- clock frequency and regulator voltage values must be paired via the
OPP library
Just a note, Tegra Core won't meet prerequisite #3 since each regulator voltage values is associated with clocks driving those many sub-HW blocks in it.
This patch isn't the one and only way to perform dvfs. It is just a helper function for registering notifier handlers for systems that meet the above three requirements. Even if you do not use the OPP library there is no reason why you could not register your own rate-change notifier handler to implement dvfs using whatever lookup-table you use today.
And patches are welcome to extend the usefulness of this helper. I'd like as many people to benefit from this mechanism as possible.
The extension is not so easy for us though since OPP library is assuming each device has a 1-1 mapping on its operating frequency and voltage.
Is there someone working on OPP clock/regulator sets support?
No, but I'm going to bring this up at LCA on Wednesday. It would be nice to have some DT bindings for declaring operating points that tie clocks & regulators together.
Regards, Mike
Thanks Richard
At some point we should think hard about DT bindings for these operating points...
Regards, Mike
linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Hi Mike,
On Sun, Mar 03, 2013 at 02:54:24AM -0800, Mike Turquette wrote:
Quoting Richard Zhao (2013-03-02 00:22:19)
On Fri, Mar 01, 2013 at 06:55:54PM -0800, Bill Huang wrote:
On Sat, 2013-03-02 at 04:48 +0800, Mike Turquette wrote:
Quoting Mike Turquette (2013-03-01 10:22:34)
Quoting Bill Huang (2013-03-01 01:41:31)
On Thu, 2013-02-28 at 12:49 +0800, Mike Turquette wrote: > Dynamic voltage and frequency scaling (dvfs) is a common power saving > technique in many of today's modern processors. This patch introduces a > common clk rate-change notifier handler which scales voltage > appropriately whenever clk_set_rate is called on an affected clock.
I really think clk_enable and clk_disable should also be triggering notifier call and DVFS should act accordingly since there are cases drivers won't set clock rate but instead disable its clock directly, do you agree? >
Hi Bill,
I'll think about this. Perhaps a better solution would be to adapt these drivers to runtime PM. Then a call to runtime_pm_put() would result in a call to clk_disable(...) and regulator_set_voltage(...).
There is no performance-based equivalent to runtime PM, which is one reason why clk_set_rate is a likely entry point into dvfs. But for operations that have nice api's like runtime PM it would be better to use those interfaces and not overload the clk.h api unnecessarily.
Bill,
I wasn't thinking at all when I wrote this. Trying to rush to the airport I guess...
clk_enable() and clk_disable() must not sleep and all operations are done under a spinlock. So this rules out most use of notifiers. It is expected for some drivers to very aggressively enable/disable clocks in interrupt handlers so scaling voltage as a function of clk_{en|dis}able calls is also likely out of the question.
Yeah for those existing drivers to call enable/disable clocks in interrupt have ruled out this, I didn't think through when I was asking.
Some platforms have chosen to implement voltage scaling in their .prepare callbacks. I personally do not like this and still prefer drivers be adapted to runtime pm and let those callbacks handle voltage scaling along with clock handling.
Voltage scaling in clock notifiers seems similar. Clock and regulater embedded code into each other will cause things complicated.
Hi Richard,
Sorry, I do not follow the above statement. Can you clarify what you mean?
As we have agreement that a operating point may have multiple clocks and regulators, this patch is impossible to support multi clocks. And it might mislead dvfs implementer to use clock notifier. It may be good to have unified api like dvfs_set_opp(opp), or drivers can handle clocks and regulators theirselves which is more flexible. What do you think?
Thanks Richard
Quoting Richard Zhao (2013-03-03 05:27:52)
Hi Mike,
On Sun, Mar 03, 2013 at 02:54:24AM -0800, Mike Turquette wrote:
Quoting Richard Zhao (2013-03-02 00:22:19)
On Fri, Mar 01, 2013 at 06:55:54PM -0800, Bill Huang wrote:
On Sat, 2013-03-02 at 04:48 +0800, Mike Turquette wrote:
Quoting Mike Turquette (2013-03-01 10:22:34)
Quoting Bill Huang (2013-03-01 01:41:31) > On Thu, 2013-02-28 at 12:49 +0800, Mike Turquette wrote: > > Dynamic voltage and frequency scaling (dvfs) is a common power saving > > technique in many of today's modern processors. This patch introduces a > > common clk rate-change notifier handler which scales voltage > > appropriately whenever clk_set_rate is called on an affected clock. > > I really think clk_enable and clk_disable should also be triggering > notifier call and DVFS should act accordingly since there are cases > drivers won't set clock rate but instead disable its clock directly, do > you agree? > >
Hi Bill,
I'll think about this. Perhaps a better solution would be to adapt these drivers to runtime PM. Then a call to runtime_pm_put() would result in a call to clk_disable(...) and regulator_set_voltage(...).
There is no performance-based equivalent to runtime PM, which is one reason why clk_set_rate is a likely entry point into dvfs. But for operations that have nice api's like runtime PM it would be better to use those interfaces and not overload the clk.h api unnecessarily.
Bill,
I wasn't thinking at all when I wrote this. Trying to rush to the airport I guess...
clk_enable() and clk_disable() must not sleep and all operations are done under a spinlock. So this rules out most use of notifiers. It is expected for some drivers to very aggressively enable/disable clocks in interrupt handlers so scaling voltage as a function of clk_{en|dis}able calls is also likely out of the question.
Yeah for those existing drivers to call enable/disable clocks in interrupt have ruled out this, I didn't think through when I was asking.
Some platforms have chosen to implement voltage scaling in their .prepare callbacks. I personally do not like this and still prefer drivers be adapted to runtime pm and let those callbacks handle voltage scaling along with clock handling.
Voltage scaling in clock notifiers seems similar. Clock and regulater embedded code into each other will cause things complicated.
Hi Richard,
Sorry, I do not follow the above statement. Can you clarify what you mean?
As we have agreement that a operating point may have multiple clocks and regulators, this patch is impossible to support multi clocks. And it might mislead dvfs implementer to use clock notifier. It may be good to have unified api like dvfs_set_opp(opp), or drivers can handle clocks and regulators theirselves which is more flexible. What do you think?
Yes, there is a long-standing question whether clk_set_rate is sufficient to cover dvfs needs or if a new api is required. There are many possible solutions.
One solution is to use clk_set_rate and use the rate-change notifiers to call clk_set_rate on the other required clocks. This is graceful from the perspective of the driver since the driver author only has to think about directly managing the clock(s) for that device and the rest is managed automagically. It is more complicated for the platform integrator that must make sure the automagical stuff is set up correctly. This might be considered a more "distributed" approach.
Another solution would be a new api which calls clk_set_rate individually on the affected clocks (e.g. a functional clk, an async bridge child clock, and some other related non-child clock). This is less complicated for the platform integrator and represents a more "centralized" approach. It is less graceful for the device driver author who must learn a new api and decide whether to call the new api or to call clk_set_rate.
A hybrid solution might be to set a flag on a clock (e.g. CLK_SET_RATE_DVFS) which tells the clk framework that this clock is special and clk_set_rate should call dvfs_set_opp(), where dvfs_set_opp() is never exposed to drivers directly.
None of the above solutions are related to your point about scaling voltage from clk_set_rate. Voltage may still be scaled from the clock rate-change notifier despite the method chose to scale groups of clocks.
Regards, Mike
Thanks Richard
On 4 March 2013 08:25, Mike Turquette mturquette@linaro.org wrote:
Quoting Richard Zhao (2013-03-03 05:27:52)
Hi Mike,
On Sun, Mar 03, 2013 at 02:54:24AM -0800, Mike Turquette wrote:
Quoting Richard Zhao (2013-03-02 00:22:19)
On Fri, Mar 01, 2013 at 06:55:54PM -0800, Bill Huang wrote:
On Sat, 2013-03-02 at 04:48 +0800, Mike Turquette wrote:
Quoting Mike Turquette (2013-03-01 10:22:34) > Quoting Bill Huang (2013-03-01 01:41:31) > > On Thu, 2013-02-28 at 12:49 +0800, Mike Turquette wrote: > > > Dynamic voltage and frequency scaling (dvfs) is a common power saving > > > technique in many of today's modern processors. This patch introduces a > > > common clk rate-change notifier handler which scales voltage > > > appropriately whenever clk_set_rate is called on an affected clock. > > > > I really think clk_enable and clk_disable should also be triggering > > notifier call and DVFS should act accordingly since there are cases > > drivers won't set clock rate but instead disable its clock directly, do > > you agree? > > > > > Hi Bill, > > I'll think about this. Perhaps a better solution would be to adapt > these drivers to runtime PM. Then a call to runtime_pm_put() would > result in a call to clk_disable(...) and regulator_set_voltage(...). > > There is no performance-based equivalent to runtime PM, which is one > reason why clk_set_rate is a likely entry point into dvfs. But for > operations that have nice api's like runtime PM it would be better to > use those interfaces and not overload the clk.h api unnecessarily. >
Bill,
I wasn't thinking at all when I wrote this. Trying to rush to the airport I guess...
clk_enable() and clk_disable() must not sleep and all operations are done under a spinlock. So this rules out most use of notifiers. It is expected for some drivers to very aggressively enable/disable clocks in interrupt handlers so scaling voltage as a function of clk_{en|dis}able calls is also likely out of the question.
Yeah for those existing drivers to call enable/disable clocks in interrupt have ruled out this, I didn't think through when I was asking.
Some platforms have chosen to implement voltage scaling in their .prepare callbacks. I personally do not like this and still prefer drivers be adapted to runtime pm and let those callbacks handle voltage scaling along with clock handling.
Voltage scaling in clock notifiers seems similar. Clock and regulater embedded code into each other will cause things complicated.
Hi Richard,
Sorry, I do not follow the above statement. Can you clarify what you mean?
As we have agreement that a operating point may have multiple clocks and regulators, this patch is impossible to support multi clocks. And it might mislead dvfs implementer to use clock notifier. It may be good to have unified api like dvfs_set_opp(opp), or drivers can handle clocks and regulators theirselves which is more flexible. What do you think?
Yes, there is a long-standing question whether clk_set_rate is sufficient to cover dvfs needs or if a new api is required. There are many possible solutions.
One solution is to use clk_set_rate and use the rate-change notifiers to call clk_set_rate on the other required clocks. This is graceful from the perspective of the driver since the driver author only has to think about directly managing the clock(s) for that device and the rest is managed automagically. It is more complicated for the platform integrator that must make sure the automagical stuff is set up correctly. This might be considered a more "distributed" approach.
From my point of view, I see some concern with this solution. Mainly
because of a lot of complexity with regards to DVFS will be pushed down to be handled by each an every driver. Probably it will be okay for SoC specific drivers but likely not for cross SoC drivers, if you see what I mean.
Another solution would be a new api which calls clk_set_rate individually on the affected clocks (e.g. a functional clk, an async bridge child clock, and some other related non-child clock). This is less complicated for the platform integrator and represents a more "centralized" approach. It is less graceful for the device driver author who must learn a new api and decide whether to call the new api or to call clk_set_rate.
This could be somewhat more preferred than the first solution. Likely less code in each driver but still "DVFS intelligence" need to exist there.
A hybrid solution might be to set a flag on a clock (e.g. CLK_SET_RATE_DVFS) which tells the clk framework that this clock is special and clk_set_rate should call dvfs_set_opp(), where dvfs_set_opp() is never exposed to drivers directly.
I like the direction of were this idea is going. In principle that will mean that you actually can do DVFS handling from clk_prepare|unprepare as well, which I think is a wanted feature as well. Moreover, drivers do in general not need to bother about DVFS which in the end should be a good, right?
For reference, from a ux500 SoC perspective, we have internal code - not upstreamed yet, which implements a specific "OPP clock" type. No changes has been done to the common clk framework for this. The "OPP clock" clk hw, utilizes the OPP library to find out what opp to chose for a certain frequency and then request a change if needed. To make this solution really fly we do require the clk API to be re-entrant since that is needed to be able to update the OPP. Moreover, it would be possible to make use of the clk_prepare|unprepare callbacks for this clk hw to also handle OPP changes.
None of the above solutions are related to your point about scaling voltage from clk_set_rate. Voltage may still be scaled from the clock rate-change notifier despite the method chose to scale groups of clocks.
Regards, Mike
Thanks Richard
linaro-dev mailing list linaro-dev@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-dev
Kind regards Ulf Hansson
On 03/01/2013 02:41 AM, Bill Huang wrote:
On Thu, 2013-02-28 at 12:49 +0800, Mike Turquette wrote:
Dynamic voltage and frequency scaling (dvfs) is a common power saving technique in many of today's modern processors. This patch introduces a common clk rate-change notifier handler which scales voltage appropriately whenever clk_set_rate is called on an affected clock.
I really think clk_enable and clk_disable should also be triggering notifier call and DVFS should act accordingly since there are cases drivers won't set clock rate but instead disable its clock directly, do you agree?
There are three prerequisites to using this feature:
- the affected clocks must be using the common clk framework
- voltage must be scaled using the regulator framework
- clock frequency and regulator voltage values must be paired via the
OPP library
Just a note, Tegra Core won't meet prerequisite #3 since each regulator voltage values is associated with clocks driving those many sub-HW blocks in it.
Perhaps that "just" means extending the dvfs.c code here to iterate over each clock consumer (rather than each clock provider), and having each set a minimum voltage (rather than a specific voltage), and having the regulator core apply the maximum of those minimum constraints?
Or something like that anyway.
On Sat, 2013-03-02 at 04:49 +0800, Stephen Warren wrote:
On 03/01/2013 02:41 AM, Bill Huang wrote:
On Thu, 2013-02-28 at 12:49 +0800, Mike Turquette wrote:
There are three prerequisites to using this feature:
- the affected clocks must be using the common clk framework
- voltage must be scaled using the regulator framework
- clock frequency and regulator voltage values must be paired via the
OPP library
Just a note, Tegra Core won't meet prerequisite #3 since each regulator voltage values is associated with clocks driving those many sub-HW blocks in it.
Perhaps that "just" means extending the dvfs.c code here to iterate over each clock consumer (rather than each clock provider), and having each set a minimum voltage (rather than a specific voltage), and having the regulator core apply the maximum of those minimum constraints?
Or something like that anyway.
Thanks, I'll think about this or maybe study a bit, it sounds like we can leverage existing api in regulator framework (which I don't know) to do what you've proposed, please clarify if I misunderstand.
-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
On 02/28/2013 05:49 AM, Mike Turquette wrote:
Dynamic voltage and frequency scaling (dvfs) is a common power saving technique in many of today's modern processors. This patch introduces a common clk rate-change notifier handler which scales voltage appropriately whenever clk_set_rate is called on an affected clock.
There are three prerequisites to using this feature:
- the affected clocks must be using the common clk framework
- voltage must be scaled using the regulator framework
- clock frequency and regulator voltage values must be paired via the
OPP library
If a platform or device meets these requirements then using the notifier handler is straightforward. A struct device is used as the basis for performing initial look-ups for clocks via clk_get and regulators via regulator_get. This means that notifiers are subscribed on a per-device basis and multiple devices can have notifiers subscribed to the same clock. Put another way, the voltage chosen for a rail during a call to clk_set_rate is a function of the device, not the clock.
Signed-off-by: Mike Turquette mturquette@linaro.org
[...]
+struct dvfs_info *dvfs_clk_notifier_register(struct dvfs_info_init *dii) +{
- struct dvfs_info *di;
- int ret = 0;
- if (!dii)
return ERR_PTR(-EINVAL);
- di = kzalloc(sizeof(struct dvfs_info), GFP_KERNEL);
- if (!di)
return ERR_PTR(-ENOMEM);
- di->dev = dii->dev;
- di->clk = clk_get(di->dev, dii->con_id);
- if (IS_ERR(di->clk)) {
ret = -ENOMEM;
goto err;
- }
- di->reg = regulator_get(di->dev, dii->reg_id);
- if (IS_ERR(di->reg)) {
ret = -ENOMEM;
goto err;
- }
- di->tol = dii->tol;
- di->nb.notifier_call = dvfs_clk_notifier_handler;
- ret = clk_notifier_register(di->clk, &di->nb);
- if (ret)
goto err;
Shouldn't regulator_put() and clk_put() be called in the error path?
- return di;
+err:
- kfree(di);
- return ERR_PTR(ret);
+} +EXPORT_SYMBOL_GPL(dvfs_clk_notifier_register);
+void dvfs_clk_notifier_unregister(struct dvfs_info *di) +{
- clk_notifier_unregister(di->clk, &di->nb);
- clk_put(di->clk);
- regulator_put(di->reg);
- kfree(di);
+} +EXPORT_SYMBOL_GPL(dvfs_clk_notifier_unregister);
Regards, Francesco
Hi Mike
+/*
- Copyright (C) 2011-2012 Linaro Ltd mturquette@linaro.org
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License version 2 as
- published by the Free Software Foundation.
- Helper functions for dynamic voltage & frequency transitions using
- the OPP library.
- */
+#include <linux/clk.h> +#include <linux/regulator/consumer.h> +#include <linux/opp.h> +#include <linux/device.h> +#include <linux/slab.h> +#include <linux/module.h>
+/*
- XXX clk, regulator & tolerance should be stored in the OPP table?
- */
+struct dvfs_info {
- struct device *dev;
- struct clk *clk;
- struct regulator *reg;
- int tol;
- struct notifier_block nb;
+};
+#define to_dvfs_info(_nb) container_of(_nb, struct dvfs_info, nb)
+static int dvfs_clk_notifier_handler(struct notifier_block *nb,
unsigned long flags, void *data)
+{
- struct clk_notifier_data *cnd = data;
- struct dvfs_info *di = to_dvfs_info(nb);
- int ret, volt_new, volt_old;
- struct opp *opp;
- volt_old = regulator_get_voltage(di->reg);
- rcu_read_lock();
- opp = opp_find_freq_floor(di->dev, &cnd->new_rate);
I think here should be opp_find_freq_ceil(). Let's imagine we have a following OPP table for some device: 1 - <100MHz - 1.0V> 2 - <200MHz - 1.2V> 3 - <400MHz - 1.4V>
If device driver scales clock to 150MHz, then OPP #1 will be chosen. This will lead to configuration <150MHz - 1.0V> that may be unstable. It would be better to ceil and end-up with <150MHz - 1.2V>.
- volt_new = opp_get_voltage(opp);
- rcu_read_unlock();
- /* scaling up? scale voltage before frequency */
- if (flags & PRE_RATE_CHANGE && cnd->new_rate > cnd->old_rate) {
opp_find_freq_*() functions can update cnd->new_rate, so you may compare not exactly what you are expecting.
dev_dbg(di->dev, "%s: %d mV --> %d mV\n",
__func__, volt_old, volt_new);
ret = regulator_set_voltage_tol(di->reg, volt_new, di->tol);
I may not have a deep understanding of regulator framework, but looks like regulator_set_voltage_tol() is not the right API here. As per my understanding regulator framework should aggregate voltage request from different consumers of the particular regulator.
Let's say there are two consumers of VDD regulator. First device set 1.0V. It will map to range 0.98-1.02V (2% tolerance). If second device will try to set 1.3V it will fail because range 1.28-1.32V does not overlap with the first request.
Maybe better to set upper limit to max OPP voltage or just use INT_MAX.
if (ret) {
dev_warn(di->dev, "%s: unable to scale voltage up.\n",
__func__);
return notifier_from_errno(ret);
}
- }
- /* scaling down? scale voltage after frequency */
- if (flags & POST_RATE_CHANGE && cnd->new_rate < cnd->old_rate) {
dev_dbg(di->dev, "%s: %d mV --> %d mV\n",
__func__, volt_old, volt_new);
ret = regulator_set_voltage_tol(di->reg, volt_new, di->tol);
if (ret) {
dev_warn(di->dev, "%s: unable to scale voltage down.\n",
__func__);
return notifier_from_errno(ret);
}
- }
- return NOTIFY_OK;
+}
+struct dvfs_info *dvfs_clk_notifier_register(struct dvfs_info_init *dii) +{
- struct dvfs_info *di;
- int ret = 0;
- if (!dii)
return ERR_PTR(-EINVAL);
- di = kzalloc(sizeof(struct dvfs_info), GFP_KERNEL);
- if (!di)
return ERR_PTR(-ENOMEM);
- di->dev = dii->dev;
- di->clk = clk_get(di->dev, dii->con_id);
- if (IS_ERR(di->clk)) {
ret = -ENOMEM;
goto err;
- }
- di->reg = regulator_get(di->dev, dii->reg_id);
- if (IS_ERR(di->reg)) {
ret = -ENOMEM;
goto err;
- }
- di->tol = dii->tol;
- di->nb.notifier_call = dvfs_clk_notifier_handler;
- ret = clk_notifier_register(di->clk, &di->nb);
- if (ret)
goto err;
- return di;
+err:
- kfree(di);
- return ERR_PTR(ret);
+} +EXPORT_SYMBOL_GPL(dvfs_clk_notifier_register);
+void dvfs_clk_notifier_unregister(struct dvfs_info *di) +{
- clk_notifier_unregister(di->clk, &di->nb);
- clk_put(di->clk);
- regulator_put(di->reg);
- kfree(di);
+} +EXPORT_SYMBOL_GPL(dvfs_clk_notifier_unregister); diff --git a/include/linux/clk.h b/include/linux/clk.h index b3ac22d..28d952f 100644 --- a/include/linux/clk.h +++ b/include/linux/clk.h @@ -78,9 +78,34 @@ struct clk_notifier_data { unsigned long new_rate; };
-int clk_notifier_register(struct clk *clk, struct notifier_block *nb); +/**
- struct dvfs_info_init - data needs to initialize struct dvfs_info
- @dev: device related to this frequency-voltage pair
- @con_id: string name of clock connection
- @reg_id: string name of regulator
- @tol: voltage tolerance for this device
- Provides the data needed to register a common dvfs sequence in a clk
- notifier handler. The clk and regulator lookups are stored in a
- private struct and the notifier handler is registered with the clk
- framework with a call to dvfs_clk_notifier_register.
- FIXME stuffing @tol here is a hack. It belongs in the opp table.
- Maybe clk & regulator will also live in the opp table some day.
- */
+struct dvfs_info_init {
- struct device *dev;
- const char *con_id;
- const char *reg_id;
- int tol;
+};
+struct dvfs_info;
+int clk_notifier_register(struct clk *clk, struct notifier_block *nb); int clk_notifier_unregister(struct clk *clk, struct notifier_block *nb); +struct dvfs_info *dvfs_clk_notifier_register(struct dvfs_info_init *dii); +void dvfs_clk_notifier_unregister(struct dvfs_info *di);
#endif
This patch moves direct control of the MPU voltage regulator out of the cpufreq driver .target callback and instead uses the common dvfs clk rate-change notifier infrastructure.
Ideally it would be nice to reduce the .target callback for omap's cpufreq driver to a simple call to clk_set_rate. For now there is still some other stuff needed there (jiffies per loop, rounding the rate, etc etc).
Signed-off-by: Mike Turquette mturquette@linaro.org --- drivers/cpufreq/omap-cpufreq.c | 82 ++++++++++------------------------------ 1 file changed, 20 insertions(+), 62 deletions(-)
diff --git a/drivers/cpufreq/omap-cpufreq.c b/drivers/cpufreq/omap-cpufreq.c index 1f3417a..6bec1c4 100644 --- a/drivers/cpufreq/omap-cpufreq.c +++ b/drivers/cpufreq/omap-cpufreq.c @@ -37,7 +37,7 @@ static struct cpufreq_frequency_table *freq_table; static atomic_t freq_table_users = ATOMIC_INIT(0); static struct clk *mpu_clk; static struct device *mpu_dev; -static struct regulator *mpu_reg; +static struct dvfs_info *di;
static int omap_verify_speed(struct cpufreq_policy *policy) { @@ -62,10 +62,9 @@ static int omap_target(struct cpufreq_policy *policy, unsigned int relation) { unsigned int i; - int r, ret = 0; + int ret = 0; struct cpufreq_freqs freqs; - struct opp *opp; - unsigned long freq, volt = 0, volt_old = 0, tol = 0; + unsigned long freq;
if (!freq_table) { dev_err(mpu_dev, "%s: cpu%d: no freq table!\n", __func__, @@ -109,50 +108,13 @@ static int omap_target(struct cpufreq_policy *policy, } freq = ret;
- if (mpu_reg) { - opp = opp_find_freq_ceil(mpu_dev, &freq); - if (IS_ERR(opp)) { - dev_err(mpu_dev, "%s: unable to find MPU OPP for %d\n", - __func__, freqs.new); - return -EINVAL; - } - volt = opp_get_voltage(opp); - tol = volt * OPP_TOLERANCE / 100; - volt_old = regulator_get_voltage(mpu_reg); - } - - dev_dbg(mpu_dev, "cpufreq-omap: %u MHz, %ld mV --> %u MHz, %ld mV\n", - freqs.old / 1000, volt_old ? volt_old / 1000 : -1, - freqs.new / 1000, volt ? volt / 1000 : -1); - - /* scaling up? scale voltage before frequency */ - if (mpu_reg && (freqs.new > freqs.old)) { - r = regulator_set_voltage(mpu_reg, volt - tol, volt + tol); - if (r < 0) { - dev_warn(mpu_dev, "%s: unable to scale voltage up.\n", - __func__); - freqs.new = freqs.old; - goto done; - } - } + dev_dbg(mpu_dev, "cpufreq-omap: %u MHz --> %u MHz\n", + freqs.old / 1000, freqs.new / 1000);
ret = clk_set_rate(mpu_clk, freqs.new * 1000);
- /* scaling down? scale voltage after frequency */ - if (mpu_reg && (freqs.new < freqs.old)) { - r = regulator_set_voltage(mpu_reg, volt - tol, volt + tol); - if (r < 0) { - dev_warn(mpu_dev, "%s: unable to scale voltage down.\n", - __func__); - ret = clk_set_rate(mpu_clk, freqs.old * 1000); - freqs.new = freqs.old; - goto done; - } - } - freqs.new = omap_getspeed(policy->cpu);
-done: /* notifiers */ for_each_cpu(i, policy->cpus) { freqs.cpu = i; @@ -172,10 +134,6 @@ static int __cpuinit omap_cpu_init(struct cpufreq_policy *policy) { int result = 0;
- mpu_clk = clk_get(NULL, "cpufreq_ck"); - if (IS_ERR(mpu_clk)) - return PTR_ERR(mpu_clk); - if (policy->cpu >= NR_CPUS) { result = -EINVAL; goto fail_ck; @@ -253,34 +211,34 @@ static struct cpufreq_driver omap_driver = {
static int __init omap_cpufreq_init(void) { + struct dvfs_info_init dii; + + mpu_clk = clk_get(NULL, "cpufreq_ck"); + if (IS_ERR(mpu_clk)) + return PTR_ERR(mpu_clk); + mpu_dev = get_cpu_device(0); if (!mpu_dev) { pr_warning("%s: unable to get the mpu device\n", __func__); return -EINVAL; }
- mpu_reg = regulator_get(mpu_dev, "vcc"); - if (IS_ERR(mpu_reg)) { - pr_warning("%s: unable to get MPU regulator\n", __func__); - mpu_reg = NULL; - } else { - /* - * Ensure physical regulator is present. - * (e.g. could be dummy regulator.) - */ - if (regulator_get_voltage(mpu_reg) < 0) { - pr_warn("%s: physical regulator not present for MPU\n", + dii.dev = mpu_dev; + dii.con_id = "cpufreq_ck"; + dii.reg_id = "vcc"; + dii.tol = OPP_TOLERANCE; + + di = dvfs_clk_notifier_register(&dii); + if (IS_ERR(di)) + pr_warning("%s: failed to register dvfs clk notifier\n", __func__); - regulator_put(mpu_reg); - mpu_reg = NULL; - } - }
return cpufreq_register_driver(&omap_driver); }
static void __exit omap_cpufreq_exit(void) { + dvfs_clk_notifier_unregister(di); cpufreq_unregister_driver(&omap_driver); }
This is a silly patch that demonstrates calling clk_set_parent from within a .set_rate callback, which itself was called by clk_set_rate. It may make your board burst into flames or otherwise void various warrantees.
I do not suggest that the OMAP folks take this approach in unless they really want to. Instead it was a way for me to increase code coverage while testing the reentrancy changes to the core clock framework.
Changes in this patch include removing __clk_prepare and __clk_unprepare from omap3_noncore_dpll_set_rate and using the (now reentrant) clk_prepare & clk_unprepare versions. Most importantly this patch introduces omap3_noncore_dpll_set_parent and adds it to the clk_ops for all OMAP3+ DPLLs.
The net gain is that on OMAP4 platforms it is now possible to call clk_set_parent(some_dpll_ck, ...) in order to change the PLL input from the reference clock to the bypass clock, and vice versa.
omap3_noncore_dpll_set_rate is modified to call clk_set_parent when appropriate as a way to test reentrancy.
Not-signed-off-by: Mike Turquette mturquette@linaro.org --- arch/arm/mach-omap2/cclock44xx_data.c | 1 + arch/arm/mach-omap2/clock.h | 1 + arch/arm/mach-omap2/dpll3xxx.c | 107 +++++++++++++++++++++++++-------- 3 files changed, 84 insertions(+), 25 deletions(-)
diff --git a/arch/arm/mach-omap2/cclock44xx_data.c b/arch/arm/mach-omap2/cclock44xx_data.c index 5789a5e..df5da7f 100644 --- a/arch/arm/mach-omap2/cclock44xx_data.c +++ b/arch/arm/mach-omap2/cclock44xx_data.c @@ -386,6 +386,7 @@ static const struct clk_ops dpll_ck_ops = { .round_rate = &omap2_dpll_round_rate, .set_rate = &omap3_noncore_dpll_set_rate, .get_parent = &omap2_init_dpll_parent, + .set_parent = &omap3_noncore_dpll_set_parent, };
static struct clk_hw_omap dpll_iva_ck_hw = { diff --git a/arch/arm/mach-omap2/clock.h b/arch/arm/mach-omap2/clock.h index b402048..1cf43a5 100644 --- a/arch/arm/mach-omap2/clock.h +++ b/arch/arm/mach-omap2/clock.h @@ -367,6 +367,7 @@ long omap2_dpll_round_rate(struct clk_hw *hw, unsigned long target_rate, unsigned long omap3_dpll_recalc(struct clk_hw *hw, unsigned long parent_rate); int omap3_noncore_dpll_enable(struct clk_hw *hw); void omap3_noncore_dpll_disable(struct clk_hw *hw); +int omap3_noncore_dpll_set_parent(struct clk_hw *hw, u8 index); int omap3_noncore_dpll_set_rate(struct clk_hw *hw, unsigned long rate, unsigned long parent_rate); u32 omap3_dpll_autoidle_read(struct clk_hw_omap *clk); diff --git a/arch/arm/mach-omap2/dpll3xxx.c b/arch/arm/mach-omap2/dpll3xxx.c index 0a02aab5..bae123e 100644 --- a/arch/arm/mach-omap2/dpll3xxx.c +++ b/arch/arm/mach-omap2/dpll3xxx.c @@ -450,6 +450,76 @@ void omap3_noncore_dpll_disable(struct clk_hw *hw) clkdm_clk_disable(clk->clkdm, hw->clk); }
+/* Non-CORE DPLL set parent code */ + +/** + * omap3_noncore_dpll_set_parent - set non-core DPLL input + * @hw: hardware object for this clock/dpll + * @index: parent to switch to in the array of possible parents + * + * Sets the input to the DPLL to either the reference clock or bypass + * clock. Returns error code upon failure or 0 upon success. + */ +int omap3_noncore_dpll_set_parent(struct clk_hw *hw, u8 index) +{ + struct clk_hw_omap *clk = to_clk_hw_omap(hw); + u16 freqsel = 0; + struct dpll_data *dd; + int ret; + + if (!hw) + return -EINVAL; + + dd = clk->dpll_data; + if (!dd) + return -EINVAL; + + clk_prepare(dd->clk_bypass); + clk_enable(dd->clk_bypass); + clk_prepare(dd->clk_ref); + clk_enable(dd->clk_ref); + + /* FIXME hard coded magic numbers are gross */ + switch (index) { + /* dpll input is the reference clock */ + case 0: + if (dd->last_rounded_rate == 0) + return -EINVAL; + + /* No freqsel on OMAP4 and OMAP3630 */ + if (!cpu_is_omap44xx() && !cpu_is_omap3630()) { + freqsel = _omap3_dpll_compute_freqsel(clk, + dd->last_rounded_n); + WARN_ON(!freqsel); + } + + pr_debug("%s: %s: set rate: locking rate to %lu.\n", + __func__, __clk_get_name(hw->clk), dd->last_rounded_rate); + + ret = omap3_noncore_dpll_program(clk, freqsel); + break; + + /* dpll input is the bypass clock */ + case 1: + pr_debug("%s: %s: set rate: entering bypass.\n", + __func__, __clk_get_name(hw->clk)); + + ret = _omap3_noncore_dpll_bypass(clk); + break; + + default: + pr_warn("%s: %s: set parent: invalid parent\n", + __func__, __clk_get_name(hw->clk)); + return -EINVAL; + } + + clk_disable(dd->clk_ref); + clk_unprepare(dd->clk_ref); + clk_disable(dd->clk_bypass); + clk_unprepare(dd->clk_bypass); + + return 0; +}
/* Non-CORE DPLL rate set code */
@@ -468,7 +538,6 @@ int omap3_noncore_dpll_set_rate(struct clk_hw *hw, unsigned long rate, unsigned long parent_rate) { struct clk_hw_omap *clk = to_clk_hw_omap(hw); - struct clk *new_parent = NULL; u16 freqsel = 0; struct dpll_data *dd; int ret; @@ -480,22 +549,18 @@ int omap3_noncore_dpll_set_rate(struct clk_hw *hw, unsigned long rate, if (!dd) return -EINVAL;
- __clk_prepare(dd->clk_bypass); + clk_prepare(dd->clk_bypass); clk_enable(dd->clk_bypass); - __clk_prepare(dd->clk_ref); + clk_prepare(dd->clk_ref); clk_enable(dd->clk_ref);
- if (__clk_get_rate(dd->clk_bypass) == rate && + /* FIXME below block should call clk_set_parent */ + if (clk_get_rate(dd->clk_bypass) == rate && (dd->modes & (1 << DPLL_LOW_POWER_BYPASS))) { - pr_debug("%s: %s: set rate: entering bypass.\n", - __func__, __clk_get_name(hw->clk)); - - ret = _omap3_noncore_dpll_bypass(clk); - if (!ret) - new_parent = dd->clk_bypass; + clk_set_parent(hw->clk, dd->clk_bypass); } else { if (dd->last_rounded_rate != rate) - rate = __clk_round_rate(hw->clk, rate); + rate = clk_round_rate(hw->clk, rate);
if (dd->last_rounded_rate == 0) return -EINVAL; @@ -510,24 +575,16 @@ int omap3_noncore_dpll_set_rate(struct clk_hw *hw, unsigned long rate, pr_debug("%s: %s: set rate: locking rate to %lu.\n", __func__, __clk_get_name(hw->clk), rate);
- ret = omap3_noncore_dpll_program(clk, freqsel); - if (!ret) - new_parent = dd->clk_ref; + if (clk_get_parent(hw->clk) == dd->clk_bypass) + clk_set_parent(hw->clk, dd->clk_ref); + else + ret = omap3_noncore_dpll_program(clk, freqsel); } - /* - * FIXME - this is all wrong. common code handles reparenting and - * migrating prepare/enable counts. dplls should be a multiplexer - * clock and this should be a set_parent operation so that all of that - * stuff is inherited for free - */ - - if (!ret) - __clk_reparent(hw->clk, new_parent);
clk_disable(dd->clk_ref); - __clk_unprepare(dd->clk_ref); + clk_unprepare(dd->clk_ref); clk_disable(dd->clk_bypass); - __clk_unprepare(dd->clk_bypass); + clk_unprepare(dd->clk_bypass);
return 0; }
From: Mike Turquette mturquette@ti.com
The following is another silly patch which was done to test calling clk_set_parent from within a call to clk_set_rate. It may make your board burst into flames or otherwise void various warrantees.
This patch introduces a 400MHz OPP for the MPU, which happens to correspond to the bypass clk rate on the 4430 Panda board and 4460 Panda ES board, both using a 38.4MHz SYS_CLK oscillator rate.
One may test this by using the cpufreq-userspace governor and executing: echo 400000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
On Panda ES validation can be done via: $ find /debug/clk/ -iname "dpll_mpu_ck" /debug/clk/virt_38400000_ck/sys_clkin_ck/dpll_mpu_ck $ echo 400000 > scaling_setspeed $ find /debug/clk/ -iname "dpll_mpu_ck" /debug/clk/virt_38400000_ck/sys_clkin_ck/dpll_core_ck/dpll_core_x2_ck/dpll_core_m5x2_ck/div_mpu_hs_clk/dpll_mpu_ck $ cat /proc/cpuinfo ... BogoMIPS : 794.26 ... $ cat /sys/class/regulator/regulator.3/microvolts 1200000
Not-signed-off-by: Mike Turquette mturquette@ti.com --- arch/arm/mach-omap2/opp4xxx_data.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
diff --git a/arch/arm/mach-omap2/opp4xxx_data.c b/arch/arm/mach-omap2/opp4xxx_data.c index d470b72..c7bccf7 100644 --- a/arch/arm/mach-omap2/opp4xxx_data.c +++ b/arch/arm/mach-omap2/opp4xxx_data.c @@ -67,6 +67,15 @@ struct omap_volt_data omap443x_vdd_core_volt_data[] = { static struct omap_opp_def __initdata omap443x_opp_def_list[] = { /* MPU OPP1 - OPP50 */ OPP_INITIALIZER("mpu", true, 300000000, OMAP4430_VDD_MPU_OPP50_UV), + /* + * MPU OPP1.5 - 400MHz - completely FAKE - not endorsed by TI + * + * DPLL_MPU is in Low Power Bypass driven by DPLL_CORE. After + * transitioning to this OPP you can see the migration in debugfs: + * /d/clk/virt_38400000_ck/sys_clkin_ck/dpll_mpu_ck to + * /d/.../dpll_core_ck/dpll_core_x2_ck/dpll_core_m5x2_ck/div_mpu_hs_clk + */ + OPP_INITIALIZER("mpu", true, 400000000, 1100000), /* MPU OPP2 - OPP100 */ OPP_INITIALIZER("mpu", true, 600000000, OMAP4430_VDD_MPU_OPP100_UV), /* MPU OPP3 - OPP-Turbo */ @@ -126,6 +135,15 @@ struct omap_volt_data omap446x_vdd_core_volt_data[] = { static struct omap_opp_def __initdata omap446x_opp_def_list[] = { /* MPU OPP1 - OPP50 */ OPP_INITIALIZER("mpu", true, 350000000, OMAP4460_VDD_MPU_OPP50_UV), + /* + * MPU OPP1.5 - 400MHz - completely FAKE - not endorsed by TI + * + * DPLL_MPU is in Low Power Bypass driven by DPLL_CORE. After + * transitioning to this OPP you can see the migration in debugfs: + * /d/clk/virt_38400000_ck/sys_clkin_ck/dpll_mpu_ck to + * /d/.../dpll_core_ck/dpll_core_x2_ck/dpll_core_m5x2_ck/div_mpu_hs_clk + */ + OPP_INITIALIZER("mpu", true, 400000000, 1100000), /* MPU OPP2 - OPP100 */ OPP_INITIALIZER("mpu", true, 700000000, OMAP4460_VDD_MPU_OPP100_UV), /* MPU OPP3 - OPP-Turbo */