Hi,
The main feedback I got for the V3 series came from Kevin, who suggested that we should reuse OPP tables for genpd devices as well, instead of creating a new table type. And that's what this version is trying to do.
Some platforms have the capability to configure the performance state of their power domains. The process of configuring the performance state is pretty much platform dependent and we may need to work with a wide range of configurables. For some platforms, like Qcom, it can be a positive integer value alone, while in other cases it can be voltage levels, etc.
The power-domain framework until now was only designed for the idle state management of the device and this needs to change in order to reuse the power-domain framework for active state management of the devices.
This series adapts the genpd and OPP frameworks to allow OPP tables to be used for the genpd devices as well.
The first 2 patches update the DT bindings of the power-domains and OPP tables. And the other 7 patches implement the details in QoS, genpd and OPP frameworks.
This is tested currently by hacking the kernel a bit with virtual power-domains for the dual A15 exynos platform. The earlier version of patches was also tested by Rajendra Nayak (Qcom) on *real* Qualcomm hardware for which this work is getting done. And so this version should work as well.
Here is sample DT and C code we need to write for platforms:
DT: ---
/ { domain_opp_table: opp_table0 { compatible = "operating-points-v2";
opp@1 { domain-performance-state = <1>; opp-microvolt = <975000 970000 985000>; }; opp@2 { domain-performance-state = <2>; opp-microvolt = <1075000 1000000 1085000>; }; };
foo_domain: power-controller@12340000 { compatible = "foo,power-controller"; reg = <0x12340000 0x1000>; #power-domain-cells = <0>; operating-points-v2 = <&domain_opp_table>; }
cpu0_opp_table: opp_table1 { compatible = "operating-points-v2"; opp-shared;
opp@1000000000 { opp-hz = /bits/ 64 <1000000000>; domain-performance-state = <1>; }; opp@1100000000 { opp-hz = /bits/ 64 <1100000000>; domain-performance-state = <2>; }; opp@1200000000 { opp-hz = /bits/ 64 <1200000000>; domain-performance-state = <2>; }; };
cpus { #address-cells = <1>; #size-cells = <0>;
cpu@0 { compatible = "arm,cortex-a9"; reg = <0>; clocks = <&clk_controller 0>; clock-names = "cpu"; operating-points-v2 = <&cpu0_opp_table>; power-domains = <&foo_domain>; }; }; };
Driver code: ------------
static int pd_performance(struct generic_pm_domain *domain, unsigned int state) { struct dev_pm_opp *opp;
opp = dev_pm_opp_find_dps(&domain->dev, state, true);
/* Use OPP and state in platform specific way */
return 0; }
static const struct of_device_id pm_domain_of_match[] __initconst = { { .compatible = "foo,genpd", }, { }, };
static int __init genpd_test_init(void) { struct device *dev = get_cpu_device(0); struct device_node *np; const struct of_device_id *match; int n; int ret;
for_each_matching_node_and_match(np, pm_domain_of_match, &match) { pd.name = kstrdup_const(strrchr(np->full_name, '/') + 1, GFP_KERNEL); if (!pd.name) { of_node_put(np); return -ENOMEM; }
pd.set_performance_state = pd_performance;
pm_genpd_init(&pd, NULL, false); of_genpd_add_provider_simple(np, &pd); }
ret = dev_pm_domain_attach(dev, false);
return ret; }
Pushed here as well:
https://git.linaro.org/people/viresh.kumar/linux.git/log/?h=opp/genpd-perfor...
V3->V4: - Use OPP table for genpd devices as well. - Add struct device to genpd, in order to reuse OPP infrastructure. - Based over: https://marc.info/?l=linux-kernel&m=148972988002317&w=2 - Fixed examples in DT document to have voltage in target,min,max order.
V2->V3: - Based over latest pm/linux-next - Bindings and code are merged together - Lots of updates in bindings - the performance-states node is present within the power-domain now, instead of its phandle. - performance-level property is replaced by "reg". - domain-performance-state property of the consumers contain an integer value now instead of phandle. - Lots of updates to the code as well - Patch "PM / QOS: Add default case to the switch" is merged with other patches and the code is changed a bit as well. - Don't pass 'type' to dev_pm_qos_add_notifier(), rather handle all notifiers with a single list. A new patch is added for that. - The OPP framework patch can be applied now and has proper SoB from me. - Dropped "PM / domain: Save/restore performance state at runtime suspend/resume". - Drop all WARN(). - Tested-by Rajendra nayak.
V1->V2: - Based over latest pm/linux-next - It is mostly a resend of what is sent earlier as this series hasn't got any reviews so far and Rafael suggested that its better I resend it. - Only the 4/6 patch got an update, which was shared earlier as reply to V1 as well. It has got several fixes for taking care of power domain hierarchy, etc.
-- viresh
Viresh Kumar (9): PM / OPP: Allow OPP table to be used for power-domains PM / Domains: Use OPP tables for power-domains PM / QOS: Keep common notifier list for genpd constraints PM / QOS: Add DEV_PM_QOS_PERFORMANCE request PM / OPP: Add support to parse OPP table for power-domains PM / OPP: Add dev_pm_opp_find_dps() helper PM / domain: Register for PM QOS performance notifier PM / Domain: Add struct device to genpd PM / Domain: Add support to parse domain's OPP table
Documentation/devicetree/bindings/opp/opp.txt | 73 ++++++- .../devicetree/bindings/power/power_domain.txt | 42 ++++ Documentation/power/pm_qos_interface.txt | 2 +- drivers/base/power/domain.c | 183 ++++++++++++++-- drivers/base/power/opp/core.c | 229 +++++++++++++++++++-- drivers/base/power/opp/debugfs.c | 9 +- drivers/base/power/opp/of.c | 80 ++++++- drivers/base/power/opp/opp.h | 14 ++ drivers/base/power/qos.c | 36 +++- include/linux/pm_domain.h | 6 + include/linux/pm_opp.h | 8 + include/linux/pm_qos.h | 17 ++ kernel/power/qos.c | 2 +- 13 files changed, 646 insertions(+), 55 deletions(-)
Power-domains need to express their active states in DT and what's better than OPP table for that.
This patch allows power-domains to reuse OPP tables to express their active states. The "opp-hz" property isn't a required property anymore as power-domains may not always use them.
Add a new property "domain-performance-state", which will contain positive integer values to represent performance levels of the power-domains as described in this patch.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- Documentation/devicetree/bindings/opp/opp.txt | 73 ++++++++++++++++++++++++++- 1 file changed, 71 insertions(+), 2 deletions(-)
diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt index 63725498bd20..d0b95c9e1011 100644 --- a/Documentation/devicetree/bindings/opp/opp.txt +++ b/Documentation/devicetree/bindings/opp/opp.txt @@ -76,10 +76,9 @@ This describes the OPPs belonging to a device. This node can have following This defines voltage-current-frequency combinations along with other related properties.
-Required properties: +Optional properties: - opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
-Optional properties: - opp-microvolt: voltage in micro Volts.
A single regulator's voltage is specified with an array of size one or three. @@ -154,6 +153,19 @@ properties.
- status: Marks the node enabled/disabled.
+- domain-performance-state: A positive integer value representing the minimum + power-domain performance level required by the device for the OPP node. The + integer value '0' represents the lowest performance level and the higher + values represent higher performance levels. When present in the OPP table of a + power-domain, it represents the performance level of the domain. When present + in the OPP table of a normal device, it represents the performance level of + the parent power-domain. The OPP table can contain the + "domain-performance-state" property, only if the device node contains the + "power-domains" or "#power-domain-cells" property. The OPP nodes aren't + allowed to contain the "domain-performance-state" property partially, i.e. + Either all OPP nodes in the OPP table have the "domain-performance-state" + property or none of them have it. + Example 1: Single cluster Dual-core ARM cortex A9, switch DVFS states together.
/ { @@ -528,3 +540,60 @@ Example 5: opp-supported-hw }; }; }; + +Example 7: domain-Performance-state: +(example: For 1GHz require domain state 1 and for 1.1 & 1.2 GHz require state 2) + +/ { + domain_opp_table: opp_table0 { + compatible = "operating-points-v2"; + + opp@1 { + domain-performance-state = <1>; + opp-microvolt = <975000 970000 985000>; + }; + opp@2 { + domain-performance-state = <2>; + opp-microvolt = <1075000 1000000 1085000>; + }; + }; + + foo_domain: power-controller@12340000 { + compatible = "foo,power-controller"; + reg = <0x12340000 0x1000>; + #power-domain-cells = <0>; + operating-points-v2 = <&domain_opp_table>; + } + + cpu0_opp_table: opp_table1 { + compatible = "operating-points-v2"; + opp-shared; + + opp@1000000000 { + opp-hz = /bits/ 64 <1000000000>; + domain-performance-state = <1>; + }; + opp@1100000000 { + opp-hz = /bits/ 64 <1100000000>; + domain-performance-state = <2>; + }; + opp@1200000000 { + opp-hz = /bits/ 64 <1200000000>; + domain-performance-state = <2>; + }; + }; + + cpus { + #address-cells = <1>; + #size-cells = <0>; + + cpu@0 { + compatible = "arm,cortex-a9"; + reg = <0>; + clocks = <&clk_controller 0>; + clock-names = "cpu"; + operating-points-v2 = <&cpu0_opp_table>; + power-domains = <&foo_domain>; + }; + }; +};
On Mon, Mar 20, 2017 at 03:02:13PM +0530, Viresh Kumar wrote:
Power-domains need to express their active states in DT and what's better than OPP table for that.
This patch allows power-domains to reuse OPP tables to express their active states. The "opp-hz" property isn't a required property anymore as power-domains may not always use them.
Then maybe you shouldn't be trying to make OPP table work here. At that point you just need a table of voltage(s) per performance state?
Add a new property "domain-performance-state", which will contain positive integer values to represent performance levels of the power-domains as described in this patch.
Why not reference the OPP entries from the domain:
performance-states = <&opp1>, <&opp2>;
Just thinking out loud, not saying that is what you should do. The continual evolution of power (management) domain, idle state, and OPP bindings is getting tiring.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
Documentation/devicetree/bindings/opp/opp.txt | 73 ++++++++++++++++++++++++++- 1 file changed, 71 insertions(+), 2 deletions(-)
On 24-03-17, 10:44, Rob Herring wrote:
On Mon, Mar 20, 2017 at 03:02:13PM +0530, Viresh Kumar wrote:
Power-domains need to express their active states in DT and what's better than OPP table for that.
This patch allows power-domains to reuse OPP tables to express their active states. The "opp-hz" property isn't a required property anymore as power-domains may not always use them.
Then maybe you shouldn't be trying to make OPP table work here. At that point you just need a table of voltage(s) per performance state?
Because that's what Kevin strongly recommended in the previous versions.
@Kevin: Would you like to reply here ?
Add a new property "domain-performance-state", which will contain positive integer values to represent performance levels of the power-domains as described in this patch.
Why not reference the OPP entries from the domain:
performance-states = <&opp1>, <&opp2>;
Because that would require additional code in the OPP core to parse these then. Right now it is quite straight forward with the bindings I presented.
Just thinking out loud, not saying that is what you should do. The continual evolution of power (management) domain, idle state, and OPP bindings is getting tiring.
I agree :)
Fixing Kevin's email id :(
On 10-04-17, 14:55, Viresh Kumar wrote:
On 24-03-17, 10:44, Rob Herring wrote:
On Mon, Mar 20, 2017 at 03:02:13PM +0530, Viresh Kumar wrote:
Power-domains need to express their active states in DT and what's better than OPP table for that.
This patch allows power-domains to reuse OPP tables to express their active states. The "opp-hz" property isn't a required property anymore as power-domains may not always use them.
Then maybe you shouldn't be trying to make OPP table work here. At that point you just need a table of voltage(s) per performance state?
Because that's what Kevin strongly recommended in the previous versions.
@Kevin: Would you like to reply here ?
Add a new property "domain-performance-state", which will contain positive integer values to represent performance levels of the power-domains as described in this patch.
Why not reference the OPP entries from the domain:
performance-states = <&opp1>, <&opp2>;
Because that would require additional code in the OPP core to parse these then. Right now it is quite straight forward with the bindings I presented.
Just thinking out loud, not saying that is what you should do. The continual evolution of power (management) domain, idle state, and OPP bindings is getting tiring.
I agree :)
On 20/03/17 09:32, Viresh Kumar wrote:
Power-domains need to express their active states in DT and what's better than OPP table for that.
This patch allows power-domains to reuse OPP tables to express their active states. The "opp-hz" property isn't a required property anymore as power-domains may not always use them.
Add a new property "domain-performance-state", which will contain positive integer values to represent performance levels of the power-domains as described in this patch.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
Documentation/devicetree/bindings/opp/opp.txt | 73 ++++++++++++++++++++++++++- 1 file changed, 71 insertions(+), 2 deletions(-)
diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt index 63725498bd20..d0b95c9e1011 100644 --- a/Documentation/devicetree/bindings/opp/opp.txt +++ b/Documentation/devicetree/bindings/opp/opp.txt @@ -76,10 +76,9 @@ This describes the OPPs belonging to a device. This node can have following This defines voltage-current-frequency combinations along with other related properties. -Required properties: +Optional properties:
- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
-Optional properties:
- opp-microvolt: voltage in micro Volts.
A single regulator's voltage is specified with an array of size one or three. @@ -154,6 +153,19 @@ properties.
- status: Marks the node enabled/disabled.
+- domain-performance-state: A positive integer value representing the minimum
- power-domain performance level required by the device for the OPP node. The
So the above definition is when this field in in the device node rather than the OPP table entry, right ? For simplicity why not have the properties named slightly different or just use phandle to an entry in the device node for this purpose.
- The integer value '0' represents the lowest performance level and the higher
- values represent higher performance levels.
needs to be changed as OPP table entry.
When present in the OPP table of a
- power-domain, it represents the performance level of the domain. When present
again "performance level of the domain corresponding to that OPP entry" on something similar
- in the OPP table of a normal device, it represents the performance level of
what do you mean by normal device ? needs description as that's something new introduced here.
- the parent power-domain. The OPP table can contain the
- "domain-performance-state" property, only if the device node contains the
- "power-domains" or "#power-domain-cells" property.
Why such a restriction ?
The OPP nodes aren't
- allowed to contain the "domain-performance-state" property partially, i.e.
- Either all OPP nodes in the OPP table have the "domain-performance-state"
- property or none of them have it.
Example 1: Single cluster Dual-core ARM cortex A9, switch DVFS states together. / { @@ -528,3 +540,60 @@ Example 5: opp-supported-hw }; }; };
+Example 7: domain-Performance-state: +(example: For 1GHz require domain state 1 and for 1.1 & 1.2 GHz require state 2)
+/ {
- domain_opp_table: opp_table0 {
compatible = "operating-points-v2";
opp@1 {
domain-performance-state = <1>;
opp-microvolt = <975000 970000 985000>;
};
opp@2 {
domain-performance-state = <2>;
opp-microvolt = <1075000 1000000 1085000>;
};
- };
- foo_domain: power-controller@12340000 {
compatible = "foo,power-controller";
reg = <0x12340000 0x1000>;
#power-domain-cells = <0>;
operating-points-v2 = <&domain_opp_table>;
How does it scale with power domain providers with multiple power domain ?
- }
- cpu0_opp_table: opp_table1 {
compatible = "operating-points-v2";
opp-shared;
opp@1000000000 {
opp-hz = /bits/ 64 <1000000000>;
domain-performance-state = <1>;
};
opp@1100000000 {
opp-hz = /bits/ 64 <1100000000>;
domain-performance-state = <2>;
};
opp@1200000000 {
opp-hz = /bits/ 64 <1200000000>;
domain-performance-state = <2>;
};
- };
- cpus {
#address-cells = <1>;
#size-cells = <0>;
cpu@0 {
compatible = "arm,cortex-a9";
reg = <0>;
clocks = <&clk_controller 0>;
clock-names = "cpu";
operating-points-v2 = <&cpu0_opp_table>;
Do we ignore operating-points-v2 above as this device/cpu node contains power domain which has operating-points-v2 property ? In other words how do they correlate ?
power-domains = <&foo_domain>;
};
- };
+};
On 12-04-17, 17:49, Sudeep Holla wrote:
On 20/03/17 09:32, Viresh Kumar wrote:
diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt index 63725498bd20..d0b95c9e1011 100644 --- a/Documentation/devicetree/bindings/opp/opp.txt +++ b/Documentation/devicetree/bindings/opp/opp.txt @@ -76,10 +76,9 @@ This describes the OPPs belonging to a device. This node can have following This defines voltage-current-frequency combinations along with other related properties. -Required properties: +Optional properties:
- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
-Optional properties:
- opp-microvolt: voltage in micro Volts.
A single regulator's voltage is specified with an array of size one or three. @@ -154,6 +153,19 @@ properties.
- status: Marks the node enabled/disabled.
+- domain-performance-state: A positive integer value representing the minimum
- power-domain performance level required by the device for the OPP node. The
So the above definition is when this field in in the device node rather than the OPP table entry, right ?
No. We are updating the opp.txt file here and so it is not about the device node. The OPP node entries will contain this field for two cases: - The OPP table belongs to a power domain - The OPP table belongs to a device whose power domain supports performance-states.
For simplicity why not have the properties named slightly different or just use phandle to an entry in the device node for this purpose.
We really need a value here. For example, in case where the OPP table defines the states of the power-domain itself, we don't have any phandles to point to.
- The integer value '0' represents the lowest performance level and the higher
- values represent higher performance levels.
needs to be changed as OPP table entry.
Not sure I understood what change you are looking for :(
When present in the OPP table of a
- power-domain, it represents the performance level of the domain. When present
again "performance level of the domain corresponding to that OPP entry" on something similar
Ok.
- in the OPP table of a normal device, it represents the performance level of
what do you mean by normal device ? needs description as that's something new introduced here.
It should be non-power-domain node.
- the parent power-domain. The OPP table can contain the
- "domain-performance-state" property, only if the device node contains the
- "power-domains" or "#power-domain-cells" property.
Why such a restriction ?
Why would we use it for non-power-domain cases? That's not what we are looking for..
The OPP nodes aren't
- allowed to contain the "domain-performance-state" property partially, i.e.
- Either all OPP nodes in the OPP table have the "domain-performance-state"
- property or none of them have it.
Example 1: Single cluster Dual-core ARM cortex A9, switch DVFS states together. / { @@ -528,3 +540,60 @@ Example 5: opp-supported-hw }; }; };
+Example 7: domain-Performance-state: +(example: For 1GHz require domain state 1 and for 1.1 & 1.2 GHz require state 2)
+/ {
- domain_opp_table: opp_table0 {
compatible = "operating-points-v2";
opp@1 {
domain-performance-state = <1>;
opp-microvolt = <975000 970000 985000>;
};
opp@2 {
domain-performance-state = <2>;
opp-microvolt = <1075000 1000000 1085000>;
};
- };
- foo_domain: power-controller@12340000 {
compatible = "foo,power-controller";
reg = <0x12340000 0x1000>;
#power-domain-cells = <0>;
operating-points-v2 = <&domain_opp_table>;
How does it scale with power domain providers with multiple power domain ?
Devices can't have multiple power domains today. Will see this when that support is added.
Note that only the power domains can have multiple parent power domains today.
- }
- cpu0_opp_table: opp_table1 {
compatible = "operating-points-v2";
opp-shared;
opp@1000000000 {
opp-hz = /bits/ 64 <1000000000>;
domain-performance-state = <1>;
};
opp@1100000000 {
opp-hz = /bits/ 64 <1100000000>;
domain-performance-state = <2>;
};
opp@1200000000 {
opp-hz = /bits/ 64 <1200000000>;
domain-performance-state = <2>;
};
- };
- cpus {
#address-cells = <1>;
#size-cells = <0>;
cpu@0 {
compatible = "arm,cortex-a9";
reg = <0>;
clocks = <&clk_controller 0>;
clock-names = "cpu";
operating-points-v2 = <&cpu0_opp_table>;
Do we ignore operating-points-v2 above as this device/cpu node contains power domain which has operating-points-v2 property ? In other words how do they correlate ?
Devices and their power domains can both have their performance states. Just that to get the device in a particular state, we may need to get its power domain to a particular state first.
On 13/04/17 06:37, Viresh Kumar wrote:
On 12-04-17, 17:49, Sudeep Holla wrote:
On 20/03/17 09:32, Viresh Kumar wrote:
diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt index 63725498bd20..d0b95c9e1011 100644 --- a/Documentation/devicetree/bindings/opp/opp.txt +++ b/Documentation/devicetree/bindings/opp/opp.txt @@ -76,10 +76,9 @@ This describes the OPPs belonging to a device. This node can have following This defines voltage-current-frequency combinations along with other related properties. -Required properties: +Optional properties:
- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
-Optional properties:
- opp-microvolt: voltage in micro Volts.
A single regulator's voltage is specified with an array of size one or three. @@ -154,6 +153,19 @@ properties.
- status: Marks the node enabled/disabled.
+- domain-performance-state: A positive integer value representing the minimum
- power-domain performance level required by the device for the OPP node. The
So the above definition is when this field in in the device node rather than the OPP table entry, right ?
No. We are updating the opp.txt file here and so it is not about the device node. The OPP node entries will contain this field for two cases:
- The OPP table belongs to a power domain
- The OPP table belongs to a device whose power domain supports performance-states.
Understood.
For simplicity why not have the properties named slightly different or just use phandle to an entry in the device node for this purpose.
We really need a value here. For example, in case where the OPP table defines the states of the power-domain itself, we don't have any phandles to point to.
OK
- The integer value '0' represents the lowest performance level and the higher
- values represent higher performance levels.
needs to be changed as OPP table entry.
Not sure I understood what change you are looking for :(
Looks like I commented the same thing below, just redundant comment here. Sorry about that.
When present in the OPP table of a
- power-domain, it represents the performance level of the domain. When present
again "performance level of the domain corresponding to that OPP entry" on something similar
Ok.
- in the OPP table of a normal device, it represents the performance level of
what do you mean by normal device ? needs description as that's something new introduced here.
It should be non-power-domain node.
OK
- the parent power-domain. The OPP table can contain the
- "domain-performance-state" property, only if the device node contains the
- "power-domains" or "#power-domain-cells" property.
Why such a restriction ?
Why would we use it for non-power-domain cases? That's not what we are looking for..
OK I was imagining that this would abstract clocks and regulators and hence thinking of other possibilities.
The OPP nodes aren't
- allowed to contain the "domain-performance-state" property partially, i.e.
- Either all OPP nodes in the OPP table have the "domain-performance-state"
- property or none of them have it.
Example 1: Single cluster Dual-core ARM cortex A9, switch DVFS states together. / { @@ -528,3 +540,60 @@ Example 5: opp-supported-hw }; }; };
+Example 7: domain-Performance-state: +(example: For 1GHz require domain state 1 and for 1.1 & 1.2 GHz require state 2)
+/ {
- domain_opp_table: opp_table0 {
compatible = "operating-points-v2";
opp@1 {
domain-performance-state = <1>;
opp-microvolt = <975000 970000 985000>;
};
opp@2 {
domain-performance-state = <2>;
opp-microvolt = <1075000 1000000 1085000>;
};
- };
- foo_domain: power-controller@12340000 {
compatible = "foo,power-controller";
reg = <0x12340000 0x1000>;
#power-domain-cells = <0>;
operating-points-v2 = <&domain_opp_table>;
How does it scale with power domain providers with multiple power domain ?
Devices can't have multiple power domains today. Will see this when that support is added.
Agreed and I see some working already happening on that, so yes we can add that later.
What I was referring is about power domain provider with multiple power domains(simply #power-domain-cells=<1> case as explained in the power-domain specification.
Note that only the power domains can have multiple parent power domains today.
- }
- cpu0_opp_table: opp_table1 {
compatible = "operating-points-v2";
opp-shared;
opp@1000000000 {
opp-hz = /bits/ 64 <1000000000>;
domain-performance-state = <1>;
};
opp@1100000000 {
opp-hz = /bits/ 64 <1100000000>;
domain-performance-state = <2>;
};
opp@1200000000 {
opp-hz = /bits/ 64 <1200000000>;
domain-performance-state = <2>;
};
- };
- cpus {
#address-cells = <1>;
#size-cells = <0>;
cpu@0 {
compatible = "arm,cortex-a9";
reg = <0>;
clocks = <&clk_controller 0>;
clock-names = "cpu";
operating-points-v2 = <&cpu0_opp_table>;
Do we ignore operating-points-v2 above as this device/cpu node contains power domain which has operating-points-v2 property ? In other words how do they correlate ?
Devices and their power domains can both have their performance states. Just that to get the device in a particular state, we may need to get its power domain to a particular state first.
Yes. To simplify what not we just have power-domain for a device and change state of that domain to change the performance of that device. Then put this in the hierarchy. Some thing similar to what we already have with new domain-idle states. In that way, we can move any performance control to the domain and abstract the clocks and regulators from the devices as the first step and from the OSPM view if there's firmware support.
If we are looking this power-domains with performance as just some *advanced regulators*, I don't like the complexity added.
I am also looking at how does this align with other specifications like ACPI and SCMI that are trying to solve similar issues.
On 13-04-17, 14:42, Sudeep Holla wrote:
What I was referring is about power domain provider with multiple power domains(simply #power-domain-cells=<1> case as explained in the power-domain specification.
I am not sure if we should be looking to target such a situation for now, as that would be like this:
Device controlled by Domain A. Domain A itself is controlled by Domain B and Domain C.
Though we will end up converting the domain-performance-state property to an array if that is required in near future.
Yes. To simplify what not we just have power-domain for a device and change state of that domain to change the performance of that device.
Consider this case to understand what I have in Mind.
The power domain have its states as A, B, C, D. There can be multiple devices regulated by that domain and one of the devices have its power states as: A1, A2, A3, B1, B2, B3, C1, C2, C3, D1, D2, D3 and all these states have different frequency/voltages.
IOW, the devices can have regulators as well and may want to fine tune within the domain performance-state.
Then put this in the hierarchy. Some thing similar to what we already have with new domain-idle states. In that way, we can move any performance control to the domain and abstract the clocks and regulators from the devices as the first step and from the OSPM view if there's firmware support.
If we are looking this power-domains with performance as just some *advanced regulators*, I don't like the complexity added.
In the particular case I am trying to solve (Qcom), we have some sort of regulators which are only programmed by a M3 core. The M3 core needs integer numbers representing state we want the domain to be in and it will put the regulators (or whatever) in a particular state.
On 17/04/17 06:27, Viresh Kumar wrote:
On 13-04-17, 14:42, Sudeep Holla wrote:
What I was referring is about power domain provider with multiple power domains(simply #power-domain-cells=<1> case as explained in the power-domain specification.
I am not sure if we should be looking to target such a situation for now, as that would be like this:
Device controlled by Domain A. Domain A itself is controlled by Domain B and Domain C.
No, may be I was not so clear. I am just referring a power controller that provides say 3 different power domains and are indexed 0 - 2. The consumer just passes the index along with the phandle for the power domain info.
Though we will end up converting the domain-performance-state property to an array if that is required in near future.
OK, better to document that so that we know how to extend it. We have #power-domain-cells=<1> on Juno with SCPI.
Yes. To simplify what not we just have power-domain for a device and change state of that domain to change the performance of that device.
Consider this case to understand what I have in Mind.
The power domain have its states as A, B, C, D. There can be multiple devices regulated by that domain and one of the devices have its power states as: A1, A2, A3, B1, B2, B3, C1, C2, C3, D1, D2, D3 and all these states have different frequency/voltages.
IOW, the devices can have regulators as well and may want to fine tune within the domain performance-state.
Understood. I would incline towards reusing regulators we that's what is changed behind the scene. Calling this operating performance point is misleading and doesn't align well with existing specs/features.
Then put this in the hierarchy. Some thing similar to what we already have with new domain-idle states. In that way, we can move any performance control to the domain and abstract the clocks and regulators from the devices as the first step and from the OSPM view if there's firmware support.
If we are looking this power-domains with performance as just some *advanced regulators*, I don't like the complexity added.
In the particular case I am trying to solve (Qcom), we have some sort of regulators which are only programmed by a M3 core. The M3 core needs integer numbers representing state we want the domain to be in and it will put the regulators (or whatever) in a particular state.
Understood. We have exactly same thing with SCPI but it controls both frequency and voltage referred as operating points. In general, this OPP terminology is used in SCPI/ACPI/SCMI specifications as both frequency and voltage control. I am bit worried that this binding might introduce confusions on the definitions. But it can be reworded/renamed easily if required.
On 18-04-17, 17:01, Sudeep Holla wrote:
No, may be I was not so clear. I am just referring a power controller that provides say 3 different power domains and are indexed 0 - 2. The consumer just passes the index along with the phandle for the power domain info.
Ahh, I got you now. Will take care of it in next version.
On 18-04-17, 17:01, Sudeep Holla wrote:
Understood. I would incline towards reusing regulators we that's what is
It can be just a regulator, but it can be anything else as well. That entity may have its own clock/volt/current tunables, etc.
changed behind the scene. Calling this operating performance point is misleading and doesn't align well with existing specs/features.
Yeah, but there are no voltage levels available here and that doesn't fit as a regulator then.
Understood. We have exactly same thing with SCPI but it controls both frequency and voltage referred as operating points. In general, this OPP terminology is used in SCPI/ACPI/SCMI specifications as both frequency and voltage control. I am bit worried that this binding might introduce confusions on the definitions. But it can be reworded/renamed easily if required.
Yeah, so far we have been looking at OPPs as freq-voltage pairs ONLY and that is changing. I am not sure if it going in the wrong direction really. Without frequency also it is an operating point for the domain. Isn't it?
On 19/04/17 12:47, Viresh Kumar wrote:
On 18-04-17, 17:01, Sudeep Holla wrote:
Understood. I would incline towards reusing regulators we that's what is
It can be just a regulator, but it can be anything else as well. That entity may have its own clock/volt/current tunables, etc.
Agreed.
changed behind the scene. Calling this operating performance point is misleading and doesn't align well with existing specs/features.
Yeah, but there are no voltage levels available here and that doesn't fit as a regulator then.
We can't dismiss just based on that. We do have systems where performance index is mapped to clocks though it may not be 1:1 mapping. I am not disagreeing here, just trying to understand it better.
Understood. We have exactly same thing with SCPI but it controls both frequency and voltage referred as operating points. In general, this OPP terminology is used in SCPI/ACPI/SCMI specifications as both frequency and voltage control. I am bit worried that this binding might introduce confusions on the definitions. But it can be reworded/renamed easily if required.
Yeah, so far we have been looking at OPPs as freq-voltage pairs ONLY and that is changing. I am not sure if it going in the wrong direction really. Without frequency also it is an operating point for the domain. Isn't it?
Yes, I completely agree. I am not saying the direction is wrong. I am saying it's confusing and binding needs to be more clear.
On the contrary(playing devil's advocate here), we can treat all existing regulators alone as OPP then if you strip the voltages and treat it as abstract number. So if the firmware handles more than just regulators, I agree. At the same time, I would have preferred firmware to even abstract the frequency like ACPI CPPC. It would be good to get more information on what exactly that firmware handles.
I am just more cautious here since we are designing generic bindings and changing generic code, we need to understand what that firmware supports and how it may evolve(so that we can maintain DT compatibility)
I did a brief check and wanted to check if this is SMD/RPM regulators ?
On 19-04-17, 14:58, Sudeep Holla wrote:
On 19/04/17 12:47, Viresh Kumar wrote:
On 18-04-17, 17:01, Sudeep Holla wrote:
Understood. I would incline towards reusing regulators we that's what is
It can be just a regulator, but it can be anything else as well. That entity may have its own clock/volt/current tunables, etc.
changed behind the scene. Calling this operating performance point is misleading and doesn't align well with existing specs/features.
Yeah, but there are no voltage levels available here and that doesn't fit as a regulator then.
We can't dismiss just based on that. We do have systems where performance index is mapped to clocks though it may not be 1:1 mapping. I am not disagreeing here, just trying to understand it better.
@Stephen: Can you answer here please ?
Understood. We have exactly same thing with SCPI but it controls both frequency and voltage referred as operating points. In general, this OPP terminology is used in SCPI/ACPI/SCMI specifications as both frequency and voltage control. I am bit worried that this binding might introduce confusions on the definitions. But it can be reworded/renamed easily if required.
Yeah, so far we have been looking at OPPs as freq-voltage pairs ONLY and that is changing. I am not sure if it going in the wrong direction really. Without frequency also it is an operating point for the domain. Isn't it?
Yes, I completely agree. I am not saying the direction is wrong. I am saying it's confusing and binding needs to be more clear.
What exactly isn't clear? (Yeah, there had been lots of emails and I want to know what improvements are you looking for).
On the contrary(playing devil's advocate here), we can treat all existing regulators alone as OPP then if you strip the voltages and treat it as abstract number.
But then we are going to have lots of platform specific code which will program the actual hardware, etc. Which is all handled by the regulator framework. Also note that the regulator core selects the common voltage selected by all the children, while we want to select the highest performance point here.
Even if we have to configure both clock and voltage for the power domain using standard clk/regulator frameworks, OPP will work just fine as it will do that then. So, its not that we are bypassing the regulator framework here. It will be used if we have the voltages available for the power-domain's performance states.
So if the firmware handles more than just regulators, I agree.
I don't know the internals of that really.
At the same time, I would have preferred firmware to even abstract the frequency like ACPI CPPC.
Frequency isn't required to be configured for the cases I know, but it can be in future implementations.
It would be good to get more information on what exactly that firmware handles.
@Stephen ?
I am just more cautious here since we are designing generic bindings and changing generic code, we need to understand what that firmware supports and how it may evolve(so that we can maintain DT compatibility)
Sure, I am fine with more discussions on it :)
I did a brief check and wanted to check if this is SMD/RPM regulators ?
Yes, Qcom calls the external core as Resource and Power manager (RPM).
Viresh, Sudeep,
Sorry for jumping in late.
[...]
On the contrary(playing devil's advocate here), we can treat all existing regulators alone as OPP then if you strip the voltages and treat it as abstract number.
But then we are going to have lots of platform specific code which will program the actual hardware, etc. Which is all handled by the regulator framework. Also note that the regulator core selects the common voltage selected by all the children, while we want to select the highest performance point here.
If I understand correctly, Sudeep is not convinced that this is about PM domain regulator(s), right?
To me there is no doubt, these regulators is exactly the definition of PM domain regulators.
That said, long time ago we have decided PM domain regulator shall be modeled as exactly that. From DT point of view, this means the handle to the PM domain regulator belongs in the node of the PM domain controller - and not in each device's node of those belonging to the PM domain.
Isn't that what this discussion really boils down to? Or maybe I am not getting it.
Even if we have to configure both clock and voltage for the power domain using standard clk/regulator frameworks, OPP will work just fine as it will do that then. So, its not that we are bypassing the regulator framework here. It will be used if we have the voltages available for the power-domain's performance states.
So if the firmware handles more than just regulators, I agree.
I don't know the internals of that really.
At the same time, I would have preferred firmware to even abstract the frequency like ACPI CPPC.
Frequency isn't required to be configured for the cases I know, but it can be in future implementations.
To me using OPP tables makes sense as it gives us the flexibility that is needed. If I understand correct, that was also Kevin's point.
[...]
Kind regards Uffe
On 20-04-17, 10:23, Ulf Hansson wrote:
Viresh, Sudeep,
Sorry for jumping in late.
[...]
On the contrary(playing devil's advocate here), we can treat all existing regulators alone as OPP then if you strip the voltages and treat it as abstract number.
But then we are going to have lots of platform specific code which will program the actual hardware, etc. Which is all handled by the regulator framework. Also note that the regulator core selects the common voltage selected by all the children, while we want to select the highest performance point here.
If I understand correctly, Sudeep is not convinced that this is about PM domain regulator(s), right?
To me there is no doubt, these regulators is exactly the definition of PM domain regulators.
That said, long time ago we have decided PM domain regulator shall be modeled as exactly that. From DT point of view, this means the handle to the PM domain regulator belongs in the node of the PM domain controller - and not in each device's node of those belonging to the PM domain.
Isn't that what this discussion really boils down to? Or maybe I am not getting it.
Maybe not. I think Sudeep understands that this is about PM domain regulators only but he is asking why aren't we solving this problem using regulators framework but performance-levels instead.
Even if we have to configure both clock and voltage for the power domain using standard clk/regulator frameworks, OPP will work just fine as it will do that then. So, its not that we are bypassing the regulator framework here. It will be used if we have the voltages available for the power-domain's performance states.
So if the firmware handles more than just regulators, I agree.
I don't know the internals of that really.
At the same time, I would have preferred firmware to even abstract the frequency like ACPI CPPC.
Frequency isn't required to be configured for the cases I know, but it can be in future implementations.
To me using OPP tables makes sense as it gives us the flexibility that is needed. If I understand correct, that was also Kevin's point.
Right.
On 20/04/17 09:23, Ulf Hansson wrote:
Viresh, Sudeep,
Sorry for jumping in late.
[...]
On the contrary(playing devil's advocate here), we can treat all existing regulators alone as OPP then if you strip the voltages and treat it as abstract number.
But then we are going to have lots of platform specific code which will program the actual hardware, etc. Which is all handled by the regulator framework. Also note that the regulator core selects the common voltage selected by all the children, while we want to select the highest performance point here.
If I understand correctly, Sudeep is not convinced that this is about PM domain regulator(s), right?
No, I am saying that it has to be modeled as regulators or some kind of advanced regulators. I am against modeling it as some new feature and using similar terminology that are quite close to OPP/CPPC in which case it's quite hard not to misunderstand the concepts and eventually use these bindings incorrectly.
To me there is no doubt, these regulators is exactly the definition of PM domain regulators.
+1
That said, long time ago we have decided PM domain regulator shall be modeled as exactly that. From DT point of view, this means the handle to the PM domain regulator belongs in the node of the PM domain controller - and not in each device's node of those belonging to the PM domain.
Isn't that what this discussion really boils down to? Or maybe I am not getting it.
I completely agree with you on all the above points. I am against the performance state terminology. Since the regulators and OPP are already defined in the bindings, all we need to explicitly state(if not already) is that there are hierarchical.
On 20/04/17 06:25, Viresh Kumar wrote:
On 19-04-17, 14:58, Sudeep Holla wrote:
On 19/04/17 12:47, Viresh Kumar wrote:
On 18-04-17, 17:01, Sudeep Holla wrote:
[...]
Yes, I completely agree. I am not saying the direction is wrong. I am saying it's confusing and binding needs to be more clear.
What exactly isn't clear? (Yeah, there had been lots of emails and I want to know what improvements are you looking for).
Just that the term performance is closely related to frequency, it needs to be explicit on what *exactly* it means. As it stands now, it can be used for OPP as I explain which controls both but as you clarify that's not what it's designed for.
On the contrary(playing devil's advocate here), we can treat all existing regulators alone as OPP then if you strip the voltages and treat it as abstract number.
But then we are going to have lots of platform specific code which will program the actual hardware, etc. Which is all handled by the regulator framework. Also note that the regulator core selects the common voltage selected by all the children, while we want to select the highest performance point here.
I am not sure if choosing highest performance point makes it difficult to fit it in regulator framework. It could be some configuration. Also IIUC the actual programming is done in the firmware in this case and I fail to see how that adds lot of platform code.
Even if we have to configure both clock and voltage for the power domain using standard clk/regulator frameworks, OPP will work just fine as it will do that then. So, its not that we are bypassing the regulator framework here. It will be used if we have the voltages available for the power-domain's performance states.
Yes I understand that.
On 20-04-17, 10:43, Sudeep Holla wrote:
Just that the term performance is closely related to frequency, it needs to be explicit on what *exactly* it means. As it stands now, it can be used for OPP as I explain which controls both but as you clarify that's not what it's designed for.
We are talking about active states of a power domain here and *performance* is the best word I got. And yes we can still have frequency as a configurable here, just that current platforms don't have it.
I am not sure if choosing highest performance point makes it difficult to fit it in regulator framework. It could be some configuration.
I was just pointing out a difference :)
Also IIUC the actual programming is done in the firmware in this case and I fail to see how that adds lot of platform code.
Oh I meant that for converting general regulator only cases to OPP. No firmware was involved there.
Viresh Kumar viresh.kumar@linaro.org writes:
On 20-04-17, 10:43, Sudeep Holla wrote:
Just that the term performance is closely related to frequency, it needs to be explicit on what *exactly* it means. As it stands now, it can be used for OPP as I explain which controls both but as you clarify that's not what it's designed for.
We are talking about active states of a power domain here and *performance* is the best word I got.
And yes we can still have frequency as a configurable here, just that current platforms don't have it.
It's not that your platforms don't have frequency, it's just that it's hidden by firmware on an M3 in your particular case.
DT is meant to model hardware, and surely what the M3 is managing is frequency and/or voltage (IOW, an OPP).
The way I see it, the problem you're trying to solve is complicated just because you don't know the exat freq and/or voltage because they are hiddent behind the firware, and all you have control over is an index.
What if you drop the introduction of the new domain-performance-state and just stick with OPPs (and phandles to them), and for cases where you don't know the exact freq/volage pairs, just use indexes and comment what they refer to:
operating-points = < /* * NOTE: actual frequency and voltages are managed by * firmware and are hidden from HLOS, so we simply use index * here to select the OPP */ 1 1 2 2 3 3
;
Since selecting the OPP is up to the power-domain driver implementation, this should be fine, IMO.
This would mean just updating the doc to reflect the "relaxing" of these fields to reflect indexes in cases where the exact freq/voltages are not known.
IMO, this would be much simpler, as it avoids adding a new property and continues to use teriminology that people are already familiar with around OPPs.
Kevin
On 17/04/17 06:27, Viresh Kumar wrote:
On 13-04-17, 14:42, Sudeep Holla wrote:
What I was referring is about power domain provider with multiple power domains(simply #power-domain-cells=<1> case as explained in the power-domain specification.
I am not sure if we should be looking to target such a situation for now, as that would be like this:
Device controlled by Domain A. Domain A itself is controlled by Domain B and Domain C.
No, may be I was not so clear. I am just referring a power controller that provides say 3 different power domains and are indexed 0 - 2. The consumer just passes the index along with the phandle for the power domain info.
Though we will end up converting the domain-performance-state property to an array if that is required in near future.
OK, better to document that so that we know how to extend it. We have #power-domain-cells=<1> on Juno with SCPI.
Yes. To simplify what not we just have power-domain for a device and change state of that domain to change the performance of that device.
Consider this case to understand what I have in Mind.
The power domain have its states as A, B, C, D. There can be multiple devices regulated by that domain and one of the devices have its power states as: A1, A2, A3, B1, B2, B3, C1, C2, C3, D1, D2, D3 and all these states have different frequency/voltages.
IOW, the devices can have regulators as well and may want to fine tune within the domain performance-state.
Understood. I would incline towards reusing regulators we that's what is changed behind the scene. Calling this operating performance point is misleading and doesn't align well with existing specs/features.
[]...
If we are looking this power-domains with performance as just some *advanced regulators*, I don't like the complexity added.
+ Mark
I don;t see any public discussions on why we ruled out using regulators to support this but maybe there were some offline discussions on this.
Mark, this is a long thread, so just summarizing here to give you the context.
At qualcomm, we have an external M3 core (running its own firmware) which controls a few voltage rails (including AVS on those). The devices vote for the voltage levels (or performance levels) they need by passing an integer value to the M3 (not actual voltage values). Since that didn't fit well with the existing regulator apis it was proposed we look at modeling these as powerdomain performance levels (and reuse genpd framework) which is what this series from Viresh is about.
Since the discussion now is moving towards 'why not use regulator framework for this instead of adding all the complexity with powerdomain performance levels since these are regulators underneath', I looped you in so you can provide some feedback on can these really be modeled as some *advanced regulators* with some apis to set some regulator performance levels (instead of voltage levels).
On Wed, Apr 26, 2017 at 10:02:39AM +0530, Rajendra Nayak wrote:
On 17/04/17 06:27, Viresh Kumar wrote:
If we are looking this power-domains with performance as just some *advanced regulators*, I don't like the complexity added.
- Mark
I don;t see any public discussions on why we ruled out using regulators to support this but maybe there were some offline discussions on this.
Mark, this is a long thread, so just summarizing here to give you the context.
At qualcomm, we have an external M3 core (running its own firmware) which controls a few voltage rails (including AVS on those). The devices vote for the voltage levels (or performance levels) they need by passing an integer value to the M3 (not actual voltage values). Since that didn't fit well with the existing regulator apis it was
As I'm getting fed up of saying: if the values you are setting are not voltages and do not behave like voltages then the hardware should not be represented as a voltage regulator since if they are represented as voltage regulators things will expect to be able to control them as voltage regulators. This hardware is quite clearly providing OPPs directly, I would expect this to be handled in the OPP code somehow.
On 26/04/17 14:55, Mark Brown wrote:
On Wed, Apr 26, 2017 at 10:02:39AM +0530, Rajendra Nayak wrote:
On 17/04/17 06:27, Viresh Kumar wrote:
If we are looking this power-domains with performance as just some *advanced regulators*, I don't like the complexity added.
- Mark
I don;t see any public discussions on why we ruled out using regulators to support this but maybe there were some offline discussions on this.
Mark, this is a long thread, so just summarizing here to give you the context.
At qualcomm, we have an external M3 core (running its own firmware) which controls a few voltage rails (including AVS on those). The devices vote for the voltage levels
Thanks for explicitly mentioning this, but ...
(or performance levels) they need by passing an integer value to the M3 (not actual
you contradict here, is it just voltage or performance(i.e. frequency) or both ? We need clarity there to choose the right representation.
voltage values). Since that didn't fit well with the existing regulator apis it was
As I'm getting fed up of saying: if the values you are setting are not voltages and do not behave like voltages then the hardware should not be represented as a voltage regulator since if they are represented as voltage regulators things will expect to be able to control them as voltage regulators. This hardware is quite clearly providing OPPs directly, I would expect this to be handled in the OPP code somehow.
I agree with you that we need to be absolutely sure on what it actually represents.
But as more and more platform are pushing such power controls to dedicated M3 or similar processors, we need abstraction. Though we are controlling hardware, we do so indirectly. Since there were discussions around device tree representing hardware vs platform, I tend to think, we are moving towards platform(something similar to ACPI).
On 04/27/2017 03:12 PM, Sudeep Holla wrote: []..
At qualcomm, we have an external M3 core (running its own firmware) which controls a few voltage rails (including AVS on those). The devices vote for the voltage levels
Thanks for explicitly mentioning this, but ...
(or performance levels) they need by passing an integer value to the M3 (not actual
you contradict here, is it just voltage or performance(i.e. frequency) or both ? We need clarity there to choose the right representation.
Its just voltage.
On 27-04-17, 16:20, Rajendra Nayak wrote:
On 04/27/2017 03:12 PM, Sudeep Holla wrote: []..
At qualcomm, we have an external M3 core (running its own firmware) which controls a few voltage rails (including AVS on those). The devices vote for the voltage levels
Thanks for explicitly mentioning this, but ...
(or performance levels) they need by passing an integer value to the M3 (not actual
you contradict here, is it just voltage or performance(i.e. frequency) or both ? We need clarity there to choose the right representation.
Its just voltage.
Right. Its just voltage in this case, but we can't speak of future platforms here and we have to consider this thing as an operating performance point only. I still think that this thread is moving in the right direction, specially after V6 which looks much better.
If we have anything strong against the way V6 is trying to solve it, I want to talk about it right now and get inputs from all the parties involved. Scrapping all this work is fine, but I would like to do it ASAP in that case :)
On 28/04/17 06:00, Viresh Kumar wrote:
On 27-04-17, 16:20, Rajendra Nayak wrote:
On 04/27/2017 03:12 PM, Sudeep Holla wrote: []..
At qualcomm, we have an external M3 core (running its own firmware) which controls a few voltage rails (including AVS on those). The devices vote for the voltage levels
Thanks for explicitly mentioning this, but ...
(or performance levels) they need by passing an integer value to the M3 (not actual
you contradict here, is it just voltage or performance(i.e. frequency) or both ? We need clarity there to choose the right representation.
Its just voltage.
Right. Its just voltage in this case, but we can't speak of future platforms here and we have to consider this thing as an operating performance point only. I still think that this thread is moving in the right direction, specially after V6 which looks much better.
Just thinking out loud, I can see platforms with have OPPs can move to this binding in future eliminating the need to specify the clock and regulators explicitly. So, I am not saying I against this idea, but I see it might complicate the above case in terms of the precedence that we consider in DT from backward compatibility.
E.g. if you now use this for just regulators, then I assume you continue to use clocks. However, that makes it difficult for platforms implementing *real* OPPs to reuse this binding as they may expect to skip clock altogether.
Also we may need OPPs(both volt/freq), voltage only and clock only bindings though all 3 are driven by the firmware and all are at abstract levels. I am trying to broaden the scope now without having to churn this binding again in near future.
So I don't totally agree that voltage regulators much have *real* voltages and not abstract scale. Yes the correct bindings might have such restrictions but can't we extend it ?
Anyways these are just my opinion.
If we have anything strong against the way V6 is trying to solve it, I want to talk about it right now and get inputs from all the parties involved. Scrapping all this work is fine, but I would like to do it ASAP in that case :)
As I said I am not against it, but I see it useful for a different use-case, just not the one you are trying to solve here ;)
On 28-04-17, 10:44, Sudeep Holla wrote:
Just thinking out loud, I can see platforms with have OPPs can move to this binding in future eliminating the need to specify the clock and regulators explicitly. So, I am not saying I against this idea, but I see it might complicate the above case in terms of the precedence that we consider in DT from backward compatibility.
E.g. if you now use this for just regulators, then I assume you continue to use clocks. However, that makes it difficult for platforms implementing *real* OPPs to reuse this binding as they may expect to skip clock altogether.
Also we may need OPPs(both volt/freq), voltage only and clock only bindings though all 3 are driven by the firmware and all are at abstract levels. I am trying to broaden the scope now without having to churn this binding again in near future.
So I don't totally agree that voltage regulators much have *real* voltages and not abstract scale. Yes the correct bindings might have such restrictions but can't we extend it ?
Anyways these are just my opinion.
Everyone's opinion has equal merit here :)
I believe that some of your hesitation came from the point that I have made opp-hz optional. That isn't the case anymore with V6.
Can we please take the discussion to that thread now and see if you can find similar problems there as well.
Thanks a lot.
On Thu, Apr 27, 2017 at 10:42:49AM +0100, Sudeep Holla wrote:
On 26/04/17 14:55, Mark Brown wrote:
As I'm getting fed up of saying: if the values you are setting are not voltages and do not behave like voltages then the hardware should not be represented as a voltage regulator since if they are represented as voltage regulators things will expect to be able to control them as voltage regulators. This hardware is quite clearly providing OPPs directly, I would expect this to be handled in the OPP code somehow.
I agree with you that we need to be absolutely sure on what it actually represents.
But as more and more platform are pushing such power controls to dedicated M3 or similar processors, we need abstraction. Though we are controlling hardware, we do so indirectly. Since there were discussions around device tree representing hardware vs platform, I tend to think, we are moving towards platform(something similar to ACPI).
I don't think there's a meaningful hardware/platform distinction here - in terms of what DT is describing the platform bit is just what the hardware (the microcontrollers) happen to do, DT doesn't much care about that though.
On 30/04/17 13:49, Mark Brown wrote:
On Thu, Apr 27, 2017 at 10:42:49AM +0100, Sudeep Holla wrote:
On 26/04/17 14:55, Mark Brown wrote:
As I'm getting fed up of saying: if the values you are setting are not voltages and do not behave like voltages then the hardware should not be represented as a voltage regulator since if they are represented as voltage regulators things will expect to be able to control them as voltage regulators. This hardware is quite clearly providing OPPs directly, I would expect this to be handled in the OPP code somehow.
I agree with you that we need to be absolutely sure on what it actually represents.
But as more and more platform are pushing such power controls to dedicated M3 or similar processors, we need abstraction. Though we are controlling hardware, we do so indirectly. Since there were discussions around device tree representing hardware vs platform, I tend to think, we are moving towards platform(something similar to ACPI).
I don't think there's a meaningful hardware/platform distinction here - in terms of what DT is describing the platform bit is just what the hardware (the microcontrollers) happen to do,
Yes agreed. It's similar to PSCI or any other platform firmware IMO.
The question is how do we deal with such controls that needs to be done via the firmware ? We generally plug-in to the existing framework in Linux using the existing bindings. Most of the time, much simpler bindings than the one that present complete hardware description.
DT doesn't much care about that though.
No sure about that, may be doesn't care about the internals, but we need to care about interface, no ?
On Wed, May 03, 2017 at 12:21:54PM +0100, Sudeep Holla wrote:
On 30/04/17 13:49, Mark Brown wrote:
DT doesn't much care about that though.
No sure about that, may be doesn't care about the internals, but we need to care about interface, no ?
Well, we need to at least describe what's there - my point is that it's no different to describing a piece of hardware, the fact that it happens to be implemented as firmware doesn't really change the abstraction level DT is operating at.
On 20/03/17 09:32, Viresh Kumar wrote: [...]
+Example 7: domain-Performance-state: +(example: For 1GHz require domain state 1 and for 1.1 & 1.2 GHz require state 2)
+/ {
- domain_opp_table: opp_table0 {
compatible = "operating-points-v2";
opp@1 {
domain-performance-state = <1>;
opp-microvolt = <975000 970000 985000>;
};
opp@2 {
domain-performance-state = <2>;
opp-microvolt = <1075000 1000000 1085000>;
};
- };
- foo_domain: power-controller@12340000 {
compatible = "foo,power-controller";
reg = <0x12340000 0x1000>;
#power-domain-cells = <0>;
operating-points-v2 = <&domain_opp_table>;
- }
- cpu0_opp_table: opp_table1 {
compatible = "operating-points-v2";
opp-shared;
opp@1000000000 {
opp-hz = /bits/ 64 <1000000000>;
domain-performance-state = <1>;
};
opp@1100000000 {
opp-hz = /bits/ 64 <1100000000>;
domain-performance-state = <2>;
};
opp@1200000000 {
opp-hz = /bits/ 64 <1200000000>;
domain-performance-state = <2>;
};
- };
- cpus {
#address-cells = <1>;
#size-cells = <0>;
cpu@0 {
compatible = "arm,cortex-a9";
reg = <0>;
clocks = <&clk_controller 0>;
clock-names = "cpu";
operating-points-v2 = <&cpu0_opp_table>;
power-domains = <&foo_domain>;
};
- };
+};
Thinking more about this above example, I think you need more explanation. So in the above case you have cpu with clock controller, power-domain and the OPP table info, I can think of few things that need to be explicit:
1. How does the precedence look like ?
2. Since power-domains with OPP table control the performance state, do we ignore clock and operating-points-v2 in the above case completely?
3. Will the power-domain drive the OPP ?
On 12-04-17, 18:05, Sudeep Holla wrote:
On 20/03/17 09:32, Viresh Kumar wrote: [...]
+Example 7: domain-Performance-state: +(example: For 1GHz require domain state 1 and for 1.1 & 1.2 GHz require state 2)
+/ {
- domain_opp_table: opp_table0 {
compatible = "operating-points-v2";
opp@1 {
domain-performance-state = <1>;
opp-microvolt = <975000 970000 985000>;
};
opp@2 {
domain-performance-state = <2>;
opp-microvolt = <1075000 1000000 1085000>;
};
- };
- foo_domain: power-controller@12340000 {
compatible = "foo,power-controller";
reg = <0x12340000 0x1000>;
#power-domain-cells = <0>;
operating-points-v2 = <&domain_opp_table>;
- }
- cpu0_opp_table: opp_table1 {
compatible = "operating-points-v2";
opp-shared;
opp@1000000000 {
opp-hz = /bits/ 64 <1000000000>;
domain-performance-state = <1>;
};
opp@1100000000 {
opp-hz = /bits/ 64 <1100000000>;
domain-performance-state = <2>;
};
opp@1200000000 {
opp-hz = /bits/ 64 <1200000000>;
domain-performance-state = <2>;
};
- };
- cpus {
#address-cells = <1>;
#size-cells = <0>;
cpu@0 {
compatible = "arm,cortex-a9";
reg = <0>;
clocks = <&clk_controller 0>;
clock-names = "cpu";
operating-points-v2 = <&cpu0_opp_table>;
power-domains = <&foo_domain>;
};
- };
+};
Thinking more about this above example, I think you need more explanation. So in the above case you have cpu with clock controller, power-domain and the OPP table info, I can think of few things that need to be explicit:
- How does the precedence look like ?
Just think of the power-domain as a regulator here. If we are increasing frequency of the device, power-domain needs to be programmed first followed by the clock.
- Since power-domains with OPP table control the performance state, do
They control performance state of the domains, not the devices.
we ignore clock and operating-points-v2 in the above case completely?
No. They are separate.
- Will the power-domain drive the OPP ?
power-domain will driver its own state using its own OPP table. Devices may fine tune within those states.
On 13/04/17 06:50, Viresh Kumar wrote:
On 12-04-17, 18:05, Sudeep Holla wrote:
On 20/03/17 09:32, Viresh Kumar wrote:
[...]
Thinking more about this above example, I think you need more explanation. So in the above case you have cpu with clock controller, power-domain and the OPP table info, I can think of few things that need to be explicit:
- How does the precedence look like ?
Just think of the power-domain as a regulator here. If we are increasing frequency of the device, power-domain needs to be programmed first followed by the clock.
Interesting. My understand of power domain and in particular power domain performance was that it would control both. The abstract number you introduce would hide clocks and regulators.
But if the concept treats it just as yet another regulator, we do we need these at all. Why don't we relate this performance to regulator values and be done with it ?
Sorry if I am missing to understand something here. I would look this as replacement for both clocks and regulators, something similar to ACPI CPPC. If not, it looks unnecessary to me with the information I have got so far.
- Since power-domains with OPP table control the performance state, do
They control performance state of the domains, not the devices.
we ignore clock and operating-points-v2 in the above case completely?
No. They are separate.
Understood now, but still trying to understand the complexity introduced here.
- Will the power-domain drive the OPP ?
power-domain will driver its own state using its own OPP table. Devices may fine tune within those states.
I fail to understand here. This makes me think this power domain is same as regulators as you pointed out earlier. So, we do we need all these extra things. I was hoping this to be something like ACPI CPPC that hide away clock and regulators.
On 13-04-17, 14:43, Sudeep Holla wrote:
Interesting. My understand of power domain and in particular power domain performance was that it would control both. The abstract number you introduce would hide clocks and regulators.
But if the concept treats it just as yet another regulator, we do we need these at all. Why don't we relate this performance to regulator values and be done with it ?
Sorry if I am missing to understand something here. I would look this as replacement for both clocks and regulators, something similar to ACPI CPPC. If not, it looks unnecessary to me with the information I have got so far.
I kind of answered that in the other email.
Some background may be good here. So Qcom tried to solve all this with virtual regulators, but the problem was that they need to talk in terms of integer values (1, 2, 3..) and not voltages and so they can't use the regulator framework straight away. And so we are doing all this.
On 17/04/17 06:33, Viresh Kumar wrote:
On 13-04-17, 14:43, Sudeep Holla wrote:
Interesting. My understand of power domain and in particular power domain performance was that it would control both. The abstract number you introduce would hide clocks and regulators.
But if the concept treats it just as yet another regulator, we do we need these at all. Why don't we relate this performance to regulator values and be done with it ?
Sorry if I am missing to understand something here. I would look this as replacement for both clocks and regulators, something similar to ACPI CPPC. If not, it looks unnecessary to me with the information I have got so far.
I kind of answered that in the other email.
Some background may be good here. So Qcom tried to solve all this with virtual regulators, but the problem was that they need to talk in terms of integer values (1, 2, 3..) and not voltages and so they can't use the regulator framework straight away. And so we are doing all this.
Was it posted externally ? Was there any objections for that approach ? IMO that's better approach but if I am late to the party, I would like to read through the discussions that happened on it(if any)
On 18-04-17, 17:03, Sudeep Holla wrote:
Was it posted externally ? Was there any objections for that approach ? IMO that's better approach but if I am late to the party, I would like to read through the discussions that happened on it(if any)
Maybe Stephen can tell more about it. AFAIK, there were some offline discussions around it.
The OPP table bindings contains all the necessary fields to support power-domains now. Update the power-domain bindings to allow "operating-points-v2" to be present within the power-domain node.
Also allow consumer devices, that don't use OPP tables, to specify the parent power-domain's performance level using the "domain-performance-state" property.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- .../devicetree/bindings/power/power_domain.txt | 42 ++++++++++++++++++++++ 1 file changed, 42 insertions(+)
diff --git a/Documentation/devicetree/bindings/power/power_domain.txt b/Documentation/devicetree/bindings/power/power_domain.txt index 723e1ad937da..5db112fa5d7c 100644 --- a/Documentation/devicetree/bindings/power/power_domain.txt +++ b/Documentation/devicetree/bindings/power/power_domain.txt @@ -38,6 +38,9 @@ phandle arguments (so called PM domain specifiers) of length specified by the domain's idle states. In the absence of this property, the domain would be considered as capable of being powered-on or powered-off.
+- operating-points-v2 : This describes the performance states of a PM domain. + Refer to ../opp/opp.txt for more information. + Example:
power: power-controller@12340000 { @@ -118,4 +121,43 @@ The node above defines a typical PM domain consumer device, which is located inside a PM domain with index 0 of a power controller represented by a node with the label "power".
+Optional properties: +- domain-performance-state: A positive integer value representing the minimum + power-domain performance level required by the consumer device. The integer + value '0' represents the lowest performance level and the higher values + represent higher performance levels. The value of "domain-performance-state" + field should match the "domain-performance-state" field of one of the OPP + nodes in the parent power-domain's OPP table. + + + +Example: + + domain_opp_table: opp_table { + compatible = "operating-points-v2"; + + opp@1 { + domain-performance-state = <1>; + opp-microvolt = <975000 970000 985000>; + }; + opp@2 { + domain-performance-state = <2>; + opp-microvolt = <1075000 1000000 1085000>; + }; + }; + + parent: power-controller@12340000 { + compatible = "foo,power-controller"; + reg = <0x12340000 0x1000>; + #power-domain-cells = <0>; + operating-points-v2 = <&domain_opp_table>; + }; + + leaky-device@12350000 { + compatible = "foo,i-leak-current"; + reg = <0x12350000 0x1000>; + power-domains = <&power 0>; + domain-performance-state = <2>; + }; + [1]. Documentation/devicetree/bindings/power/domain-idle-state.txt
On 20/03/17 09:32, Viresh Kumar wrote:
The OPP table bindings contains all the necessary fields to support power-domains now. Update the power-domain bindings to allow "operating-points-v2" to be present within the power-domain node.
Also allow consumer devices, that don't use OPP tables, to specify the parent power-domain's performance level using the "domain-performance-state" property.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
.../devicetree/bindings/power/power_domain.txt | 42 ++++++++++++++++++++++ 1 file changed, 42 insertions(+)
diff --git a/Documentation/devicetree/bindings/power/power_domain.txt b/Documentation/devicetree/bindings/power/power_domain.txt index 723e1ad937da..5db112fa5d7c 100644 --- a/Documentation/devicetree/bindings/power/power_domain.txt +++ b/Documentation/devicetree/bindings/power/power_domain.txt @@ -38,6 +38,9 @@ phandle arguments (so called PM domain specifiers) of length specified by the domain's idle states. In the absence of this property, the domain would be considered as capable of being powered-on or powered-off. +- operating-points-v2 : This describes the performance states of a PM domain.
- Refer to ../opp/opp.txt for more information.
Example: power: power-controller@12340000 { @@ -118,4 +121,43 @@ The node above defines a typical PM domain consumer device, which is located inside a PM domain with index 0 of a power controller represented by a node with the label "power". +Optional properties: +- domain-performance-state: A positive integer value representing the minimum
- power-domain performance level required by the consumer device. The integer
- value '0' represents the lowest performance level and the higher values
- represent higher performance levels. The value of "domain-performance-state"
- field should match the "domain-performance-state" field of one of the OPP
- nodes in the parent power-domain's OPP table.
+Example:
- domain_opp_table: opp_table {
compatible = "operating-points-v2";
opp@1 {
domain-performance-state = <1>;
opp-microvolt = <975000 970000 985000>;
};
opp@2 {
domain-performance-state = <2>;
opp-microvolt = <1075000 1000000 1085000>;
};
- };
- parent: power-controller@12340000 {
compatible = "foo,power-controller";
reg = <0x12340000 0x1000>;
#power-domain-cells = <0>;
operating-points-v2 = <&domain_opp_table>;
As mentioned in the other email, it would be good to consider scalability with multiple power domains in a PM domain provider. i.e case of #power-domain-cells = <1> or more
On 12-04-17, 17:58, Sudeep Holla wrote:
On 20/03/17 09:32, Viresh Kumar wrote:
The OPP table bindings contains all the necessary fields to support power-domains now. Update the power-domain bindings to allow "operating-points-v2" to be present within the power-domain node.
Also allow consumer devices, that don't use OPP tables, to specify the parent power-domain's performance level using the "domain-performance-state" property.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
.../devicetree/bindings/power/power_domain.txt | 42 ++++++++++++++++++++++ 1 file changed, 42 insertions(+)
diff --git a/Documentation/devicetree/bindings/power/power_domain.txt b/Documentation/devicetree/bindings/power/power_domain.txt index 723e1ad937da..5db112fa5d7c 100644 --- a/Documentation/devicetree/bindings/power/power_domain.txt +++ b/Documentation/devicetree/bindings/power/power_domain.txt @@ -38,6 +38,9 @@ phandle arguments (so called PM domain specifiers) of length specified by the domain's idle states. In the absence of this property, the domain would be considered as capable of being powered-on or powered-off. +- operating-points-v2 : This describes the performance states of a PM domain.
- Refer to ../opp/opp.txt for more information.
Example: power: power-controller@12340000 { @@ -118,4 +121,43 @@ The node above defines a typical PM domain consumer device, which is located inside a PM domain with index 0 of a power controller represented by a node with the label "power". +Optional properties: +- domain-performance-state: A positive integer value representing the minimum
- power-domain performance level required by the consumer device. The integer
- value '0' represents the lowest performance level and the higher values
- represent higher performance levels. The value of "domain-performance-state"
- field should match the "domain-performance-state" field of one of the OPP
- nodes in the parent power-domain's OPP table.
+Example:
- domain_opp_table: opp_table {
compatible = "operating-points-v2";
opp@1 {
domain-performance-state = <1>;
opp-microvolt = <975000 970000 985000>;
};
opp@2 {
domain-performance-state = <2>;
opp-microvolt = <1075000 1000000 1085000>;
};
- };
- parent: power-controller@12340000 {
compatible = "foo,power-controller";
reg = <0x12340000 0x1000>;
#power-domain-cells = <0>;
operating-points-v2 = <&domain_opp_table>;
As mentioned in the other email, it would be good to consider scalability with multiple power domains in a PM domain provider. i.e case of #power-domain-cells = <1> or more
Yeah, but that isn't supported for devices today. So no point considering that today.
On 13/04/17 07:03, Viresh Kumar wrote:
On 12-04-17, 17:58, Sudeep Holla wrote:
On 20/03/17 09:32, Viresh Kumar wrote:
The OPP table bindings contains all the necessary fields to support power-domains now. Update the power-domain bindings to allow "operating-points-v2" to be present within the power-domain node.
Also allow consumer devices, that don't use OPP tables, to specify the parent power-domain's performance level using the "domain-performance-state" property.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
.../devicetree/bindings/power/power_domain.txt | 42 ++++++++++++++++++++++ 1 file changed, 42 insertions(+)
diff --git a/Documentation/devicetree/bindings/power/power_domain.txt b/Documentation/devicetree/bindings/power/power_domain.txt index 723e1ad937da..5db112fa5d7c 100644 --- a/Documentation/devicetree/bindings/power/power_domain.txt +++ b/Documentation/devicetree/bindings/power/power_domain.txt @@ -38,6 +38,9 @@ phandle arguments (so called PM domain specifiers) of length specified by the domain's idle states. In the absence of this property, the domain would be considered as capable of being powered-on or powered-off. +- operating-points-v2 : This describes the performance states of a PM domain.
- Refer to ../opp/opp.txt for more information.
Example: power: power-controller@12340000 { @@ -118,4 +121,43 @@ The node above defines a typical PM domain consumer device, which is located inside a PM domain with index 0 of a power controller represented by a node with the label "power". +Optional properties: +- domain-performance-state: A positive integer value representing the minimum
- power-domain performance level required by the consumer device. The integer
- value '0' represents the lowest performance level and the higher values
- represent higher performance levels. The value of "domain-performance-state"
- field should match the "domain-performance-state" field of one of the OPP
- nodes in the parent power-domain's OPP table.
+Example:
- domain_opp_table: opp_table {
compatible = "operating-points-v2";
opp@1 {
domain-performance-state = <1>;
opp-microvolt = <975000 970000 985000>;
};
opp@2 {
domain-performance-state = <2>;
opp-microvolt = <1075000 1000000 1085000>;
};
- };
- parent: power-controller@12340000 {
compatible = "foo,power-controller";
reg = <0x12340000 0x1000>;
#power-domain-cells = <0>;
operating-points-v2 = <&domain_opp_table>;
As mentioned in the other email, it would be good to consider scalability with multiple power domains in a PM domain provider. i.e case of #power-domain-cells = <1> or more
Yeah, but that isn't supported for devices today. So no point considering that today.
Do you mean we don't support power controllers with multiple power domains ? If yes, we do support #power-domain-cells=<1 or more> clearly from the binding and this change simple doesn't scale with such power controllers/power-domain providers.
Only the resume_latency constraint uses the notifiers right now. In order to prepare for adding new constraint types with notifiers, move to a common notifier list.
Update pm_qos_update_target() to pass a pointer to the constraint structure to the notifier callbacks. Also update the notifier callbacks as well to error out for unexpected constraints.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/base/power/domain.c | 26 +++++++++++++++++++------- drivers/base/power/qos.c | 15 ++++----------- include/linux/pm_qos.h | 7 +++++++ kernel/power/qos.c | 2 +- 4 files changed, 31 insertions(+), 19 deletions(-)
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c index 1a0549f1944a..6e4e22aa14a2 100644 --- a/drivers/base/power/domain.c +++ b/drivers/base/power/domain.c @@ -416,14 +416,10 @@ static int genpd_power_on(struct generic_pm_domain *genpd, unsigned int depth) return ret; }
-static int genpd_dev_pm_qos_notifier(struct notifier_block *nb, - unsigned long val, void *ptr) +static int __resume_latency_notifier(struct generic_pm_domain_data *gpd_data, + unsigned long val) { - struct generic_pm_domain_data *gpd_data; - struct device *dev; - - gpd_data = container_of(nb, struct generic_pm_domain_data, nb); - dev = gpd_data->base.dev; + struct device *dev = gpd_data->base.dev;
for (;;) { struct generic_pm_domain *genpd; @@ -456,6 +452,22 @@ static int genpd_dev_pm_qos_notifier(struct notifier_block *nb, return NOTIFY_DONE; }
+static int genpd_dev_pm_qos_notifier(struct notifier_block *nb, + unsigned long val, void *ptr) +{ + struct generic_pm_domain_data *gpd_data; + struct device *dev; + + gpd_data = container_of(nb, struct generic_pm_domain_data, nb); + dev = gpd_data->base.dev; + + if (dev_pm_qos_notifier_is_resume_latency(dev, ptr)) + return __resume_latency_notifier(gpd_data, val); + + dev_err(dev, "%s: Unexpected notifier call\n", __func__); + return NOTIFY_BAD; +} + /** * genpd_power_off_work_fn - Power off PM domain whose subdomain count is 0. * @work: Work structure used for scheduling the execution of this function. diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c index f850daeffba4..654d8a12c2e7 100644 --- a/drivers/base/power/qos.c +++ b/drivers/base/power/qos.c @@ -172,18 +172,12 @@ static int dev_pm_qos_constraints_allocate(struct device *dev) { struct dev_pm_qos *qos; struct pm_qos_constraints *c; - struct blocking_notifier_head *n;
qos = kzalloc(sizeof(*qos), GFP_KERNEL); if (!qos) return -ENOMEM;
- n = kzalloc(sizeof(*n), GFP_KERNEL); - if (!n) { - kfree(qos); - return -ENOMEM; - } - BLOCKING_INIT_NOTIFIER_HEAD(n); + BLOCKING_INIT_NOTIFIER_HEAD(&qos->notifiers);
c = &qos->resume_latency; plist_head_init(&c->list); @@ -191,7 +185,7 @@ static int dev_pm_qos_constraints_allocate(struct device *dev) c->default_value = PM_QOS_RESUME_LATENCY_DEFAULT_VALUE; c->no_constraint_value = PM_QOS_RESUME_LATENCY_DEFAULT_VALUE; c->type = PM_QOS_MIN; - c->notifiers = n; + c->notifiers = &qos->notifiers;
c = &qos->latency_tolerance; plist_head_init(&c->list); @@ -268,7 +262,6 @@ void dev_pm_qos_constraints_destroy(struct device *dev) dev->power.qos = ERR_PTR(-ENODEV); spin_unlock_irq(&dev->power.lock);
- kfree(qos->resume_latency.notifiers); kfree(qos);
out: @@ -487,7 +480,7 @@ int dev_pm_qos_add_notifier(struct device *dev, struct notifier_block *notifier) ret = dev_pm_qos_constraints_allocate(dev);
if (!ret) - ret = blocking_notifier_chain_register(dev->power.qos->resume_latency.notifiers, + ret = blocking_notifier_chain_register(&dev->power.qos->notifiers, notifier);
mutex_unlock(&dev_pm_qos_mtx); @@ -514,7 +507,7 @@ int dev_pm_qos_remove_notifier(struct device *dev,
/* Silently return if the constraints object is not present. */ if (!IS_ERR_OR_NULL(dev->power.qos)) - retval = blocking_notifier_chain_unregister(dev->power.qos->resume_latency.notifiers, + retval = blocking_notifier_chain_unregister(&dev->power.qos->notifiers, notifier);
mutex_unlock(&dev_pm_qos_mtx); diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h index 032b55909145..bcae6abb3f21 100644 --- a/include/linux/pm_qos.h +++ b/include/linux/pm_qos.h @@ -100,6 +100,7 @@ struct dev_pm_qos { struct dev_pm_qos_request *resume_latency_req; struct dev_pm_qos_request *latency_tolerance_req; struct dev_pm_qos_request *flags_req; + struct blocking_notifier_head notifiers; /* common for all constraints */ };
/* Action requested to pm_qos_update_target */ @@ -114,6 +115,12 @@ static inline int dev_pm_qos_request_active(struct dev_pm_qos_request *req) return req->dev != NULL; }
+static inline bool dev_pm_qos_notifier_is_resume_latency(struct device *dev, + struct pm_qos_constraints *c) +{ + return &dev->power.qos->resume_latency == c; +} + int pm_qos_update_target(struct pm_qos_constraints *c, struct plist_node *node, enum pm_qos_req_action action, int value); bool pm_qos_update_flags(struct pm_qos_flags *pqf, diff --git a/kernel/power/qos.c b/kernel/power/qos.c index 97b0df71303e..073324e0c3c8 100644 --- a/kernel/power/qos.c +++ b/kernel/power/qos.c @@ -315,7 +315,7 @@ int pm_qos_update_target(struct pm_qos_constraints *c, struct plist_node *node, if (c->notifiers) blocking_notifier_call_chain(c->notifiers, (unsigned long)curr_value, - NULL); + c); } else { ret = 0; }
On 20 March 2017 at 10:32, Viresh Kumar viresh.kumar@linaro.org wrote:
Only the resume_latency constraint uses the notifiers right now. In order to prepare for adding new constraint types with notifiers, move to a common notifier list.
Update pm_qos_update_target() to pass a pointer to the constraint structure to the notifier callbacks. Also update the notifier callbacks as well to error out for unexpected constraints.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
drivers/base/power/domain.c | 26 +++++++++++++++++++------- drivers/base/power/qos.c | 15 ++++----------- include/linux/pm_qos.h | 7 +++++++ kernel/power/qos.c | 2 +- 4 files changed, 31 insertions(+), 19 deletions(-)
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c index 1a0549f1944a..6e4e22aa14a2 100644 --- a/drivers/base/power/domain.c +++ b/drivers/base/power/domain.c @@ -416,14 +416,10 @@ static int genpd_power_on(struct generic_pm_domain *genpd, unsigned int depth) return ret; }
-static int genpd_dev_pm_qos_notifier(struct notifier_block *nb,
unsigned long val, void *ptr)
+static int __resume_latency_notifier(struct generic_pm_domain_data *gpd_data,
unsigned long val)
Could you perhaps rename this to genpd_latency_notifier(), as I think it better follows the naming conventions in genpd.
{
struct generic_pm_domain_data *gpd_data;
struct device *dev;
gpd_data = container_of(nb, struct generic_pm_domain_data, nb);
dev = gpd_data->base.dev;
struct device *dev = gpd_data->base.dev; for (;;) { struct generic_pm_domain *genpd;
@@ -456,6 +452,22 @@ static int genpd_dev_pm_qos_notifier(struct notifier_block *nb, return NOTIFY_DONE; }
+static int genpd_dev_pm_qos_notifier(struct notifier_block *nb,
unsigned long val, void *ptr)
+{
struct generic_pm_domain_data *gpd_data;
struct device *dev;
gpd_data = container_of(nb, struct generic_pm_domain_data, nb);
dev = gpd_data->base.dev;
if (dev_pm_qos_notifier_is_resume_latency(dev, ptr))
return __resume_latency_notifier(gpd_data, val);
dev_err(dev, "%s: Unexpected notifier call\n", __func__);
return NOTIFY_BAD;
+}
/**
- genpd_power_off_work_fn - Power off PM domain whose subdomain count is 0.
- @work: Work structure used for scheduling the execution of this function.
diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c index f850daeffba4..654d8a12c2e7 100644 --- a/drivers/base/power/qos.c +++ b/drivers/base/power/qos.c @@ -172,18 +172,12 @@ static int dev_pm_qos_constraints_allocate(struct device *dev) { struct dev_pm_qos *qos; struct pm_qos_constraints *c;
struct blocking_notifier_head *n; qos = kzalloc(sizeof(*qos), GFP_KERNEL); if (!qos) return -ENOMEM;
n = kzalloc(sizeof(*n), GFP_KERNEL);
if (!n) {
kfree(qos);
return -ENOMEM;
}
BLOCKING_INIT_NOTIFIER_HEAD(n);
BLOCKING_INIT_NOTIFIER_HEAD(&qos->notifiers); c = &qos->resume_latency; plist_head_init(&c->list);
@@ -191,7 +185,7 @@ static int dev_pm_qos_constraints_allocate(struct device *dev) c->default_value = PM_QOS_RESUME_LATENCY_DEFAULT_VALUE; c->no_constraint_value = PM_QOS_RESUME_LATENCY_DEFAULT_VALUE; c->type = PM_QOS_MIN;
c->notifiers = n;
c->notifiers = &qos->notifiers; c = &qos->latency_tolerance; plist_head_init(&c->list);
@@ -268,7 +262,6 @@ void dev_pm_qos_constraints_destroy(struct device *dev) dev->power.qos = ERR_PTR(-ENODEV); spin_unlock_irq(&dev->power.lock);
kfree(qos->resume_latency.notifiers); kfree(qos);
out:
@@ -487,7 +480,7 @@ int dev_pm_qos_add_notifier(struct device *dev, struct notifier_block *notifier) ret = dev_pm_qos_constraints_allocate(dev);
if (!ret)
ret = blocking_notifier_chain_register(dev->power.qos->resume_latency.notifiers,
ret = blocking_notifier_chain_register(&dev->power.qos->notifiers, notifier); mutex_unlock(&dev_pm_qos_mtx);
@@ -514,7 +507,7 @@ int dev_pm_qos_remove_notifier(struct device *dev,
/* Silently return if the constraints object is not present. */ if (!IS_ERR_OR_NULL(dev->power.qos))
retval = blocking_notifier_chain_unregister(dev->power.qos->resume_latency.notifiers,
retval = blocking_notifier_chain_unregister(&dev->power.qos->notifiers, notifier); mutex_unlock(&dev_pm_qos_mtx);
diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h index 032b55909145..bcae6abb3f21 100644 --- a/include/linux/pm_qos.h +++ b/include/linux/pm_qos.h @@ -100,6 +100,7 @@ struct dev_pm_qos { struct dev_pm_qos_request *resume_latency_req; struct dev_pm_qos_request *latency_tolerance_req; struct dev_pm_qos_request *flags_req;
struct blocking_notifier_head notifiers; /* common for all constraints */
};
/* Action requested to pm_qos_update_target */ @@ -114,6 +115,12 @@ static inline int dev_pm_qos_request_active(struct dev_pm_qos_request *req) return req->dev != NULL; }
+static inline bool dev_pm_qos_notifier_is_resume_latency(struct device *dev,
A quite long name... Maybe we can remove "notifier", such as :dev_pm_qos_is_resume_latency()?
struct pm_qos_constraints *c)
+{
return &dev->power.qos->resume_latency == c;
+}
int pm_qos_update_target(struct pm_qos_constraints *c, struct plist_node *node, enum pm_qos_req_action action, int value); bool pm_qos_update_flags(struct pm_qos_flags *pqf, diff --git a/kernel/power/qos.c b/kernel/power/qos.c index 97b0df71303e..073324e0c3c8 100644 --- a/kernel/power/qos.c +++ b/kernel/power/qos.c @@ -315,7 +315,7 @@ int pm_qos_update_target(struct pm_qos_constraints *c, struct plist_node *node, if (c->notifiers) blocking_notifier_call_chain(c->notifiers, (unsigned long)curr_value,
NULL);
c); } else { ret = 0; }
-- 2.12.0.432.g71c3a4f4ba37
Besides the nitpicks above, feel free to add:
Acked-by: Ulf Hansson ulf.hansson@linaro.org
Kind regards Uffe
Only the resume_latency constraint uses the notifiers right now. In order to prepare for adding new constraint types with notifiers, move to a common notifier list.
Update pm_qos_update_target() to pass a pointer to the constraint structure to the notifier callbacks. Also update the notifier callbacks as well to error out for unexpected constraints.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Acked-by: Ulf Hansson ulf.hansson@linaro.org --- V4->V5: - s/__resume_latency_notifier/genpd_latency_notifier - drop "notifier" from dev_pm_qos_notifier_is_resume_latency
drivers/base/power/domain.c | 26 +++++++++++++++++++------- drivers/base/power/qos.c | 15 ++++----------- include/linux/pm_qos.h | 7 +++++++ kernel/power/qos.c | 2 +- 4 files changed, 31 insertions(+), 19 deletions(-)
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c index da49a8383dc3..f6f616ac5cc2 100644 --- a/drivers/base/power/domain.c +++ b/drivers/base/power/domain.c @@ -426,14 +426,10 @@ static int genpd_power_on(struct generic_pm_domain *genpd, unsigned int depth) return ret; }
-static int genpd_dev_pm_qos_notifier(struct notifier_block *nb, - unsigned long val, void *ptr) +static int genpd_latency_notifier(struct generic_pm_domain_data *gpd_data, + unsigned long val) { - struct generic_pm_domain_data *gpd_data; - struct device *dev; - - gpd_data = container_of(nb, struct generic_pm_domain_data, nb); - dev = gpd_data->base.dev; + struct device *dev = gpd_data->base.dev;
for (;;) { struct generic_pm_domain *genpd; @@ -466,6 +462,22 @@ static int genpd_dev_pm_qos_notifier(struct notifier_block *nb, return NOTIFY_DONE; }
+static int genpd_dev_pm_qos_notifier(struct notifier_block *nb, + unsigned long val, void *ptr) +{ + struct generic_pm_domain_data *gpd_data; + struct device *dev; + + gpd_data = container_of(nb, struct generic_pm_domain_data, nb); + dev = gpd_data->base.dev; + + if (dev_pm_qos_is_resume_latency(dev, ptr)) + return genpd_latency_notifier(gpd_data, val); + + dev_err(dev, "%s: Unexpected notifier call\n", __func__); + return NOTIFY_BAD; +} + /** * genpd_power_off_work_fn - Power off PM domain whose subdomain count is 0. * @work: Work structure used for scheduling the execution of this function. diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c index f850daeffba4..654d8a12c2e7 100644 --- a/drivers/base/power/qos.c +++ b/drivers/base/power/qos.c @@ -172,18 +172,12 @@ static int dev_pm_qos_constraints_allocate(struct device *dev) { struct dev_pm_qos *qos; struct pm_qos_constraints *c; - struct blocking_notifier_head *n;
qos = kzalloc(sizeof(*qos), GFP_KERNEL); if (!qos) return -ENOMEM;
- n = kzalloc(sizeof(*n), GFP_KERNEL); - if (!n) { - kfree(qos); - return -ENOMEM; - } - BLOCKING_INIT_NOTIFIER_HEAD(n); + BLOCKING_INIT_NOTIFIER_HEAD(&qos->notifiers);
c = &qos->resume_latency; plist_head_init(&c->list); @@ -191,7 +185,7 @@ static int dev_pm_qos_constraints_allocate(struct device *dev) c->default_value = PM_QOS_RESUME_LATENCY_DEFAULT_VALUE; c->no_constraint_value = PM_QOS_RESUME_LATENCY_DEFAULT_VALUE; c->type = PM_QOS_MIN; - c->notifiers = n; + c->notifiers = &qos->notifiers;
c = &qos->latency_tolerance; plist_head_init(&c->list); @@ -268,7 +262,6 @@ void dev_pm_qos_constraints_destroy(struct device *dev) dev->power.qos = ERR_PTR(-ENODEV); spin_unlock_irq(&dev->power.lock);
- kfree(qos->resume_latency.notifiers); kfree(qos);
out: @@ -487,7 +480,7 @@ int dev_pm_qos_add_notifier(struct device *dev, struct notifier_block *notifier) ret = dev_pm_qos_constraints_allocate(dev);
if (!ret) - ret = blocking_notifier_chain_register(dev->power.qos->resume_latency.notifiers, + ret = blocking_notifier_chain_register(&dev->power.qos->notifiers, notifier);
mutex_unlock(&dev_pm_qos_mtx); @@ -514,7 +507,7 @@ int dev_pm_qos_remove_notifier(struct device *dev,
/* Silently return if the constraints object is not present. */ if (!IS_ERR_OR_NULL(dev->power.qos)) - retval = blocking_notifier_chain_unregister(dev->power.qos->resume_latency.notifiers, + retval = blocking_notifier_chain_unregister(&dev->power.qos->notifiers, notifier);
mutex_unlock(&dev_pm_qos_mtx); diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h index 032b55909145..e546d1a2f237 100644 --- a/include/linux/pm_qos.h +++ b/include/linux/pm_qos.h @@ -100,6 +100,7 @@ struct dev_pm_qos { struct dev_pm_qos_request *resume_latency_req; struct dev_pm_qos_request *latency_tolerance_req; struct dev_pm_qos_request *flags_req; + struct blocking_notifier_head notifiers; /* common for all constraints */ };
/* Action requested to pm_qos_update_target */ @@ -114,6 +115,12 @@ static inline int dev_pm_qos_request_active(struct dev_pm_qos_request *req) return req->dev != NULL; }
+static inline bool dev_pm_qos_is_resume_latency(struct device *dev, + struct pm_qos_constraints *c) +{ + return &dev->power.qos->resume_latency == c; +} + int pm_qos_update_target(struct pm_qos_constraints *c, struct plist_node *node, enum pm_qos_req_action action, int value); bool pm_qos_update_flags(struct pm_qos_flags *pqf, diff --git a/kernel/power/qos.c b/kernel/power/qos.c index 97b0df71303e..073324e0c3c8 100644 --- a/kernel/power/qos.c +++ b/kernel/power/qos.c @@ -315,7 +315,7 @@ int pm_qos_update_target(struct pm_qos_constraints *c, struct plist_node *node, if (c->notifiers) blocking_notifier_call_chain(c->notifiers, (unsigned long)curr_value, - NULL); + c); } else { ret = 0; }
Some platforms have the capability to configure the performance state of their Power Domains. The performance levels are identified by positive integer values, a lower value represents lower performance state. The power domain driver should be able to retrieve all information required to configure the performance state of the power domain, with the help of the performance constraint's target value.
This patch adds a new QOS request type: DEV_PM_QOS_PERFORMANCE to support runtime performance constraints for the devices. Also allow notifiers to be registered against it, which will be used by frameworks like genpd.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- Documentation/power/pm_qos_interface.txt | 2 +- drivers/base/power/qos.c | 21 +++++++++++++++++++++ include/linux/pm_qos.h | 10 ++++++++++ 3 files changed, 32 insertions(+), 1 deletion(-)
diff --git a/Documentation/power/pm_qos_interface.txt b/Documentation/power/pm_qos_interface.txt index 21d2d48f87a2..4b7decdebf98 100644 --- a/Documentation/power/pm_qos_interface.txt +++ b/Documentation/power/pm_qos_interface.txt @@ -168,7 +168,7 @@ The per-device PM QoS framework has a per-device notification tree. int dev_pm_qos_add_notifier(device, notifier): Adds a notification callback function for the device. The callback is called when the aggregated value of the device constraints list -is changed (for resume latency device PM QoS only). +is changed (for resume latency and performance device PM QoS only).
int dev_pm_qos_remove_notifier(device, notifier): Removes the notification callback function for the device. diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c index 654d8a12c2e7..084d26960dae 100644 --- a/drivers/base/power/qos.c +++ b/drivers/base/power/qos.c @@ -150,6 +150,10 @@ static int apply_constraint(struct dev_pm_qos_request *req, req->dev->power.set_latency_tolerance(req->dev, value); } break; + case DEV_PM_QOS_PERFORMANCE: + ret = pm_qos_update_target(&qos->performance, &req->data.pnode, + action, value); + break; case DEV_PM_QOS_FLAGS: ret = pm_qos_update_flags(&qos->flags, &req->data.flr, action, value); @@ -194,6 +198,14 @@ static int dev_pm_qos_constraints_allocate(struct device *dev) c->no_constraint_value = PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT; c->type = PM_QOS_MIN;
+ c = &qos->performance; + plist_head_init(&c->list); + c->target_value = PM_QOS_PERFORMANCE_DEFAULT_VALUE; + c->default_value = PM_QOS_PERFORMANCE_DEFAULT_VALUE; + c->no_constraint_value = PM_QOS_PERFORMANCE_DEFAULT_VALUE; + c->type = PM_QOS_MAX; + c->notifiers = &qos->notifiers; + INIT_LIST_HEAD(&qos->flags.list);
spin_lock_irq(&dev->power.lock); @@ -252,6 +264,11 @@ void dev_pm_qos_constraints_destroy(struct device *dev) apply_constraint(req, PM_QOS_REMOVE_REQ, PM_QOS_DEFAULT_VALUE); memset(req, 0, sizeof(*req)); } + c = &qos->performance; + plist_for_each_entry_safe(req, tmp, &c->list, data.pnode) { + apply_constraint(req, PM_QOS_REMOVE_REQ, PM_QOS_DEFAULT_VALUE); + memset(req, 0, sizeof(*req)); + } f = &qos->flags; list_for_each_entry_safe(req, tmp, &f->list, data.flr.node) { apply_constraint(req, PM_QOS_REMOVE_REQ, PM_QOS_DEFAULT_VALUE); @@ -362,6 +379,7 @@ static int __dev_pm_qos_update_request(struct dev_pm_qos_request *req, switch(req->type) { case DEV_PM_QOS_RESUME_LATENCY: case DEV_PM_QOS_LATENCY_TOLERANCE: + case DEV_PM_QOS_PERFORMANCE: curr_value = req->data.pnode.prio; break; case DEV_PM_QOS_FLAGS: @@ -571,6 +589,9 @@ static void __dev_pm_qos_drop_user_request(struct device *dev, req = dev->power.qos->flags_req; dev->power.qos->flags_req = NULL; break; + case DEV_PM_QOS_PERFORMANCE: + dev_err(dev, "Invalid user request (performance)\n"); + return; } __dev_pm_qos_remove_request(req); kfree(req); diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h index bcae6abb3f21..0f5135d55406 100644 --- a/include/linux/pm_qos.h +++ b/include/linux/pm_qos.h @@ -36,6 +36,7 @@ enum pm_qos_flags_status { #define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE 0 #define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE 0 #define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1) +#define PM_QOS_PERFORMANCE_DEFAULT_VALUE 0 #define PM_QOS_LATENCY_ANY ((s32)(~(__u32)0 >> 1))
#define PM_QOS_FLAG_NO_POWER_OFF (1 << 0) @@ -55,6 +56,7 @@ struct pm_qos_flags_request { enum dev_pm_qos_req_type { DEV_PM_QOS_RESUME_LATENCY = 1, DEV_PM_QOS_LATENCY_TOLERANCE, + DEV_PM_QOS_PERFORMANCE, DEV_PM_QOS_FLAGS, };
@@ -96,9 +98,11 @@ struct pm_qos_flags { struct dev_pm_qos { struct pm_qos_constraints resume_latency; struct pm_qos_constraints latency_tolerance; + struct pm_qos_constraints performance; struct pm_qos_flags flags; struct dev_pm_qos_request *resume_latency_req; struct dev_pm_qos_request *latency_tolerance_req; + struct dev_pm_qos_request *performance_req; struct dev_pm_qos_request *flags_req; struct blocking_notifier_head notifiers; /* common for all constraints */ }; @@ -121,6 +125,12 @@ static inline bool dev_pm_qos_notifier_is_resume_latency(struct device *dev, return &dev->power.qos->resume_latency == c; }
+static inline bool dev_pm_qos_notifier_is_performance(struct device *dev, + struct pm_qos_constraints *c) +{ + return &dev->power.qos->performance == c; +} + int pm_qos_update_target(struct pm_qos_constraints *c, struct plist_node *node, enum pm_qos_req_action action, int value); bool pm_qos_update_flags(struct pm_qos_flags *pqf,
On 20 March 2017 at 10:32, Viresh Kumar viresh.kumar@linaro.org wrote:
Some platforms have the capability to configure the performance state of their Power Domains. The performance levels are identified by positive integer values, a lower value represents lower performance state. The power domain driver should be able to retrieve all information required to configure the performance state of the power domain, with the help of the performance constraint's target value.
This patch adds a new QOS request type: DEV_PM_QOS_PERFORMANCE to support runtime performance constraints for the devices. Also allow notifiers to be registered against it, which will be used by frameworks like genpd.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
Documentation/power/pm_qos_interface.txt | 2 +- drivers/base/power/qos.c | 21 +++++++++++++++++++++ include/linux/pm_qos.h | 10 ++++++++++ 3 files changed, 32 insertions(+), 1 deletion(-)
diff --git a/Documentation/power/pm_qos_interface.txt b/Documentation/power/pm_qos_interface.txt index 21d2d48f87a2..4b7decdebf98 100644 --- a/Documentation/power/pm_qos_interface.txt +++ b/Documentation/power/pm_qos_interface.txt @@ -168,7 +168,7 @@ The per-device PM QoS framework has a per-device notification tree. int dev_pm_qos_add_notifier(device, notifier): Adds a notification callback function for the device. The callback is called when the aggregated value of the device constraints list -is changed (for resume latency device PM QoS only). +is changed (for resume latency and performance device PM QoS only).
/s/ only/
int dev_pm_qos_remove_notifier(device, notifier): Removes the notification callback function for the device. diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c index 654d8a12c2e7..084d26960dae 100644 --- a/drivers/base/power/qos.c +++ b/drivers/base/power/qos.c @@ -150,6 +150,10 @@ static int apply_constraint(struct dev_pm_qos_request *req, req->dev->power.set_latency_tolerance(req->dev, value); } break;
case DEV_PM_QOS_PERFORMANCE:
ret = pm_qos_update_target(&qos->performance, &req->data.pnode,
action, value);
break; case DEV_PM_QOS_FLAGS: ret = pm_qos_update_flags(&qos->flags, &req->data.flr, action, value);
@@ -194,6 +198,14 @@ static int dev_pm_qos_constraints_allocate(struct device *dev) c->no_constraint_value = PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT; c->type = PM_QOS_MIN;
c = &qos->performance;
plist_head_init(&c->list);
c->target_value = PM_QOS_PERFORMANCE_DEFAULT_VALUE;
c->default_value = PM_QOS_PERFORMANCE_DEFAULT_VALUE;
c->no_constraint_value = PM_QOS_PERFORMANCE_DEFAULT_VALUE;
c->type = PM_QOS_MAX;
c->notifiers = &qos->notifiers;
INIT_LIST_HEAD(&qos->flags.list); spin_lock_irq(&dev->power.lock);
@@ -252,6 +264,11 @@ void dev_pm_qos_constraints_destroy(struct device *dev) apply_constraint(req, PM_QOS_REMOVE_REQ, PM_QOS_DEFAULT_VALUE); memset(req, 0, sizeof(*req)); }
c = &qos->performance;
plist_for_each_entry_safe(req, tmp, &c->list, data.pnode) {
apply_constraint(req, PM_QOS_REMOVE_REQ, PM_QOS_DEFAULT_VALUE);
memset(req, 0, sizeof(*req));
} f = &qos->flags; list_for_each_entry_safe(req, tmp, &f->list, data.flr.node) { apply_constraint(req, PM_QOS_REMOVE_REQ, PM_QOS_DEFAULT_VALUE);
@@ -362,6 +379,7 @@ static int __dev_pm_qos_update_request(struct dev_pm_qos_request *req, switch(req->type) { case DEV_PM_QOS_RESUME_LATENCY: case DEV_PM_QOS_LATENCY_TOLERANCE:
case DEV_PM_QOS_PERFORMANCE: curr_value = req->data.pnode.prio; break; case DEV_PM_QOS_FLAGS:
@@ -571,6 +589,9 @@ static void __dev_pm_qos_drop_user_request(struct device *dev, req = dev->power.qos->flags_req; dev->power.qos->flags_req = NULL; break;
case DEV_PM_QOS_PERFORMANCE:
dev_err(dev, "Invalid user request (performance)\n");
return;
Isn't it possible to drop a performance request?
} __dev_pm_qos_remove_request(req); kfree(req);
diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h index bcae6abb3f21..0f5135d55406 100644 --- a/include/linux/pm_qos.h +++ b/include/linux/pm_qos.h @@ -36,6 +36,7 @@ enum pm_qos_flags_status { #define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE 0 #define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE 0 #define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1) +#define PM_QOS_PERFORMANCE_DEFAULT_VALUE 0 #define PM_QOS_LATENCY_ANY ((s32)(~(__u32)0 >> 1))
#define PM_QOS_FLAG_NO_POWER_OFF (1 << 0) @@ -55,6 +56,7 @@ struct pm_qos_flags_request { enum dev_pm_qos_req_type { DEV_PM_QOS_RESUME_LATENCY = 1, DEV_PM_QOS_LATENCY_TOLERANCE,
DEV_PM_QOS_PERFORMANCE, DEV_PM_QOS_FLAGS,
};
@@ -96,9 +98,11 @@ struct pm_qos_flags { struct dev_pm_qos { struct pm_qos_constraints resume_latency; struct pm_qos_constraints latency_tolerance;
struct pm_qos_constraints performance; struct pm_qos_flags flags; struct dev_pm_qos_request *resume_latency_req; struct dev_pm_qos_request *latency_tolerance_req;
struct dev_pm_qos_request *performance_req;
I didn't find performance_req being used at all...
struct dev_pm_qos_request *flags_req; struct blocking_notifier_head notifiers; /* common for all constraints */
}; @@ -121,6 +125,12 @@ static inline bool dev_pm_qos_notifier_is_resume_latency(struct device *dev, return &dev->power.qos->resume_latency == c; }
+static inline bool dev_pm_qos_notifier_is_performance(struct device *dev,
Similar comment as for patch 3, perhaps remove "notifier" from the name of the function.
struct pm_qos_constraints *c)
+{
return &dev->power.qos->performance == c;
+}
int pm_qos_update_target(struct pm_qos_constraints *c, struct plist_node *node, enum pm_qos_req_action action, int value); bool pm_qos_update_flags(struct pm_qos_flags *pqf, -- 2.12.0.432.g71c3a4f4ba37
Kind regards Uffe
On 19-04-17, 16:07, Ulf Hansson wrote:
On 20 March 2017 at 10:32, Viresh Kumar viresh.kumar@linaro.org wrote:
@@ -571,6 +589,9 @@ static void __dev_pm_qos_drop_user_request(struct device *dev, req = dev->power.qos->flags_req; dev->power.qos->flags_req = NULL; break;
case DEV_PM_QOS_PERFORMANCE:
dev_err(dev, "Invalid user request (performance)\n");
return;
Isn't it possible to drop a performance request?
I am not exposing the performance QOS via sysfs. Should we ? I thought this has to be worked out within kernel only and so haven't provided any user interface.
@@ -96,9 +98,11 @@ struct pm_qos_flags { struct dev_pm_qos { struct pm_qos_constraints resume_latency; struct pm_qos_constraints latency_tolerance;
struct pm_qos_constraints performance; struct pm_qos_flags flags; struct dev_pm_qos_request *resume_latency_req; struct dev_pm_qos_request *latency_tolerance_req;
struct dev_pm_qos_request *performance_req;
I didn't find performance_req being used at all...
I just over-copied it seems. The OPP framework creates its own request structure and so this should be dropped.
Some platforms have the capability to configure the performance state of their Power Domains. The performance levels are identified by positive integer values, a lower value represents lower performance state. The power domain driver should be able to retrieve all information required to configure the performance state of the power domain, with the help of the performance constraint's target value.
This patch adds a new QOS request type: DEV_PM_QOS_PERFORMANCE to support runtime performance constraints for the devices. Also allow notifiers to be registered against it, which will be used by frameworks like genpd.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- V4->V5: - s/ only/ - drop performance_req field - drop "notifier" from dev_pm_qos_notifier_is_performance
Documentation/power/pm_qos_interface.txt | 2 +- drivers/base/power/qos.c | 21 +++++++++++++++++++++ include/linux/pm_qos.h | 9 +++++++++ 3 files changed, 31 insertions(+), 1 deletion(-)
diff --git a/Documentation/power/pm_qos_interface.txt b/Documentation/power/pm_qos_interface.txt index 21d2d48f87a2..42870d28fc3c 100644 --- a/Documentation/power/pm_qos_interface.txt +++ b/Documentation/power/pm_qos_interface.txt @@ -168,7 +168,7 @@ The per-device PM QoS framework has a per-device notification tree. int dev_pm_qos_add_notifier(device, notifier): Adds a notification callback function for the device. The callback is called when the aggregated value of the device constraints list -is changed (for resume latency device PM QoS only). +is changed (for resume latency and performance device PM QoS).
int dev_pm_qos_remove_notifier(device, notifier): Removes the notification callback function for the device. diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c index 654d8a12c2e7..084d26960dae 100644 --- a/drivers/base/power/qos.c +++ b/drivers/base/power/qos.c @@ -150,6 +150,10 @@ static int apply_constraint(struct dev_pm_qos_request *req, req->dev->power.set_latency_tolerance(req->dev, value); } break; + case DEV_PM_QOS_PERFORMANCE: + ret = pm_qos_update_target(&qos->performance, &req->data.pnode, + action, value); + break; case DEV_PM_QOS_FLAGS: ret = pm_qos_update_flags(&qos->flags, &req->data.flr, action, value); @@ -194,6 +198,14 @@ static int dev_pm_qos_constraints_allocate(struct device *dev) c->no_constraint_value = PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT; c->type = PM_QOS_MIN;
+ c = &qos->performance; + plist_head_init(&c->list); + c->target_value = PM_QOS_PERFORMANCE_DEFAULT_VALUE; + c->default_value = PM_QOS_PERFORMANCE_DEFAULT_VALUE; + c->no_constraint_value = PM_QOS_PERFORMANCE_DEFAULT_VALUE; + c->type = PM_QOS_MAX; + c->notifiers = &qos->notifiers; + INIT_LIST_HEAD(&qos->flags.list);
spin_lock_irq(&dev->power.lock); @@ -252,6 +264,11 @@ void dev_pm_qos_constraints_destroy(struct device *dev) apply_constraint(req, PM_QOS_REMOVE_REQ, PM_QOS_DEFAULT_VALUE); memset(req, 0, sizeof(*req)); } + c = &qos->performance; + plist_for_each_entry_safe(req, tmp, &c->list, data.pnode) { + apply_constraint(req, PM_QOS_REMOVE_REQ, PM_QOS_DEFAULT_VALUE); + memset(req, 0, sizeof(*req)); + } f = &qos->flags; list_for_each_entry_safe(req, tmp, &f->list, data.flr.node) { apply_constraint(req, PM_QOS_REMOVE_REQ, PM_QOS_DEFAULT_VALUE); @@ -362,6 +379,7 @@ static int __dev_pm_qos_update_request(struct dev_pm_qos_request *req, switch(req->type) { case DEV_PM_QOS_RESUME_LATENCY: case DEV_PM_QOS_LATENCY_TOLERANCE: + case DEV_PM_QOS_PERFORMANCE: curr_value = req->data.pnode.prio; break; case DEV_PM_QOS_FLAGS: @@ -571,6 +589,9 @@ static void __dev_pm_qos_drop_user_request(struct device *dev, req = dev->power.qos->flags_req; dev->power.qos->flags_req = NULL; break; + case DEV_PM_QOS_PERFORMANCE: + dev_err(dev, "Invalid user request (performance)\n"); + return; } __dev_pm_qos_remove_request(req); kfree(req); diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h index e546d1a2f237..665f90face40 100644 --- a/include/linux/pm_qos.h +++ b/include/linux/pm_qos.h @@ -36,6 +36,7 @@ enum pm_qos_flags_status { #define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE 0 #define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE 0 #define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1) +#define PM_QOS_PERFORMANCE_DEFAULT_VALUE 0 #define PM_QOS_LATENCY_ANY ((s32)(~(__u32)0 >> 1))
#define PM_QOS_FLAG_NO_POWER_OFF (1 << 0) @@ -55,6 +56,7 @@ struct pm_qos_flags_request { enum dev_pm_qos_req_type { DEV_PM_QOS_RESUME_LATENCY = 1, DEV_PM_QOS_LATENCY_TOLERANCE, + DEV_PM_QOS_PERFORMANCE, DEV_PM_QOS_FLAGS, };
@@ -96,6 +98,7 @@ struct pm_qos_flags { struct dev_pm_qos { struct pm_qos_constraints resume_latency; struct pm_qos_constraints latency_tolerance; + struct pm_qos_constraints performance; struct pm_qos_flags flags; struct dev_pm_qos_request *resume_latency_req; struct dev_pm_qos_request *latency_tolerance_req; @@ -121,6 +124,12 @@ static inline bool dev_pm_qos_is_resume_latency(struct device *dev, return &dev->power.qos->resume_latency == c; }
+static inline bool dev_pm_qos_is_performance(struct device *dev, + struct pm_qos_constraints *c) +{ + return &dev->power.qos->performance == c; +} + int pm_qos_update_target(struct pm_qos_constraints *c, struct plist_node *node, enum pm_qos_req_action action, int value); bool pm_qos_update_flags(struct pm_qos_flags *pqf,
On 20 April 2017 at 06:46, Viresh Kumar viresh.kumar@linaro.org wrote:
Some platforms have the capability to configure the performance state of their Power Domains. The performance levels are identified by positive integer values, a lower value represents lower performance state. The power domain driver should be able to retrieve all information required to configure the performance state of the power domain, with the help of the performance constraint's target value.
This patch adds a new QOS request type: DEV_PM_QOS_PERFORMANCE to support runtime performance constraints for the devices. Also allow notifiers to be registered against it, which will be used by frameworks like genpd.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
Acked-by: Ulf Hansson ulf.hansson@linaro.org
Kind regards Uffe
V4->V5:
- s/ only/
- drop performance_req field
- drop "notifier" from dev_pm_qos_notifier_is_performance
Documentation/power/pm_qos_interface.txt | 2 +- drivers/base/power/qos.c | 21 +++++++++++++++++++++ include/linux/pm_qos.h | 9 +++++++++ 3 files changed, 31 insertions(+), 1 deletion(-)
diff --git a/Documentation/power/pm_qos_interface.txt b/Documentation/power/pm_qos_interface.txt index 21d2d48f87a2..42870d28fc3c 100644 --- a/Documentation/power/pm_qos_interface.txt +++ b/Documentation/power/pm_qos_interface.txt @@ -168,7 +168,7 @@ The per-device PM QoS framework has a per-device notification tree. int dev_pm_qos_add_notifier(device, notifier): Adds a notification callback function for the device. The callback is called when the aggregated value of the device constraints list -is changed (for resume latency device PM QoS only). +is changed (for resume latency and performance device PM QoS).
int dev_pm_qos_remove_notifier(device, notifier): Removes the notification callback function for the device. diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c index 654d8a12c2e7..084d26960dae 100644 --- a/drivers/base/power/qos.c +++ b/drivers/base/power/qos.c @@ -150,6 +150,10 @@ static int apply_constraint(struct dev_pm_qos_request *req, req->dev->power.set_latency_tolerance(req->dev, value); } break;
case DEV_PM_QOS_PERFORMANCE:
ret = pm_qos_update_target(&qos->performance, &req->data.pnode,
action, value);
break; case DEV_PM_QOS_FLAGS: ret = pm_qos_update_flags(&qos->flags, &req->data.flr, action, value);
@@ -194,6 +198,14 @@ static int dev_pm_qos_constraints_allocate(struct device *dev) c->no_constraint_value = PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT; c->type = PM_QOS_MIN;
c = &qos->performance;
plist_head_init(&c->list);
c->target_value = PM_QOS_PERFORMANCE_DEFAULT_VALUE;
c->default_value = PM_QOS_PERFORMANCE_DEFAULT_VALUE;
c->no_constraint_value = PM_QOS_PERFORMANCE_DEFAULT_VALUE;
c->type = PM_QOS_MAX;
c->notifiers = &qos->notifiers;
INIT_LIST_HEAD(&qos->flags.list); spin_lock_irq(&dev->power.lock);
@@ -252,6 +264,11 @@ void dev_pm_qos_constraints_destroy(struct device *dev) apply_constraint(req, PM_QOS_REMOVE_REQ, PM_QOS_DEFAULT_VALUE); memset(req, 0, sizeof(*req)); }
c = &qos->performance;
plist_for_each_entry_safe(req, tmp, &c->list, data.pnode) {
apply_constraint(req, PM_QOS_REMOVE_REQ, PM_QOS_DEFAULT_VALUE);
memset(req, 0, sizeof(*req));
} f = &qos->flags; list_for_each_entry_safe(req, tmp, &f->list, data.flr.node) { apply_constraint(req, PM_QOS_REMOVE_REQ, PM_QOS_DEFAULT_VALUE);
@@ -362,6 +379,7 @@ static int __dev_pm_qos_update_request(struct dev_pm_qos_request *req, switch(req->type) { case DEV_PM_QOS_RESUME_LATENCY: case DEV_PM_QOS_LATENCY_TOLERANCE:
case DEV_PM_QOS_PERFORMANCE: curr_value = req->data.pnode.prio; break; case DEV_PM_QOS_FLAGS:
@@ -571,6 +589,9 @@ static void __dev_pm_qos_drop_user_request(struct device *dev, req = dev->power.qos->flags_req; dev->power.qos->flags_req = NULL; break;
case DEV_PM_QOS_PERFORMANCE:
dev_err(dev, "Invalid user request (performance)\n");
return; } __dev_pm_qos_remove_request(req); kfree(req);
diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h index e546d1a2f237..665f90face40 100644 --- a/include/linux/pm_qos.h +++ b/include/linux/pm_qos.h @@ -36,6 +36,7 @@ enum pm_qos_flags_status { #define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE 0 #define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE 0 #define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT (-1) +#define PM_QOS_PERFORMANCE_DEFAULT_VALUE 0 #define PM_QOS_LATENCY_ANY ((s32)(~(__u32)0 >> 1))
#define PM_QOS_FLAG_NO_POWER_OFF (1 << 0) @@ -55,6 +56,7 @@ struct pm_qos_flags_request { enum dev_pm_qos_req_type { DEV_PM_QOS_RESUME_LATENCY = 1, DEV_PM_QOS_LATENCY_TOLERANCE,
DEV_PM_QOS_PERFORMANCE, DEV_PM_QOS_FLAGS,
};
@@ -96,6 +98,7 @@ struct pm_qos_flags { struct dev_pm_qos { struct pm_qos_constraints resume_latency; struct pm_qos_constraints latency_tolerance;
struct pm_qos_constraints performance; struct pm_qos_flags flags; struct dev_pm_qos_request *resume_latency_req; struct dev_pm_qos_request *latency_tolerance_req;
@@ -121,6 +124,12 @@ static inline bool dev_pm_qos_is_resume_latency(struct device *dev, return &dev->power.qos->resume_latency == c; }
+static inline bool dev_pm_qos_is_performance(struct device *dev,
struct pm_qos_constraints *c)
+{
return &dev->power.qos->performance == c;
+}
int pm_qos_update_target(struct pm_qos_constraints *c, struct plist_node *node, enum pm_qos_req_action action, int value); bool pm_qos_update_flags(struct pm_qos_flags *pqf, -- 2.12.0.432.g71c3a4f4ba37
Power domains can also represent their active states with the help of OPP tables now and this patch enhances the OPP core to support that.
The OPP nodes are allowed to have the "domain-performance-state" property, only if the device node contains a "power-domains" or "#power-domain-cells" property. The OPP nodes aren't allowed to contain this property partially, i.e. Either all OPP nodes in the OPP table have the "domain-performance-state" property or none of them have it.
The "opp-hz" property isn't mandatory anymore. It is still required for non-power-domain devices though. The power-domain devices need the unique "domain-performance-state" property per OPP node. The OPP core errors out if these rules aren't obeyed.
The per-OPP debugfs directories are also named based on domain-performance-state for power-domain devices.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/base/power/opp/core.c | 163 +++++++++++++++++++++++++++++++++++---- drivers/base/power/opp/debugfs.c | 9 ++- drivers/base/power/opp/of.c | 80 ++++++++++++++++--- drivers/base/power/opp/opp.h | 14 ++++ 4 files changed, 240 insertions(+), 26 deletions(-)
diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c index dae61720b314..c435acb21a47 100644 --- a/drivers/base/power/opp/core.c +++ b/drivers/base/power/opp/core.c @@ -543,6 +543,63 @@ _generic_set_opp_clk_only(struct device *dev, struct clk *clk, return ret; }
+static int _update_pm_qos_request(struct device *dev, + struct dev_pm_qos_request *req, + unsigned int perf) +{ + int ret; + + if (likely(dev_pm_qos_request_active(req))) + ret = dev_pm_qos_update_request(req, perf); + else + ret = dev_pm_qos_add_request(dev, req, DEV_PM_QOS_PERFORMANCE, + perf); + + if (ret < 0) + return ret; + + return 0; +} + +static int _generic_set_opp_domain(struct device *dev, struct clk *clk, + struct dev_pm_qos_request *req, + unsigned long old_freq, unsigned long freq, + int old_dps, int new_dps) +{ + int ret; + + /* Scaling up? Scale voltage before frequency */ + if (freq > old_freq) { + ret = _update_pm_qos_request(dev, req, new_dps); + if (ret) + return ret; + } + + /* Change frequency */ + ret = _generic_set_opp_clk_only(dev, clk, old_freq, freq); + if (ret) + goto restore_dps; + + /* Scaling down? Scale voltage after frequency */ + if (freq < old_freq) { + ret = _update_pm_qos_request(dev, req, new_dps); + if (ret) + goto restore_freq; + } + + return 0; + +restore_freq: + if (_generic_set_opp_clk_only(dev, clk, freq, old_freq)) + dev_err(dev, "%s: failed to restore old-freq (%lu Hz)\n", + __func__, old_freq); +restore_dps: + if (old_dps != -1) + _update_pm_qos_request(dev, req, old_dps); + + return ret; +} + static int _generic_set_opp(struct dev_pm_set_opp_data *data) { struct dev_pm_opp_supply *old_supply = data->old_opp.supplies; @@ -663,6 +720,19 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
regulators = opp_table->regulators;
+ /* Has power domains performance states */ + if (opp_table->has_domain_perf_states) { + int old_dps = -1, new_dps; + struct dev_pm_qos_request *req = &opp_table->qos_request; + + new_dps = opp->domain_perf_state; + if (!IS_ERR(old_opp)) + old_dps = old_opp->domain_perf_state; + + return _generic_set_opp_domain(dev, clk, req, old_freq, freq, + old_dps, new_dps); + } + /* Only frequency scaling */ if (!regulators) { ret = _generic_set_opp_clk_only(dev, clk, old_freq, freq); @@ -808,6 +878,9 @@ static void _opp_table_kref_release(struct kref *kref) struct opp_table *opp_table = container_of(kref, struct opp_table, kref); struct opp_device *opp_dev;
+ if (dev_pm_qos_request_active(&opp_table->qos_request)) + dev_pm_qos_remove_request(&opp_table->qos_request); + /* Release clk */ if (!IS_ERR(opp_table->clk)) clk_put(opp_table->clk); @@ -950,18 +1023,8 @@ static bool _opp_supported_by_regulators(struct dev_pm_opp *opp, return true; }
-/* - * Returns: - * 0: On success. And appropriate error message for duplicate OPPs. - * -EBUSY: For OPP with same freq/volt and is available. The callers of - * _opp_add() must return 0 if they receive -EBUSY from it. This is to make - * sure we don't print error messages unnecessarily if different parts of - * kernel try to initialize the OPP table. - * -EEXIST: For OPP with same freq but different volt or is unavailable. This - * should be considered an error by the callers of _opp_add(). - */ -int _opp_add(struct device *dev, struct dev_pm_opp *new_opp, - struct opp_table *opp_table) +struct list_head *_opp_add_freq(struct device *dev, struct dev_pm_opp *new_opp, + struct opp_table *opp_table) { struct dev_pm_opp *opp; struct list_head *head; @@ -975,7 +1038,6 @@ int _opp_add(struct device *dev, struct dev_pm_opp *new_opp, * loop, don't replace it with head otherwise it will become an infinite * loop. */ - mutex_lock(&opp_table->lock); head = &opp_table->opp_list;
list_for_each_entry(opp, &opp_table->opp_list, node) { @@ -997,8 +1059,81 @@ int _opp_add(struct device *dev, struct dev_pm_opp *new_opp, ret = opp->available && new_opp->supplies[0].u_volt == opp->supplies[0].u_volt ? -EBUSY : -EEXIST;
+ return ERR_PTR(ret); + } + + return head; +} + +struct list_head *_opp_add_domain(struct device *dev, struct dev_pm_opp *new_opp, + struct opp_table *opp_table) +{ + struct dev_pm_opp *opp; + struct list_head *head; + int ret; + + /* + * Insert new OPP in order of increasing performance level and discard + * if already present. + * + * Need to use &opp_table->opp_list in the condition part of the 'for' + * loop, don't replace it with head otherwise it will become an infinite + * loop. + */ + head = &opp_table->opp_list; + + list_for_each_entry(opp, &opp_table->opp_list, node) { + if (new_opp->domain_perf_state > opp->domain_perf_state) { + head = &opp->node; + continue; + } + + if (new_opp->domain_perf_state < opp->domain_perf_state) + break; + + /* Duplicate OPPs */ + dev_warn(dev, "%s: duplicate OPPs detected. Existing: DPS: %u, volt: %lu, enabled: %d. New-DPS: %u, volt: %lu, enabled: %d\n", + __func__, opp->domain_perf_state, + opp->supplies[0].u_volt, opp->available, + new_opp->domain_perf_state, + new_opp->supplies[0].u_volt, new_opp->available); + + /* Should we compare voltages for all regulators here ? */ + ret = opp->available && + new_opp->supplies[0].u_volt == opp->supplies[0].u_volt ? -EBUSY : -EEXIST; + + return ERR_PTR(ret); + } + + return head; +} + +/* + * Returns: + * 0: On success. And appropriate error message for duplicate OPPs. + * -EBUSY: For OPP with same freq/dps and volt and is available. The callers of + * _opp_add() must return 0 if they receive -EBUSY from it. This is to make + * sure we don't print error messages unnecessarily if different parts of + * kernel try to initialize the OPP table. + * -EEXIST: For OPP with same freq/dps but different volt or is unavailable. + * This should be considered an error by the callers of _opp_add(). + */ +int _opp_add(struct device *dev, struct dev_pm_opp *new_opp, + struct opp_table *opp_table) +{ + struct list_head *head; + int ret; + + mutex_lock(&opp_table->lock); + + if (new_opp->rate) + head = _opp_add_freq(dev, new_opp, opp_table); + else + head = _opp_add_domain(dev, new_opp, opp_table); + + if (IS_ERR(head)) { mutex_unlock(&opp_table->lock); - return ret; + return PTR_ERR(head); }
list_add(&new_opp->node, head); diff --git a/drivers/base/power/opp/debugfs.c b/drivers/base/power/opp/debugfs.c index 95f433db4ac7..779f911fdf38 100644 --- a/drivers/base/power/opp/debugfs.c +++ b/drivers/base/power/opp/debugfs.c @@ -81,8 +81,9 @@ int opp_debug_create_one(struct dev_pm_opp *opp, struct opp_table *opp_table) struct dentry *d; char name[25]; /* 20 chars for 64 bit value + 5 (opp:\0) */
- /* Rate is unique to each OPP, use it to give opp-name */ - snprintf(name, sizeof(name), "opp:%lu", opp->rate); + /* Rate and perf-state are unique to each OPP, use them for opp-name */ + snprintf(name, sizeof(name), "opp:%lu", + opp->rate ? opp->rate : opp->domain_perf_state);
/* Create per-opp directory */ d = debugfs_create_dir(name, pdentry); @@ -104,6 +105,10 @@ int opp_debug_create_one(struct dev_pm_opp *opp, struct opp_table *opp_table) if (!debugfs_create_ulong("rate_hz", S_IRUGO, d, &opp->rate)) return -ENOMEM;
+ if (!debugfs_create_u32("power_domain_perf_state", S_IRUGO, d, + &opp->domain_perf_state)) + return -ENOMEM; + if (!opp_debug_create_supplies(opp, opp_table, d)) return -ENOMEM;
diff --git a/drivers/base/power/opp/of.c b/drivers/base/power/opp/of.c index 779428676f63..15c62010e816 100644 --- a/drivers/base/power/opp/of.c +++ b/drivers/base/power/opp/of.c @@ -284,24 +284,70 @@ static int _opp_add_static_v2(struct opp_table *opp_table, struct device *dev, if (!new_opp) return -ENOMEM;
- ret = of_property_read_u64(np, "opp-hz", &rate); - if (ret < 0) { - dev_err(dev, "%s: opp-hz not found\n", __func__); + /* Check if the OPP supports hardware's hierarchy of versions or not */ + if (!_opp_is_supported(dev, opp_table, np)) { + dev_dbg(dev, "OPP %s not supported by hardware\n", + np->full_name); + ret = 0; goto free_opp; }
- /* Check if the OPP supports hardware's hierarchy of versions or not */ - if (!_opp_is_supported(dev, opp_table, np)) { - dev_dbg(dev, "OPP not supported by hardware: %llu\n", rate); + ret = of_property_read_u64(np, "opp-hz", &rate); + if (!ret) { + /* + * Rate is defined as an unsigned long in clk API, and so + * casting explicitly to its type. Must be fixed once rate is 64 + * bit guaranteed in clk API. + */ + new_opp->rate = (unsigned long)rate; + } else if (unlikely(!opp_table->is_domain)) { + /* All devices except power-domains must have opp-hz */ + dev_err(dev, "%s: opp-hz not found\n", __func__); goto free_opp; }
/* - * Rate is defined as an unsigned long in clk API, and so casting - * explicitly to its type. Must be fixed once rate is 64 bit - * guaranteed in clk API. + * Nodes can contain domain-performance-state property only if they are + * power-domains or they have parent power domain. And either all nodes + * must have domain-performance-state property or none. */ - new_opp->rate = (unsigned long)rate; + if (!of_property_read_u32(np, "domain-performance-state", + &new_opp->domain_perf_state)) { + if (unlikely(!(opp_table->has_domain || + opp_table->is_domain))) { + ret = -EINVAL; + dev_err(dev, "%s: OPP node can't have domain-performance-state\n", + __func__); + goto free_opp; + } + + if (opp_table->has_domain_perf_states == -1) { + opp_table->has_domain_perf_states = 1; + } else if (unlikely(!opp_table->has_domain_perf_states)) { + ret = -EINVAL; + dev_err(dev, "%s: Not all OPP nodes have domain-performance-state\n", + __func__); + goto free_opp; + } + } else { + /* Power-domains must have this property */ + if (unlikely(opp_table->is_domain)) { + ret = -EINVAL; + dev_err(dev, "%s: OPP node doesn't have domain-performance-state property\n", + __func__); + goto free_opp; + } + + if (opp_table->has_domain_perf_states == -1) { + opp_table->has_domain_perf_states = 0; + } else if (unlikely(opp_table->has_domain_perf_states)) { + ret = -EINVAL; + dev_err(dev, "%s: Not all OPP nodes have domain-performance-state\n", + __func__); + goto free_opp; + } + } + new_opp->turbo = of_property_read_bool(np, "turbo-mode");
new_opp->np = np; @@ -375,6 +421,20 @@ static int _of_add_opp_table_v2(struct device *dev, struct device_node *opp_np) if (!opp_table) return -ENOMEM;
+ /* + * Only power domains or devices with parent power-domains can have + * domain-performance states. + */ + if (of_find_property(dev->of_node, "power-domains", NULL)) { + opp_table->has_domain = true; + opp_table->has_domain_perf_states = -1; + } + + if (of_find_property(dev->of_node, "#power-domain-cells", NULL)) { + opp_table->is_domain = true; + opp_table->has_domain_perf_states = -1; + } + /* We have opp-table node now, iterate over it and add OPPs */ for_each_available_child_of_node(opp_np, np) { count++; diff --git a/drivers/base/power/opp/opp.h b/drivers/base/power/opp/opp.h index 166eef990599..1d1e9ea8cda5 100644 --- a/drivers/base/power/opp/opp.h +++ b/drivers/base/power/opp/opp.h @@ -20,6 +20,7 @@ #include <linux/list.h> #include <linux/limits.h> #include <linux/pm_opp.h> +#include <linux/pm_qos.h> #include <linux/notifier.h>
struct clk; @@ -58,6 +59,7 @@ extern struct list_head opp_tables; * @dynamic: not-created from static DT entries. * @turbo: true if turbo (boost) OPP * @suspend: true if suspend OPP + * @domain_perf_state: Performance state of power domain * @rate: Frequency in hertz * @supplies: Power supplies voltage/current values * @clock_latency_ns: Latency (in nanoseconds) of switching to this OPP's @@ -76,6 +78,7 @@ struct dev_pm_opp { bool dynamic; bool turbo; bool suspend; + unsigned int domain_perf_state; unsigned long rate;
struct dev_pm_opp_supply *supplies; @@ -137,6 +140,12 @@ enum opp_table_access { * @regulator_count: Number of power supply regulators * @set_opp: Platform specific set_opp callback * @set_opp_data: Data to be passed to set_opp callback + * @is_domain: True if the device node contains "#power-domain-cells" property + * @has_domain: True if the device node contains "power-domain" property + * @has_domain_perf_states: Can have value of 0, 1 or -1. -1 means uninitialized + * state, 0 means that OPP nodes don't have perf states and 1 means that OPP + * nodes have perf states. + * @qos_request: Qos request. * @dentry: debugfs dentry pointer of the real device directory (not links). * @dentry_name: Name of the real dentry. * @@ -174,6 +183,11 @@ struct opp_table { int (*set_opp)(struct dev_pm_set_opp_data *data); struct dev_pm_set_opp_data *set_opp_data;
+ bool is_domain; + bool has_domain; + int has_domain_perf_states; + struct dev_pm_qos_request qos_request; + #ifdef CONFIG_DEBUG_FS struct dentry *dentry; char dentry_name[NAME_MAX];
This patch adds dev_pm_opp_find_dps() helper to get the OPP node for a domain-performance-state. This helper is only supported for tables representing power domains.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/base/power/opp/core.c | 66 +++++++++++++++++++++++++++++++++++++++++++ include/linux/pm_opp.h | 8 ++++++ 2 files changed, 74 insertions(+)
diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c index c435acb21a47..212f11d65790 100644 --- a/drivers/base/power/opp/core.c +++ b/drivers/base/power/opp/core.c @@ -503,6 +503,72 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev, } EXPORT_SYMBOL_GPL(dev_pm_opp_find_freq_floor);
+/** + * dev_pm_opp_find_dps() - search for an exact domain-performance-state + * @dev: device for which we do this operation + * @dps: domain-performance-state + * @available: true/false - match for available opp + * + * Return: Searches for exact match in the opp table and returns pointer to the + * matching opp if found, else returns ERR_PTR in case of error and should + * be handled using IS_ERR. Error return values can be: + * EINVAL: for bad pointer + * ERANGE: no match found for search + * ENODEV: if device not found in list of registered devices + * + * Note: available is a modifier for the search. if available=true, then the + * match is for exact matching domain-performance-state and is available in the + * stored OPP table. if false, the match is for exact domain-performance-state + * which is not available. + * + * This provides a mechanism to enable an opp which is not available currently + * or the opposite as well. + * + * The callers are required to call dev_pm_opp_put() for the returned OPP after + * use. + */ +struct dev_pm_opp *dev_pm_opp_find_dps(struct device *dev, unsigned int dps, + bool available) +{ + struct opp_table *opp_table; + struct dev_pm_opp *temp_opp, *opp = ERR_PTR(-EINVAL); + + opp_table = _find_opp_table(dev); + if (IS_ERR(opp_table)) { + int r = PTR_ERR(opp_table); + + dev_err(dev, "%s: OPP table not found (%d)\n", __func__, r); + return ERR_PTR(r); + } + + /* This API is only supported for tables representing power domains */ + if (WARN_ON(!opp_table->is_domain)) + goto put_table; + + opp = ERR_PTR(-ERANGE); + + mutex_lock(&opp_table->lock); + + list_for_each_entry(temp_opp, &opp_table->opp_list, node) { + if (temp_opp->available == available && + temp_opp->domain_perf_state == dps) { + opp = temp_opp; + + /* Increment the reference count of OPP */ + dev_pm_opp_get(opp); + break; + } + } + + mutex_unlock(&opp_table->lock); + +put_table: + dev_pm_opp_put_opp_table(opp_table); + + return opp; +} +EXPORT_SYMBOL_GPL(dev_pm_opp_find_dps); + static int _set_opp_voltage(struct device *dev, struct regulator *reg, struct dev_pm_opp_supply *supply) { diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h index a6685b3dde26..11d3ff4de4b0 100644 --- a/include/linux/pm_opp.h +++ b/include/linux/pm_opp.h @@ -102,6 +102,8 @@ struct dev_pm_opp *dev_pm_opp_find_freq_floor(struct device *dev,
struct dev_pm_opp *dev_pm_opp_find_freq_ceil(struct device *dev, unsigned long *freq); +struct dev_pm_opp *dev_pm_opp_find_dps(struct device *dev, unsigned int freq, + bool available); void dev_pm_opp_put(struct dev_pm_opp *opp);
int dev_pm_opp_add(struct device *dev, unsigned long freq, @@ -194,6 +196,12 @@ static inline struct dev_pm_opp *dev_pm_opp_find_freq_ceil(struct device *dev, return ERR_PTR(-ENOTSUPP); }
+static inline struct dev_pm_opp *dev_pm_opp_find_dps(struct device *dev, + unsigned int freq, bool available) +{ + return ERR_PTR(-ENOTSUPP); +} + static inline void dev_pm_opp_put(struct dev_pm_opp *opp) {}
static inline int dev_pm_opp_add(struct device *dev, unsigned long freq,
Some platforms have the capability to configure the performance state of their Power Domains. The performance levels are identified by positive integer values, a lower value represents lower performance state. The power domain driver should be able to retrieve all information required to configure the performance state of the power domain, with the help of the performance constraint's target value.
This patch implements performance state management in PM domain core. The performance QOS uses the common QOS notifier list and we call __performance_notifier() if the notifier is issued for performance constraint.
This also allows the power domain drivers to implement a ->set_performance_state() callback, which will be called by the power domain core from within the notifier routine. If a domain doesn't implement ->set_performance_state() callback, then it is assumed that its parents are responsible for performance state configuration. Both devices and sub-domains are accounted for while finding the highest performance state requested.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/base/power/domain.c | 77 +++++++++++++++++++++++++++++++++++++++++++++ include/linux/pm_domain.h | 4 +++ 2 files changed, 81 insertions(+)
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c index 6e4e22aa14a2..03dd7a61f08a 100644 --- a/drivers/base/power/domain.c +++ b/drivers/base/power/domain.c @@ -452,6 +452,79 @@ static int __resume_latency_notifier(struct generic_pm_domain_data *gpd_data, return NOTIFY_DONE; }
+static void __update_domain_performance_state(struct generic_pm_domain *genpd, + int depth) +{ + struct generic_pm_domain_data *pd_data; + struct generic_pm_domain *subdomain; + struct pm_domain_data *pdd; + unsigned int state = 0; + struct gpd_link *link; + + /* Traverse all devices within the domain */ + list_for_each_entry(pdd, &genpd->dev_list, list_node) { + pd_data = to_gpd_data(pdd); + + if (pd_data->performance_state > state) + state = pd_data->performance_state; + } + + /* Traverse all subdomains within the domain */ + list_for_each_entry(link, &genpd->master_links, master_node) { + subdomain = link->slave; + + if (subdomain->performance_state > state) + state = subdomain->performance_state; + } + + if (genpd->performance_state == state) + return; + + genpd->performance_state = state; + + if (genpd->set_performance_state) { + genpd->set_performance_state(genpd, state); + return; + } + + /* Propagate to parent power domains */ + list_for_each_entry(link, &genpd->slave_links, slave_node) { + struct generic_pm_domain *master = link->master; + + genpd_lock_nested(master, depth + 1); + __update_domain_performance_state(master, depth + 1); + genpd_unlock(master); + } +} + +static int __performance_notifier(struct generic_pm_domain_data *gpd_data, + unsigned long val) +{ + struct generic_pm_domain *genpd = ERR_PTR(-ENODATA); + struct device *dev = gpd_data->base.dev; + struct pm_domain_data *pdd; + + spin_lock_irq(&dev->power.lock); + + pdd = dev->power.subsys_data ? + dev->power.subsys_data->domain_data : NULL; + + if (pdd && pdd->dev) + genpd = dev_to_genpd(dev); + + spin_unlock_irq(&dev->power.lock); + + if (IS_ERR(genpd)) + return NOTIFY_DONE; + + genpd_lock(genpd); + gpd_data->performance_state = val; + __update_domain_performance_state(genpd, 0); + genpd_unlock(genpd); + + return NOTIFY_DONE; +} + static int genpd_dev_pm_qos_notifier(struct notifier_block *nb, unsigned long val, void *ptr) { @@ -464,6 +537,9 @@ static int genpd_dev_pm_qos_notifier(struct notifier_block *nb, if (dev_pm_qos_notifier_is_resume_latency(dev, ptr)) return __resume_latency_notifier(gpd_data, val);
+ if (dev_pm_qos_notifier_is_performance(dev, ptr)) + return __performance_notifier(gpd_data, val); + dev_err(dev, "%s: Unexpected notifier call\n", __func__); return NOTIFY_BAD; } @@ -1157,6 +1233,7 @@ static struct generic_pm_domain_data *genpd_alloc_dev_data(struct device *dev, gpd_data->td.constraint_changed = true; gpd_data->td.effective_constraint_ns = -1; gpd_data->nb.notifier_call = genpd_dev_pm_qos_notifier; + gpd_data->performance_state = 0;
spin_lock_irq(&dev->power.lock);
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h index 5339ed5bd6f9..83795935709e 100644 --- a/include/linux/pm_domain.h +++ b/include/linux/pm_domain.h @@ -62,8 +62,11 @@ struct generic_pm_domain { unsigned int device_count; /* Number of devices */ unsigned int suspended_count; /* System suspend device counter */ unsigned int prepared_count; /* Suspend counter of prepared devices */ + unsigned int performance_state; /* Max requested performance state */ int (*power_off)(struct generic_pm_domain *domain); int (*power_on)(struct generic_pm_domain *domain); + int (*set_performance_state)(struct generic_pm_domain *domain, + unsigned int state); struct gpd_dev_ops dev_ops; s64 max_off_time_ns; /* Maximum allowed "suspended" time. */ bool max_off_time_changed; @@ -117,6 +120,7 @@ struct generic_pm_domain_data { struct pm_domain_data base; struct gpd_timing_data td; struct notifier_block nb; + unsigned int performance_state; };
#ifdef CONFIG_PM_GENERIC_DOMAINS
Some platforms have the capability to configure the performance state of their Power Domains. The performance levels are identified by positive integer values, a lower value represents lower performance state. The power domain driver should be able to retrieve all information required to configure the performance state of the power domain, with the help of the performance constraint's target value.
This patch implements performance state management in PM domain core. The performance QOS uses the common QOS notifier list and we call __performance_notifier() if the notifier is issued for performance constraint.
This also allows the power domain drivers to implement a ->set_performance_state() callback, which will be called by the power domain core from within the notifier routine. If a domain doesn't implement ->set_performance_state() callback, then it is assumed that its parents are responsible for performance state configuration. Both devices and sub-domains are accounted for while finding the highest performance state requested.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- V4->V5: - drop "notifier" from dev_pm_qos_notifier_is_performance
drivers/base/power/domain.c | 77 +++++++++++++++++++++++++++++++++++++++++++++ include/linux/pm_domain.h | 4 +++ 2 files changed, 81 insertions(+)
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c index f6f616ac5cc2..7d35dafe8c97 100644 --- a/drivers/base/power/domain.c +++ b/drivers/base/power/domain.c @@ -462,6 +462,79 @@ static int genpd_latency_notifier(struct generic_pm_domain_data *gpd_data, return NOTIFY_DONE; }
+static void __update_domain_performance_state(struct generic_pm_domain *genpd, + int depth) +{ + struct generic_pm_domain_data *pd_data; + struct generic_pm_domain *subdomain; + struct pm_domain_data *pdd; + unsigned int state = 0; + struct gpd_link *link; + + /* Traverse all devices within the domain */ + list_for_each_entry(pdd, &genpd->dev_list, list_node) { + pd_data = to_gpd_data(pdd); + + if (pd_data->performance_state > state) + state = pd_data->performance_state; + } + + /* Traverse all subdomains within the domain */ + list_for_each_entry(link, &genpd->master_links, master_node) { + subdomain = link->slave; + + if (subdomain->performance_state > state) + state = subdomain->performance_state; + } + + if (genpd->performance_state == state) + return; + + genpd->performance_state = state; + + if (genpd->set_performance_state) { + genpd->set_performance_state(genpd, state); + return; + } + + /* Propagate to parent power domains */ + list_for_each_entry(link, &genpd->slave_links, slave_node) { + struct generic_pm_domain *master = link->master; + + genpd_lock_nested(master, depth + 1); + __update_domain_performance_state(master, depth + 1); + genpd_unlock(master); + } +} + +static int __performance_notifier(struct generic_pm_domain_data *gpd_data, + unsigned long val) +{ + struct generic_pm_domain *genpd = ERR_PTR(-ENODATA); + struct device *dev = gpd_data->base.dev; + struct pm_domain_data *pdd; + + spin_lock_irq(&dev->power.lock); + + pdd = dev->power.subsys_data ? + dev->power.subsys_data->domain_data : NULL; + + if (pdd && pdd->dev) + genpd = dev_to_genpd(dev); + + spin_unlock_irq(&dev->power.lock); + + if (IS_ERR(genpd)) + return NOTIFY_DONE; + + genpd_lock(genpd); + gpd_data->performance_state = val; + __update_domain_performance_state(genpd, 0); + genpd_unlock(genpd); + + return NOTIFY_DONE; +} + static int genpd_dev_pm_qos_notifier(struct notifier_block *nb, unsigned long val, void *ptr) { @@ -474,6 +547,9 @@ static int genpd_dev_pm_qos_notifier(struct notifier_block *nb, if (dev_pm_qos_is_resume_latency(dev, ptr)) return genpd_latency_notifier(gpd_data, val);
+ if (dev_pm_qos_is_performance(dev, ptr)) + return __performance_notifier(gpd_data, val); + dev_err(dev, "%s: Unexpected notifier call\n", __func__); return NOTIFY_BAD; } @@ -1168,6 +1244,7 @@ static struct generic_pm_domain_data *genpd_alloc_dev_data(struct device *dev, gpd_data->td.constraint_changed = true; gpd_data->td.effective_constraint_ns = -1; gpd_data->nb.notifier_call = genpd_dev_pm_qos_notifier; + gpd_data->performance_state = 0;
spin_lock_irq(&dev->power.lock);
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h index b7803a251044..84ee474e66d0 100644 --- a/include/linux/pm_domain.h +++ b/include/linux/pm_domain.h @@ -63,8 +63,11 @@ struct generic_pm_domain { unsigned int device_count; /* Number of devices */ unsigned int suspended_count; /* System suspend device counter */ unsigned int prepared_count; /* Suspend counter of prepared devices */ + unsigned int performance_state; /* Max requested performance state */ int (*power_off)(struct generic_pm_domain *domain); int (*power_on)(struct generic_pm_domain *domain); + int (*set_performance_state)(struct generic_pm_domain *domain, + unsigned int state); struct gpd_dev_ops dev_ops; s64 max_off_time_ns; /* Maximum allowed "suspended" time. */ bool max_off_time_changed; @@ -118,6 +121,7 @@ struct generic_pm_domain_data { struct pm_domain_data base; struct gpd_timing_data td; struct notifier_block nb; + unsigned int performance_state; void *data; };
The power-domain core would be using the OPP core going forward and the OPP core has a basic requirement of a device structure for its working.
Add a struct device to the genpd structure and also add a genpd bus type for the devices.
Note that the of_node field of the device is only set when separate DT node is present for the power-domain, otherwise the of node is common across multiple genpd devices and filling the of_node field with it doesn't sound right.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/base/power/domain.c | 36 ++++++++++++++++++++++++++++++++++++ include/linux/pm_domain.h | 1 + 2 files changed, 37 insertions(+)
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c index 03dd7a61f08a..51d3afc0476d 100644 --- a/drivers/base/power/domain.c +++ b/drivers/base/power/domain.c @@ -1536,6 +1536,10 @@ static void genpd_lock_init(struct generic_pm_domain *genpd) } }
+static struct bus_type genpd_bus_type = { + .name = "genpd", +}; + /** * pm_genpd_init - Initialize a generic I/O PM domain object. * @genpd: PM domain object to initialize. @@ -1588,6 +1592,18 @@ int pm_genpd_init(struct generic_pm_domain *genpd, return ret; }
+ genpd->dev.bus = &genpd_bus_type; + device_initialize(&genpd->dev); + dev_set_name(&genpd->dev, "%s", genpd->name); + + ret = device_add(&genpd->dev); + if (ret) { + dev_err(&genpd->dev, "failed to add device: %d\n", ret); + put_device(&genpd->dev); + kfree(genpd->free); + return ret; + } + mutex_lock(&gpd_list_lock); list_add(&genpd->gpd_list_node, &gpd_list); mutex_unlock(&gpd_list_lock); @@ -1625,6 +1641,7 @@ static int genpd_remove(struct generic_pm_domain *genpd)
list_del(&genpd->gpd_list_node); genpd_unlock(genpd); + device_del(&genpd->dev); cancel_work_sync(&genpd->power_off_work); kfree(genpd->free); pr_debug("%s: removed %s\n", __func__, genpd->name); @@ -1794,6 +1811,7 @@ int of_genpd_add_provider_simple(struct device_node *np, if (!ret) { genpd->provider = &np->fwnode; genpd->has_provider = true; + genpd->dev->of_node = np; } }
@@ -2407,3 +2425,21 @@ static void __exit pm_genpd_debug_exit(void) } __exitcall(pm_genpd_debug_exit); #endif /* CONFIG_DEBUG_FS */ + +static int __init pm_genpd_core_init(void) +{ + int ret; + + ret = bus_register(&genpd_bus_type); + if (ret) + pr_err("bus_register failed (%d)\n", ret); + + return ret; +} +pure_initcall(pm_genpd_core_init); + +static void __exit pm_genpd_core_exit(void) +{ + bus_unregister(&genpd_bus_type); +} +__exitcall(pm_genpd_core_exit); diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h index 83795935709e..d55c0112dcde 100644 --- a/include/linux/pm_domain.h +++ b/include/linux/pm_domain.h @@ -47,6 +47,7 @@ struct genpd_power_state { struct genpd_lock_ops;
struct generic_pm_domain { + struct device dev; struct dev_pm_domain domain; /* PM domain operations */ struct list_head gpd_list_node; /* Node in the global PM domains list */ struct list_head master_links; /* Links with PM domain as a master */
Parse the OPP table from of_genpd_add_provider_simple() by calling dev_pm_opp_of_add_table(), if the power domain supports changing of performance states.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/base/power/domain.c | 46 +++++++++++++++++++++++++++++++++++++-------- include/linux/pm_domain.h | 1 + 2 files changed, 39 insertions(+), 8 deletions(-)
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c index 51d3afc0476d..8155f95b0db2 100644 --- a/drivers/base/power/domain.c +++ b/drivers/base/power/domain.c @@ -10,6 +10,7 @@ #include <linux/kernel.h> #include <linux/io.h> #include <linux/platform_device.h> +#include <linux/pm_opp.h> #include <linux/pm_runtime.h> #include <linux/pm_domain.h> #include <linux/pm_qos.h> @@ -1806,15 +1807,37 @@ int of_genpd_add_provider_simple(struct device_node *np,
mutex_lock(&gpd_list_lock);
- if (pm_genpd_present(genpd)) { - ret = genpd_add_provider(np, genpd_xlate_simple, genpd); - if (!ret) { - genpd->provider = &np->fwnode; - genpd->has_provider = true; - genpd->dev->of_node = np; + if (!pm_genpd_present(genpd)) + goto unlock; + + genpd->dev.of_node = np; + + /* Parse genpd OPP table */ + if (genpd->set_performance_state) { + ret = dev_pm_opp_of_add_table(&genpd->dev); + if (ret) { + dev_err(&genpd->dev, "Failed to add OPP table: %d\n", + ret); + goto unlock; + } + + genpd->has_opp_table = true; + } + + ret = genpd_add_provider(np, genpd_xlate_simple, genpd); + if (ret) { + if (genpd->has_opp_table) { + genpd->has_opp_table = false; + dev_pm_opp_of_remove_table(&genpd->dev); } + + goto unlock; }
+ genpd->provider = &np->fwnode; + genpd->has_provider = true; + +unlock: mutex_unlock(&gpd_list_lock);
return ret; @@ -1887,10 +1910,17 @@ void of_genpd_del_provider(struct device_node *np) * provider, set the 'has_provider' to false * so that the PM domain can be safely removed. */ - list_for_each_entry(gpd, &gpd_list, gpd_list_node) - if (gpd->provider == &np->fwnode) + list_for_each_entry(gpd, &gpd_list, gpd_list_node) { + if (gpd->provider == &np->fwnode) { gpd->has_provider = false;
+ if (!gpd->has_opp_table) + continue; + + dev_pm_opp_of_remove_table(&gpd->dev); + } + } + list_del(&cp->link); of_node_put(cp->node); kfree(cp); diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h index d55c0112dcde..821d7cf5974b 100644 --- a/include/linux/pm_domain.h +++ b/include/linux/pm_domain.h @@ -57,6 +57,7 @@ struct generic_pm_domain { struct work_struct power_off_work; struct fwnode_handle *provider; /* Identity of the domain provider */ bool has_provider; + bool has_opp_table; const char *name; atomic_t sd_count; /* Number of subdomains with power "on" */ enum gpd_status status; /* Current state of the domain */
On 20-03-17, 15:02, Viresh Kumar wrote:
Hi,
The main feedback I got for the V3 series came from Kevin, who suggested that we should reuse OPP tables for genpd devices as well, instead of creating a new table type. And that's what this version is trying to do.
Some platforms have the capability to configure the performance state of their power domains. The process of configuring the performance state is pretty much platform dependent and we may need to work with a wide range of configurables. For some platforms, like Qcom, it can be a positive integer value alone, while in other cases it can be voltage levels, etc.
The power-domain framework until now was only designed for the idle state management of the device and this needs to change in order to reuse the power-domain framework for active state management of the devices.
Hi Ulf/Kevin,
Over 3 weeks since the time this version was posted :( Can we get some reviews of this stuff and decide on how we are supposed to proceed on this ?
Its getting delayed a lot unnecessarily. Thanks.
linaro-kernel@lists.linaro.org