Hi,
Here is the 6th version of the series, which incorporates feedback from Kevin and Sudeep:
- Use freq/voltage in OPP table as it is for power domain and don't create
  "domain-performance-level" property
- Take care of domain providers that provide multiple domains
Here is a brief summary of the problem I am trying to solve.
Some platforms have the capability to configure the performance state of their power domains. The process of configuring the performance state is pretty much platform dependent and we may need to work with a wide range of configurables. For some platforms, like Qcom, it can be a positive integer value alone, while in other cases it can be voltage levels, etc.
The power-domain framework until now was only designed for the idle state management of the device and this needs to change in order to reuse the power-domain framework for active state management of the devices.
This series adapts the genpd and OPP frameworks to allow OPP tables to be used for the genpd devices as well.
The first 2 patches update the DT bindings of the power-domains and OPP tables. And the other 7 patches implement the details in QoS, genpd and OPP frameworks.
This series is currently tested by hacking the kernel a bit to add virtual power-domains on the dual-A15 Exynos platform. An earlier version of the patches was also tested by Rajendra Nayak (Qcom) on *real* Qualcomm hardware, for which this work is being done. Hopefully this version works there as well.
Here is sample DT and C code we need to write for platforms:
DT:
---

/ {
	domain_opp_table: opp_table0 {
		compatible = "operating-points-v2";

		domain_opp_1: opp-1 {
			opp-hz = /bits/ 64 <1>;
			opp-microvolt = <975000 970000 985000>;
		};
		domain_opp_2: opp-2 {
			opp-hz = /bits/ 64 <2>;
			opp-microvolt = <1075000 1000000 1085000>;
		};
	};

	foo_domain: power-controller@12340000 {
		compatible = "foo,power-controller";
		reg = <0x12340000 0x1000>;
		#power-domain-cells = <0>;
		operating-points-v2 = <&domain_opp_table>;
	};

	cpu0_opp_table: opp_table1 {
		compatible = "operating-points-v2";
		opp-shared;

		opp-1000000000 {
			opp-hz = /bits/ 64 <1000000000>;
			power-domain-opp = <&domain_opp_1>;
		};
		opp-1100000000 {
			opp-hz = /bits/ 64 <1100000000>;
			power-domain-opp = <&domain_opp_2>;
		};
		opp-1200000000 {
			opp-hz = /bits/ 64 <1200000000>;
			power-domain-opp = <&domain_opp_2>;
		};
	};

	cpus {
		#address-cells = <1>;
		#size-cells = <0>;

		cpu@0 {
			compatible = "arm,cortex-a9";
			reg = <0>;
			clocks = <&clk_controller 0>;
			clock-names = "cpu";
			operating-points-v2 = <&cpu0_opp_table>;
			power-domains = <&foo_domain>;
		};
	};
};
Driver code:
------------

/* The single test domain registered below. */
static struct generic_pm_domain pd;

static int pd_performance(struct generic_pm_domain *domain, unsigned int state)
{
	struct dev_pm_opp *opp;

	/* The domain's "opp-hz" value (an index here) is the lookup key. */
	opp = dev_pm_opp_find_freq_exact(&domain->dev, state, true);
	if (IS_ERR(opp))
		return PTR_ERR(opp);

	/* Use OPP and state in platform specific way */

	/* Drop the reference taken by the lookup. */
	dev_pm_opp_put(opp);
	return 0;
}

static const struct of_device_id pm_domain_of_match[] __initconst = {
	/* Must match the compatible of the power-controller node in DT. */
	{ .compatible = "foo,power-controller", },
	{ },
};

static int __init genpd_test_init(void)
{
	struct device *dev = get_cpu_device(0);
	struct device_node *np;
	const struct of_device_id *match;
	int ret;

	for_each_matching_node_and_match(np, pm_domain_of_match, &match) {
		pd.name = kstrdup_const(strrchr(np->full_name, '/') + 1,
					GFP_KERNEL);
		if (!pd.name) {
			of_node_put(np);
			return -ENOMEM;
		}

		pd.set_performance_state = pd_performance;

		pm_genpd_init(&pd, NULL, false);
		of_genpd_add_provider_simple(np, &pd);
	}

	ret = dev_pm_domain_attach(dev, false);

	return ret;
}
Pushed here as well:
https://git.linaro.org/people/viresh.kumar/linux.git/log/?h=opp/genpd-perfor...
V5->V6:
- Use freq/voltage in OPP table as it is for power domain and don't create
  "domain-performance-level" property
- Create new "power-domain-opp" property for the devices.
- Take care of domain providers that provide multiple domains and extend
  "operating-points-v2" property to contain a list of phandles
- Update code according to those bindings.

V4->V5:
- Only 3 patches were resent and 2 of them are Acked from Ulf.

V3->V4:
- Use OPP table for genpd devices as well.
- Add struct device to genpd, in order to reuse OPP infrastructure.
- Based over: https://marc.info/?l=linux-kernel&m=148972988002317&w=2
- Fixed examples in DT document to have voltage in target,min,max order.

V2->V3:
- Based over latest pm/linux-next
- Bindings and code are merged together
- Lots of updates in bindings
  - the performance-states node is present within the power-domain now,
    instead of its phandle.
  - performance-level property is replaced by "reg".
  - domain-performance-state property of the consumers contains an integer
    value now instead of phandle.
- Lots of updates to the code as well
  - Patch "PM / QOS: Add default case to the switch" is merged with other
    patches and the code is changed a bit as well.
  - Don't pass 'type' to dev_pm_qos_add_notifier(), rather handle all
    notifiers with a single list. A new patch is added for that.
  - The OPP framework patch can be applied now and has proper SoB from me.
  - Dropped "PM / domain: Save/restore performance state at runtime
    suspend/resume".
  - Drop all WARN().
  - Tested-by Rajendra Nayak.

V1->V2:
- Based over latest pm/linux-next
- It is mostly a resend of what was sent earlier, as this series hasn't got
  any reviews so far and Rafael suggested that it's better I resend it.
- Only the 4/6 patch got an update, which was shared earlier as a reply to V1
  as well. It has got several fixes for taking care of power domain
  hierarchy, etc.
--
viresh

Viresh Kumar (9):
  PM / OPP: Introduce "power-domain-opp" property
  PM / Domains: Allow OPP table to be used for power-domains
  PM / QOS: Keep common notifier list for genpd constraints
  PM / QOS: Add DEV_PM_QOS_PERFORMANCE request
  PM / OPP: Add support to parse "power-domain-opp" property
  PM / OPP: Implement dev_pm_opp_of_add_table_indexed()
  PM / domain: Register PM QOS performance notifier
  PM / Domain: Add struct device to genpd
  PM / Domain: Add support to parse domain's OPP table

 Documentation/devicetree/bindings/opp/opp.txt  |  74 ++++++-
 .../devicetree/bindings/power/power_domain.txt | 106 ++++++++++
 Documentation/power/pm_qos_interface.txt       |   2 +-
 drivers/base/power/domain.c                    | 222 +++++++++++++++++++--
 drivers/base/power/opp/core.c                  |  72 +++++++
 drivers/base/power/opp/debugfs.c               |   3 +
 drivers/base/power/opp/of.c                    | 123 +++++++++++-
 drivers/base/power/opp/opp.h                   |  12 ++
 drivers/base/power/qos.c                       |  36 +++-
 include/linux/pm_domain.h                      |   6 +
 include/linux/pm_opp.h                         |   6 +
 include/linux/pm_qos.h                         |  16 ++
 kernel/power/qos.c                             |   2 +-
 13 files changed, 641 insertions(+), 39 deletions(-)
Power-domains need to express their active states in DT and the devices within the power-domain need to express their dependency on those active states. The power-domains can use the OPP tables without any modifications to the bindings.
Add a new property "power-domain-opp", which contains a phandle to the OPP node of the parent power domain. This is required for devices that depend on the configured active state of the power domain in order to work.
On some platforms the actual frequency and voltages of the power domains are managed by the firmware and are therefore hidden from the high-level operating system. The "opp-hz" property is relaxed a bit so that it can contain indexes instead of actual frequency values to support such platforms.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 Documentation/devicetree/bindings/opp/opp.txt | 74 ++++++++++++++++++++++++++-
 1 file changed, 73 insertions(+), 1 deletion(-)
diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
index 63725498bd20..6e30cae2a936 100644
--- a/Documentation/devicetree/bindings/opp/opp.txt
+++ b/Documentation/devicetree/bindings/opp/opp.txt
@@ -77,7 +77,10 @@ This defines voltage-current-frequency combinations along with other related
 properties.
 
 Required properties:
-- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
+- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer. In some
+  cases the exact frequency in Hz may be hidden from the OS by the firmware and
+  this field may contain values that represent the frequency in a firmware
+  dependent way, for example an index of an array in the firmware.
 
 Optional properties:
 - opp-microvolt: voltage in micro Volts.
@@ -154,6 +157,13 @@ properties.
 
 - status: Marks the node enabled/disabled.
 
+- power-domain-opp: Phandle to the OPP node of the parent power-domain. The
+  parent power-domain should be configured to the OPP whose node is pointed by
+  the phandle, in order to configure the device for the OPP node that contains
+  this property. The order in which the device and power domain should be
+  configured is implementation defined. The OPP table of a device can set this
+  property only if the device node contains "power-domains" property.
+
 Example 1: Single cluster Dual-core ARM cortex A9, switch DVFS states together.
 
 / {
@@ -528,3 +538,65 @@ Example 5: opp-supported-hw
 		};
 	};
 };
+
+Example 7: Power domains with their own OPP tables:
+(example: For 1GHz device require domain state 1 and for 1.1 & 1.2 GHz device require state 2)
+
+/ {
+	domain_opp_table: opp_table0 {
+		compatible = "operating-points-v2";
+
+		/*
+		 * NOTE: Actual frequency is managed by firmware and is hidden
+		 * from HLOS, so we simply use index in the opp-hz field to
+		 * select the OPP.
+		 */
+		domain_opp_1: opp-1 {
+			opp-hz = /bits/ 64 <1>;
+			opp-microvolt = <975000 970000 985000>;
+		};
+		domain_opp_2: opp-2 {
+			opp-hz = /bits/ 64 <2>;
+			opp-microvolt = <1075000 1000000 1085000>;
+		};
+	};
+
+	foo_domain: power-controller@12340000 {
+		compatible = "foo,power-controller";
+		reg = <0x12340000 0x1000>;
+		#power-domain-cells = <0>;
+		operating-points-v2 = <&domain_opp_table>;
+	};
+
+	cpu0_opp_table: opp_table1 {
+		compatible = "operating-points-v2";
+		opp-shared;
+
+		opp-1000000000 {
+			opp-hz = /bits/ 64 <1000000000>;
+			power-domain-opp = <&domain_opp_1>;
+		};
+		opp-1100000000 {
+			opp-hz = /bits/ 64 <1100000000>;
+			power-domain-opp = <&domain_opp_2>;
+		};
+		opp-1200000000 {
+			opp-hz = /bits/ 64 <1200000000>;
+			power-domain-opp = <&domain_opp_2>;
+		};
+	};
+
+	cpus {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		cpu@0 {
+			compatible = "arm,cortex-a9";
+			reg = <0>;
+			clocks = <&clk_controller 0>;
+			clock-names = "cpu";
+			operating-points-v2 = <&cpu0_opp_table>;
+			power-domains = <&foo_domain>;
+		};
+	};
+};
On Wed, Apr 26, 2017 at 04:27:05PM +0530, Viresh Kumar wrote:
> diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
> index 63725498bd20..6e30cae2a936 100644
> --- a/Documentation/devicetree/bindings/opp/opp.txt
> +++ b/Documentation/devicetree/bindings/opp/opp.txt
> @@ -77,7 +77,10 @@ This defines voltage-current-frequency combinations along with other related
>  properties.
>
>  Required properties:
> -- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
> +- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer. In some
> +  cases the exact frequency in Hz may be hidden from the OS by the firmware and
> +  this field may contain values that represent the frequency in a firmware
> +  dependent way, for example an index of an array in the firmware.
Not really sure OPP binding makes sense here. What about all the other properties. We expose voltage, but not freq?
>  Optional properties:
>  - opp-microvolt: voltage in micro Volts.
> @@ -154,6 +157,13 @@ properties.
>
>  - status: Marks the node enabled/disabled.
>
> +- power-domain-opp: Phandle to the OPP node of the parent power-domain. The
> +  parent power-domain should be configured to the OPP whose node is pointed by
> +  the phandle, in order to configure the device for the OPP node that contains
> +  this property. The order in which the device and power domain should be
> +  configured is implementation defined. The OPP table of a device can set this
> +  property only if the device node contains "power-domains" property.
I don't even know what to say on this. The continual evolution of OPP bindings continues. This seems like further abuse of DT power-domains (being a region in a chip that can be powergated) with Linux PM domains.
Rob
On 28/04/17 21:48, Rob Herring wrote:
> On Wed, Apr 26, 2017 at 04:27:05PM +0530, Viresh Kumar wrote:
>> diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
>> index 63725498bd20..6e30cae2a936 100644
>> --- a/Documentation/devicetree/bindings/opp/opp.txt
>> +++ b/Documentation/devicetree/bindings/opp/opp.txt
>> @@ -77,7 +77,10 @@ This defines voltage-current-frequency combinations along with other related
>>  properties.
>>
>>  Required properties:
>> -- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
>> +- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer. In some
>> +  cases the exact frequency in Hz may be hidden from the OS by the firmware and
>> +  this field may contain values that represent the frequency in a firmware
>> +  dependent way, for example an index of an array in the firmware.
>
> Not really sure OPP binding makes sense here. What about all the other
> properties. We expose voltage, but not freq?
I completely agree with that, and I have been pushing for this to be represented as just regulators [0]. Mark B seems to dislike that idea [1].
Sudeep Holla <sudeep.holla@arm.com> writes:
> On 28/04/17 21:48, Rob Herring wrote:
>> On Wed, Apr 26, 2017 at 04:27:05PM +0530, Viresh Kumar wrote:
>>> diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
>>> index 63725498bd20..6e30cae2a936 100644
>>> --- a/Documentation/devicetree/bindings/opp/opp.txt
>>> +++ b/Documentation/devicetree/bindings/opp/opp.txt
>>> @@ -77,7 +77,10 @@ This defines voltage-current-frequency combinations along with other related
>>>  properties.
>>>
>>>  Required properties:
>>> -- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
>>> +- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer. In some
>>> +  cases the exact frequency in Hz may be hidden from the OS by the firmware and
>>> +  this field may contain values that represent the frequency in a firmware
>>> +  dependent way, for example an index of an array in the firmware.
>>
>> Not really sure OPP binding makes sense here. What about all the other
>> properties. We expose voltage, but not freq?
>
> I completely agree with that, and I have been pushing for this to be
> represented as just regulators [0]. Mark B seems to dislike that idea [1].
And Mark is right, because what's being described is not (simply) a voltage regulator. While it might be "just" voltage on some SoCs (for now), it is clearly about performance (a.k.a. OPP) on others.
Kevin
On 06/05/17 10:39, Kevin Hilman wrote:
> Sudeep Holla <sudeep.holla@arm.com> writes:
>> On 28/04/17 21:48, Rob Herring wrote:
>>> On Wed, Apr 26, 2017 at 04:27:05PM +0530, Viresh Kumar wrote:
>>>> diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
>>>> index 63725498bd20..6e30cae2a936 100644
>>>> --- a/Documentation/devicetree/bindings/opp/opp.txt
>>>> +++ b/Documentation/devicetree/bindings/opp/opp.txt
>>>> @@ -77,7 +77,10 @@ This defines voltage-current-frequency combinations along with other related
>>>>  properties.
>>>>
>>>>  Required properties:
>>>> -- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
>>>> +- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer. In some
>>>> +  cases the exact frequency in Hz may be hidden from the OS by the firmware and
>>>> +  this field may contain values that represent the frequency in a firmware
>>>> +  dependent way, for example an index of an array in the firmware.
>>>
>>> Not really sure OPP binding makes sense here. What about all the other
>>> properties. We expose voltage, but not freq?
>>
>> I completely agree with that, and I have been pushing for this to be
>> represented as just regulators [0]. Mark B seems to dislike that idea [1].
>
> And Mark is right, because what's being described is not (simply) a
> voltage regulator. While it might be "just" voltage on some SoCs (for
> now), it is clearly about performance (a.k.a. OPP) on others.
Agreed. What I was against in this particular case was that it was just voltage for the domain while the devices had their own OPPs with clocks described, which looks really weird when both are represented as OPPs.
I am fine with the OPP representation in all such cases, provided the bindings are well defined, especially if they are hierarchical: what takes precedence, etc.
On 03-05-17, 12:29, Sudeep Holla wrote:
> On 28/04/17 21:48, Rob Herring wrote:
>> On Wed, Apr 26, 2017 at 04:27:05PM +0530, Viresh Kumar wrote:
>>> diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
>>> index 63725498bd20..6e30cae2a936 100644
>>> --- a/Documentation/devicetree/bindings/opp/opp.txt
>>> +++ b/Documentation/devicetree/bindings/opp/opp.txt
>>> @@ -77,7 +77,10 @@ This defines voltage-current-frequency combinations along with other related
>>>  properties.
>>>
>>>  Required properties:
>>> -- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
>>> +- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer. In some
>>> +  cases the exact frequency in Hz may be hidden from the OS by the firmware and
>>> +  this field may contain values that represent the frequency in a firmware
>>> +  dependent way, for example an index of an array in the firmware.
>>
>> Not really sure OPP binding makes sense here. What about all the other
>> properties. We expose voltage, but not freq?
>
> I completely agree with that, and I have been pushing for this to be
> represented as just regulators [0]. Mark B seems to dislike that idea [1].
Just as an update, Rajendra confirmed (offline) that for some of the implementations, the microcontroller handles both the frequency and the voltages of a device. So it isn't just a regulator anymore and, as Kevin and I were saying, we need a complete OPP here.
On 08/05/17 08:13, Viresh Kumar wrote:
> On 03-05-17, 12:29, Sudeep Holla wrote:
>> On 28/04/17 21:48, Rob Herring wrote:
>>> On Wed, Apr 26, 2017 at 04:27:05PM +0530, Viresh Kumar wrote:
>>>> diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
>>>> index 63725498bd20..6e30cae2a936 100644
>>>> --- a/Documentation/devicetree/bindings/opp/opp.txt
>>>> +++ b/Documentation/devicetree/bindings/opp/opp.txt
>>>> @@ -77,7 +77,10 @@ This defines voltage-current-frequency combinations along with other related
>>>>  properties.
>>>>
>>>>  Required properties:
>>>> -- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
>>>> +- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer. In some
>>>> +  cases the exact frequency in Hz may be hidden from the OS by the firmware and
>>>> +  this field may contain values that represent the frequency in a firmware
>>>> +  dependent way, for example an index of an array in the firmware.
>>>
>>> Not really sure OPP binding makes sense here. What about all the other
>>> properties. We expose voltage, but not freq?
>>
>> I completely agree with that, and I have been pushing for this to be
>> represented as just regulators [0]. Mark B seems to dislike that idea [1].
>
> Just as an update, Rajendra confirmed (offline) that for some of the
> implementations, the microcontroller handles both the frequency and the
> voltages of a device. So it isn't just a regulator anymore and, as Kevin
> and I were saying, we need a complete OPP here.
Yes, I followed the thread and figured that out. But Rajendra also raised: "What if the microcontroller firmware maps the performance-index to voltage but expects Linux to scale the frequency? There is no way to specify a performance-index *and* a frequency for an OPP now I guess?" So this needs to be addressed now, IIUC.
So as Kevin pointed out, we need to experiment and look at all possibilities before finalizing the bindings. It would be better to have examples for all of these and to describe how the bindings are to be used, including how to distinguish between these use cases from the bindings if it's not implicit.
On 08-05-17, 14:57, Sudeep Holla wrote:
> Yes, I followed the thread and figured that out. But Rajendra also raised:
> "What if the microcontroller firmware maps the performance-index to
> voltage but expects Linux to scale the frequency? There is no way to
> specify a performance-index *and* a frequency for an OPP now I guess?"
> So this needs to be addressed now, IIUC.
No, he misunderstood it. He was saying that the domain needs a performance-index and the device needs freq-scaling, how do we do that? He thought that there will be just one OPP table for the device here, but we will actually have two and that would work.
> So as Kevin pointed out, we need to experiment and look at all
> possibilities before finalizing the bindings. It would be better to have
> examples for all of these and to describe how the bindings are to be used,
> including how to distinguish between these use cases from the bindings if
> it's not implicit.
Yeah, I have some doubts on how we are going to implement that and looking for more input from him.
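The two-table arrangement described above (one OPP table for the device, a second one for its power domain) can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not kernel code: the struct names, the tables and required_domain_opp() are hypothetical, and the values are copied from the DT example in the cover letter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a domain OPP node; not a kernel structure. */
struct domain_opp {
	uint64_t index;		/* value carried in the domain's "opp-hz" */
	uint32_t microvolt;	/* target voltage from "opp-microvolt" */
};

/* Hypothetical model of a device OPP node. */
struct device_opp {
	uint64_t freq_hz;			/* real frequency in "opp-hz" */
	const struct domain_opp *required;	/* models "power-domain-opp" */
};

/* Table 1: the power domain, keyed by a firmware-defined index. */
static const struct domain_opp domain_table[] = {
	{ .index = 1, .microvolt = 975000 },
	{ .index = 2, .microvolt = 1075000 },
};

/* Table 2: the device, keyed by frequency, pointing into table 1. */
static const struct device_opp cpu_table[] = {
	{ 1000000000ULL, &domain_table[0] },
	{ 1100000000ULL, &domain_table[1] },
	{ 1200000000ULL, &domain_table[1] },
};

/* Return the domain OPP needed for an exact device frequency, or NULL. */
static const struct domain_opp *required_domain_opp(uint64_t freq_hz)
{
	size_t i;

	for (i = 0; i < sizeof(cpu_table) / sizeof(cpu_table[0]); i++) {
		if (cpu_table[i].freq_hz == freq_hz) {
			/* Every device OPP in this model names a domain OPP. */
			assert(cpu_table[i].required != NULL);
			return cpu_table[i].required;
		}
	}
	return NULL;
}
```

With these tables, the device scales its own frequency while a lookup such as required_domain_opp(1200000000ULL) yields the domain entry with index 2, so the performance index and the frequency live in separate tables.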
Rob Herring <robh@kernel.org> writes:
> On Wed, Apr 26, 2017 at 04:27:05PM +0530, Viresh Kumar wrote:
>> diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
>> index 63725498bd20..6e30cae2a936 100644
>> --- a/Documentation/devicetree/bindings/opp/opp.txt
>> +++ b/Documentation/devicetree/bindings/opp/opp.txt
>> @@ -77,7 +77,10 @@ This defines voltage-current-frequency combinations along with other related
>>  properties.
>>
>>  Required properties:
>> -- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
>> +- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer. In some
>> +  cases the exact frequency in Hz may be hidden from the OS by the firmware and
>> +  this field may contain values that represent the frequency in a firmware
>> +  dependent way, for example an index of an array in the firmware.
>
> Not really sure OPP binding makes sense here.
I think OPP makes perfect sense here, because microcontroller firmware is managing OPPs in hardware. We just may not know the exact voltage and/or frequency (and the firmware/hardware may even be doing AVS for micro-adjustments).
> What about all the other properties. We expose voltage, but not freq?
I had the same question. Seems the same comment about an abstract "index" is needed for voltage also.
>>  Optional properties:
>>  - opp-microvolt: voltage in micro Volts.
>> @@ -154,6 +157,13 @@ properties.
>>
>>  - status: Marks the node enabled/disabled.
>>
>> +- power-domain-opp: Phandle to the OPP node of the parent power-domain. The
>> +  parent power-domain should be configured to the OPP whose node is pointed by
>> +  the phandle, in order to configure the device for the OPP node that contains
>> +  this property. The order in which the device and power domain should be
>> +  configured is implementation defined. The OPP table of a device can set this
>> +  property only if the device node contains "power-domains" property.
I do understand the need to map a device OPP to a parent power-domain OPP, but I really don't like another phandle.
First, just because a device OPP changes does not mean that a power-domain OPP has to change. What really needs to be specified is a minimum requirement, not an exact OPP. IOW, if a device changes OPP, the power-domain OPP has to be *at least* an OPP that can guarantee that level of performance, but could also be a more performant OPP, right?
Also, the parent power-domain driver will have a list of all its devices, and be able to get OPPs from those devices.
IMO, we should do the first (few) implementations of this feature from the power-domain driver itself, and not try to figure out how to define this for everyone in DT until we have a better handle on it (pun intended) ;)
> I don't even know what to say on this. The continual evolution of OPP
> bindings continues. This seems like further abuse of DT power-domains
> (being a region in a chip that can be powergated) with Linux PM domains.
Agreed.
Kevin
On 06-05-17, 11:58, Kevin Hilman wrote:
> Rob Herring <robh@kernel.org> writes:
>> On Wed, Apr 26, 2017 at 04:27:05PM +0530, Viresh Kumar wrote:
>>> diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
>>> index 63725498bd20..6e30cae2a936 100644
>>> --- a/Documentation/devicetree/bindings/opp/opp.txt
>>> +++ b/Documentation/devicetree/bindings/opp/opp.txt
>>> @@ -77,7 +77,10 @@ This defines voltage-current-frequency combinations along with other related
>>>  properties.
>>>
>>>  Required properties:
>>> -- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
>>> +- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer. In some
>>> +  cases the exact frequency in Hz may be hidden from the OS by the firmware and
>>> +  this field may contain values that represent the frequency in a firmware
>>> +  dependent way, for example an index of an array in the firmware.
>>
>> Not really sure OPP binding makes sense here.
>
> I think OPP makes perfect sense here, because microcontroller firmware is
> managing OPPs in hardware. We just may not know the exact voltage and/or
> frequency (and the firmware/hardware may even be doing AVS for
> micro-adjustments).
Yes, AVS is being done for the Qcom SoC as well.
What about all the other properties. We expose voltage, but not freq?
I had the same question. Seems the same comment about an abstract "index" is needed for voltage also.
Why should we do that? Here are the cases that I had in mind while writing this:
- DT only contains the performance-index and nothing else (i.e. voltages aren't exposed).
We wouldn't be required to fill the microvolt property as it is optional.
- DT contains both performance-index and voltages.
The microvolts property will contain the actual voltages and opp-hz will contain the index.
I don't see why we would want to put some index value in the microvolts property. We are setting the index value in the opp-hz property to avoid adding extra fields and to make sure opp-hz is still the unique property for the nodes.
Optional properties:
- opp-microvolt: voltage in micro Volts.
@@ -154,6 +157,13 @@ properties.
- status: Marks the node enabled/disabled.
+- power-domain-opp: Phandle to the OPP node of the parent power-domain. The
+  parent power-domain should be configured to the OPP whose node is pointed by
+  the phandle, in order to configure the device for the OPP node that contains
+  this property. The order in which the device and power domain should be
+  configured is implementation defined. The OPP table of a device can set this
+  property only if the device node contains "power-domains" property.
I do understand the need to map a device OPP to a parent power-domain OPP, but I really don't like another phandle.
First, just because a device OPP changes does not mean that a power-domain OPP has to change. What really needs to be specified is a minimum requirement, not an exact OPP. IOW, if a device changes OPP, the power-domain OPP has to be *at least* an OPP that can guarantee that level of performance, but could also be a more performant OPP, right?
Right and that's how the code is interpreting it right now. Yes, the description above should have been more clear on that though.
Also, the parent power-domain driver will have a list of all its devices, and be able to get OPPs from those devices.
IMO, we should do the first (few) implementations of this feature from the power-domain driver itself, and not try to figure out how to define this for everyone in DT until we have a better handle on it (pun intended) ;)
Hmm, I am not sure how things are going to work in that case. The opp-hz value read from the phandle is passed to the QoS framework in this series, which makes sure that we select the highest requested performance point for a particular power-domain. The index value is required to be present with the OPP framework to make it all work, at least based on the way I have designed it for now.
On 05/08/2017 09:45 AM, Viresh Kumar wrote:
On 06-05-17, 11:58, Kevin Hilman wrote:
Rob Herring robh@kernel.org writes:
On Wed, Apr 26, 2017 at 04:27:05PM +0530, Viresh Kumar wrote:
Power-domains need to express their active states in DT and the devices within the power-domain need to express their dependency on those active states. The power-domains can use the OPP tables without any modifications to the bindings.
Add a new property "power-domain-opp", which will contain phandle to the OPP node of the parent power domain. This is required for devices which have dependency on the configured active state of the power domain for their working.
For some platforms the actual frequency and voltages of the power domains are managed by the firmware and are so hidden from the high level operating system. The "opp-hz" property is relaxed a bit to contain indexes instead of actual frequency values to support such platforms.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
Documentation/devicetree/bindings/opp/opp.txt | 74 ++++++++++++++++++++++++++- 1 file changed, 73 insertions(+), 1 deletion(-)
diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
index 63725498bd20..6e30cae2a936 100644
--- a/Documentation/devicetree/bindings/opp/opp.txt
+++ b/Documentation/devicetree/bindings/opp/opp.txt
@@ -77,7 +77,10 @@ This defines voltage-current-frequency combinations along with other related
 properties.

 Required properties:
-- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
+- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer. In some
+  cases the exact frequency in Hz may be hidden from the OS by the firmware and
+  this field may contain values that represent the frequency in a firmware
+  dependent way, for example an index of an array in the firmware.
Not really sure OPP binding makes sense here.
I think OPP makes perfect sense here, because microcontroller firmware is managing OPPs in hardware. We just may not know the exact voltage and/or frequency (and the firmware/hardware may even be doing AVS for micro-adjustments.)
Yes, AVS is being done for the Qcom SoC as well.
What about all the other properties. We expose voltage, but not freq?
I had the same question. Seems the same comment about an abstract "index" is needed for voltage also.
Why should we do that? Here are the cases that I had in mind while writing this:
DT only contains the performance-index and nothing else (i.e. voltages aren't exposed).
We wouldn't be required to fill the microvolt property as it is optional.
So the performance-index is specified in the opp-hz property? What if the microcontroller firmware maps the performance-index to a voltage but expects Linux to scale the frequency? I guess there is now no way to specify a performance-index *and* a frequency for an OPP?
DT contains both performance-index and voltages.
The microvolts property will contain the actual voltages and opp-hz will contain the index.
So this is for cases where the performance-index maps to a frequency managed by the microcontroller and voltages managed by Linux? I have the exact opposite case and I don't see how to handle it with these bindings.
I don't see why we would want to put some index value in the microvolts property. We are setting the index value in the opp-hz property to avoid adding extra fields and to make sure opp-hz is still the unique property for the nodes.
Maybe to handle the case like what I described above?
I had a long chat with Rajendra offline and clarified a few things.
On 08-05-17, 11:06, Rajendra Nayak wrote:
On 05/08/2017 09:45 AM, Viresh Kumar wrote:
On 06-05-17, 11:58, Kevin Hilman wrote:
I had the same question. Seems the same comment about an abstract "index" is needed for voltage also.
Why should we do that? Here are the cases that I had in mind while writing this:
DT only contains the performance-index and nothing else (i.e. voltages aren't exposed).
We wouldn't be required to fill the microvolt property as it is optional.
So the performance-index is specified in opp-hz property?
Yes, but in the OPP table of the power-domain and not the device. The device can still have its own OPP table with normal freq/voltage values (for a separate regulator).
What if the microcontroller firmware maps the performance-index to voltage but expects linux to scale the frequency?
As you clarified on the chat, you were talking about the device here. It isn't a problem as we will have two separate tables here, one for the device and one for the domain.
Viresh Kumar viresh.kumar@linaro.org writes:
On 06-05-17, 11:58, Kevin Hilman wrote:
Rob Herring robh@kernel.org writes:
On Wed, Apr 26, 2017 at 04:27:05PM +0530, Viresh Kumar wrote:
Power-domains need to express their active states in DT and the devices within the power-domain need to express their dependency on those active states. The power-domains can use the OPP tables without any modifications to the bindings.
Add a new property "power-domain-opp", which will contain phandle to the OPP node of the parent power domain. This is required for devices which have dependency on the configured active state of the power domain for their working.
For some platforms the actual frequency and voltages of the power domains are managed by the firmware and are so hidden from the high level operating system. The "opp-hz" property is relaxed a bit to contain indexes instead of actual frequency values to support such platforms.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
Documentation/devicetree/bindings/opp/opp.txt | 74 ++++++++++++++++++++++++++- 1 file changed, 73 insertions(+), 1 deletion(-)
diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
index 63725498bd20..6e30cae2a936 100644
--- a/Documentation/devicetree/bindings/opp/opp.txt
+++ b/Documentation/devicetree/bindings/opp/opp.txt
@@ -77,7 +77,10 @@ This defines voltage-current-frequency combinations along with other related
 properties.

 Required properties:
-- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer.
+- opp-hz: Frequency in Hz, expressed as a 64-bit big-endian integer. In some
+  cases the exact frequency in Hz may be hidden from the OS by the firmware and
+  this field may contain values that represent the frequency in a firmware
+  dependent way, for example an index of an array in the firmware.
Not really sure OPP binding makes sense here.
I think OPP makes perfect sense here, because microcontroller firmware is managing OPPs in hardware. We just may not know the exact voltage and/or frequency (and the firmware/hardware may even be doing AVS for micro-adjustments.)
Yes, AVS is being done for the Qcom SoC as well.
What about all the other properties. We expose voltage, but not freq?
I had the same question. Seems the same comment about an abstract "index" is needed for voltage also.
Why should we do that?
For starters, because the lack of it looks very strange upon first read (notice that both Rob and I pointed that out), and because you didn't explain why in the first place, it draws attention.
Here are the cases that I had in mind while writing this:
DT only contains the performance-index and nothing else (i.e. voltages aren't exposed).
We wouldn't be required to fill the microvolt property as it is optional.
DT contains both performance-index and voltages.
The microvolts property will contain the actual voltages and opp-hz will contain the index.
I don't see why we would want to put some index value in the microvolts property. We are setting the index value in the opp-hz property to avoid adding extra fields and to make sure opp-hz is still the unique property for the nodes.
What about the case where firmware wants exact frequencies, and microvolts property is just an index?
The point is, you have a very specific SoC and use-case in mind, but the goal of a binding change like this is to make something that could be generically useful.
Optional properties:
- opp-microvolt: voltage in micro Volts.
@@ -154,6 +157,13 @@ properties.
- status: Marks the node enabled/disabled.
+- power-domain-opp: Phandle to the OPP node of the parent power-domain. The
+  parent power-domain should be configured to the OPP whose node is pointed by
+  the phandle, in order to configure the device for the OPP node that contains
+  this property. The order in which the device and power domain should be
+  configured is implementation defined. The OPP table of a device can set this
+  property only if the device node contains "power-domains" property.
I do understand the need to map a device OPP to a parent power-domain OPP, but I really don't like another phandle.
First, just because a device OPP changes does not mean that a power-domain OPP has to change. What really needs to be specified is a minimum requirement, not an exact OPP. IOW, if a device changes OPP, the power-domain OPP has to be *at least* an OPP that can guarantee that level of performance, but could also be a more performant OPP, right?
Right and that's how the code is interpreting it right now. Yes, the description above should have been more clear on that though.
Also, the parent power-domain driver will have a list of all its devices, and be able to get OPPs from those devices.
IMO, we should do the first (few) implementations of this feature from the power-domain driver itself, and not try to figure out how to define this for everyone in DT until we have a better handle on it (pun intended) ;)
Hmm, I am not sure how things are going to work in that case. The opp-hz value read from the phandle is passed to the QoS framework in this series, which makes sure that we select the highest requested performance point for a particular power-domain. The index value is required to be present with the OPP framework to make it all work, at least based on the way I have designed it for now.
IMO, this kind of dependency isn't the job of the OPP framework, it's the job of the power-domain governor.
Kevin
On 12 May 2017 at 20:29, Kevin Hilman khilman@baylibre.com wrote:
Viresh Kumar viresh.kumar@linaro.org writes:
Why should we do that?
For starters, because the lack of it looks very strange upon first read (notice that both Rob and I pointed that out), and because you didn't explain why in the first place, it draws attention.
:)
I don't see why we would want to put some index value in the microvolts property. We are setting the index value in the opp-hz property to avoid adding extra fields and to make sure opp-hz is still the unique property for the nodes.
What about the case where firmware wants exact frequencies, and microvolts property is just an index?
The point is, you have a very specific SoC and use-case in mind, but the goal of a binding change like this is to make something that could be generically useful.
I agree, but I am not sure we will have such a case in the near future, at least. Wouldn't it be wise to leave opp-microvolt untouched for now and update it only when needed? It's not a big change anyway.
Hmm, I am not sure how things are going to work in that case. The opp-hz value read from the phandle is passed to the QoS framework in this series, which makes sure that we select the highest requested performance point for a particular power-domain. The index value is required to be present with the OPP framework to make it all work, at least based on the way I have designed it for now.
IMO, this kind of dependency isn't the job of the OPP framework, it's the job of the power-domain governor.
Okay. So the way it will work with the current suggestions is:
- OPP framework gets a DVFS update request for device X.
- OPP framework finds that the device has a power-domain, so it asks the power-domain framework to set the device in a particular state corresponding to the OPP (if we are going to a higher OPP).
- If the power-domain supports state selection, it does that; otherwise it returns an error. (Actually we can optimize this by asking the genpd initially if state selection is possible; only then does the OPP core call the genpd API.)
- The genpd API will manage a list of all devices in the domain (which it already does) and also the states selected for them. It finds the max of the requested states and selects that.
- Note that the QoS framework isn't in the picture anymore.
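The aggregation step in that flow could be modelled roughly as below. This is a user-space sketch only, not genpd code: struct perf_domain, domain_set_dev_state() and MAX_DEVS are hypothetical names, and a state of 0 is assumed to mean "no request".

```c
#include <stddef.h>

#define MAX_DEVS 8	/* hypothetical limit, for illustration only */

/* Hypothetical model of a domain that tracks one requested state per device. */
struct perf_domain {
	int supports_states;			/* can the domain select a state? */
	unsigned int requested[MAX_DEVS];	/* per-device request, 0 = none */
	unsigned int current_state;		/* aggregated state selected last */
};

/*
 * Record the state requested on behalf of device 'dev_idx' and select the
 * maximum of all requests for the domain. Returns 0 on success, -1 if the
 * domain cannot select states or the index is out of range.
 */
static int domain_set_dev_state(struct perf_domain *pd, size_t dev_idx,
				unsigned int state)
{
	unsigned int max = 0;
	size_t i;

	if (!pd->supports_states || dev_idx >= MAX_DEVS)
		return -1;

	pd->requested[dev_idx] = state;

	for (i = 0; i < MAX_DEVS; i++)
		if (pd->requested[i] > max)
			max = pd->requested[i];

	pd->current_state = max;
	return 0;
}
```

With two devices in the domain, a request for state 2 from one and state 1 from the other leaves the domain at state 2; once the first device drops its request, the domain falls back to state 1.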
Will that be fine ?
-- viresh
Update the power-domain bindings to allow "operating-points-v2" to be present within the power-domain's provider node.
Also allow consumer devices that don't use OPP tables, to specify the parent power-domain's OPP node in their "power-domain-opp" property.
Also note that the "operating-points-v2" property is extended to support an array for the power domain providers.
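As a rough illustration of how a provider with several domains might pick the right table, here is a small user-space sketch. It assumes the entries of "operating-points-v2" are listed in the same order as the domain indexes used in the consumers' "power-domains" specifiers; that ordering, struct opp_table_ref and provider_opp_table() are assumptions made for this example and are not defined by the binding text.

```c
#include <stddef.h>

/* Hypothetical handle for an OPP table; in DT this would be a phandle. */
struct opp_table_ref {
	const char *label;
};

/*
 * Pick the OPP table for the domain selected by 'domain_idx', the index that
 * a consumer passes in its "power-domains" specifier. Returns NULL when the
 * index is out of range.
 */
static const struct opp_table_ref *
provider_opp_table(const struct opp_table_ref *tables, size_t count,
		   size_t domain_idx)
{
	if (domain_idx >= count)
		return NULL;

	return &tables[domain_idx];
}
```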
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
---
 .../devicetree/bindings/power/power_domain.txt | 106 +++++++++++++++++++++
 1 file changed, 106 insertions(+)

diff --git a/Documentation/devicetree/bindings/power/power_domain.txt b/Documentation/devicetree/bindings/power/power_domain.txt
index 14bd9e945ff6..730af0afc09a 100644
--- a/Documentation/devicetree/bindings/power/power_domain.txt
+++ b/Documentation/devicetree/bindings/power/power_domain.txt
@@ -40,6 +40,10 @@ phandle arguments (so called PM domain specifiers) of length specified by the
   domain's idle states. In the absence of this property, the domain would be
   considered as capable of being powered-on or powered-off.

+- operating-points-v2 : Phandles to the OPP tables for a power domain provider.
+  If the provider provides a single power domain, then this shall contain a
+  single phandle. Refer to ../opp/opp.txt for more information.
+
 Example:

 	power: power-controller@12340000 {
@@ -120,4 +124,106 @@ The node above defines a typical PM domain consumer device, which is located
 inside a PM domain with index 0 of a power controller represented by a node
 with the label "power".

+Optional properties:
+- power-domain-opp: Phandle to the OPP node of the parent power-domain. The
+  parent power-domain should be configured to the OPP whose node is pointed by
+  the phandle, in order to use the device that contains this property.
+
+
+Example:
+- Device with parent power domain with two active states represented by OPP
+  table.
+
+	domain_opp_table: opp_table {
+		compatible = "operating-points-v2";
+
+		/*
+		 * NOTE: Actual frequency is managed by firmware and is hidden
+		 * from HLOS, so we simply use index in the opp-hz field to
+		 * select the OPP.
+		 */
+		domain_opp_1: opp-1 {
+			opp-hz = /bits/ 64 <1>;
+			opp-microvolt = <975000 970000 985000>;
+		};
+		domain_opp_2: opp-2 {
+			opp-hz = /bits/ 64 <2>;
+			opp-microvolt = <1075000 1000000 1085000>;
+		};
+	};
+
+
+	parent: power-controller@12340000 {
+		compatible = "foo,power-controller";
+		reg = <0x12340000 0x1000>;
+		#power-domain-cells = <0>;
+		operating-points-v2 = <&domain_opp_table>;
+	};
+
+	leaky-device@12350000 {
+		compatible = "foo,i-leak-current";
+		reg = <0x12350000 0x1000>;
+		power-domains = <&parent>;
+		power-domain-opp = <&domain_opp_2>;
+	};
+
+- OPP table for domain provider that provides two domains.
+
+	domain0_opp_table: opp_table0 {
+		compatible = "operating-points-v2";
+
+		/*
+		 * NOTE: Actual frequency is managed by firmware and is hidden
+		 * from HLOS, so we simply use index in the opp-hz field to
+		 * select the OPP.
+		 */
+		domain0_opp_1: opp-1 {
+			opp-hz = /bits/ 64 <1>;
+			opp-microvolt = <975000 970000 985000>;
+		};
+		domain0_opp_2: opp-2 {
+			opp-hz = /bits/ 64 <2>;
+			opp-microvolt = <1075000 1000000 1085000>;
+		};
+	};
+
+	domain1_opp_table: opp_table1 {
+		compatible = "operating-points-v2";
+
+		/*
+		 * NOTE: Actual frequency is managed by firmware and is hidden
+		 * from HLOS, so we simply use index in the opp-hz field to
+		 * select the OPP.
+		 */
+		domain1_opp_1: opp-1 {
+			opp-hz = /bits/ 64 <1>;
+			opp-microvolt = <975000 970000 985000>;
+		};
+		domain1_opp_2: opp-2 {
+			opp-hz = /bits/ 64 <2>;
+			opp-microvolt = <1075000 1000000 1085000>;
+		};
+	};
+
+	parent: power-controller@12340000 {
+		compatible = "foo,power-controller";
+		reg = <0x12340000 0x1000>;
+		#power-domain-cells = <1>;
+		operating-points-v2 = <&domain0_opp_table>, <&domain1_opp_table>;
+	};
+
+	leaky-device0@12350000 {
+		compatible = "foo,i-leak-current";
+		reg = <0x12350000 0x1000>;
+		power-domains = <&parent 0>;
+		power-domain-opp = <&domain0_opp_2>;
+	};
+
+	leaky-device1@12350000 {
+		compatible = "foo,i-leak-current";
+		reg = <0x12350000 0x1000>;
+		power-domains = <&parent 1>;
+		power-domain-opp = <&domain1_opp_2>;
+	};
+
 [1]. Documentation/devicetree/bindings/power/domain-idle-state.txt
Only the resume_latency constraint uses the notifiers right now. In order to prepare for adding new constraint types with notifiers, move to a common notifier list.
Update pm_qos_update_target() to pass a pointer to the constraint structure to the notifier callbacks. Also update the notifier callbacks as well to error out for unexpected constraints.
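The dispatch idea can be shown with a small user-space model: one callback on the shared list receives a pointer to the constraint that changed and compares it against the address of the constraint it understands. The struct and function names below (struct constraint, struct dev_qos, qos_notifier()) and the return codes are simplified stand-ins invented for this sketch, not the kernel definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct constraint {
	int target_value;
};

struct dev_qos {
	struct constraint resume_latency;
	struct constraint latency_tolerance;
};

/* Illustrative return codes, not the kernel's NOTIFY_* values. */
enum { NOTIFY_HANDLED = 0, NOTIFY_REJECTED = -1 };

/* Identify the constraint by comparing addresses. */
static int is_resume_latency(const struct dev_qos *qos,
			     const struct constraint *c)
{
	return &qos->resume_latency == c;
}

/*
 * One callback serves the shared notifier list; 'c' says which constraint
 * changed. Anything other than resume latency is rejected.
 */
static int qos_notifier(const struct dev_qos *qos, const struct constraint *c)
{
	if (is_resume_latency(qos, c))
		return NOTIFY_HANDLED;

	return NOTIFY_REJECTED;
}
```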
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
Acked-by: Ulf Hansson ulf.hansson@linaro.org
---
 drivers/base/power/domain.c | 26 +++++++++++++++++++-------
 drivers/base/power/qos.c    | 15 ++++-----------
 include/linux/pm_qos.h      |  7 +++++++
 kernel/power/qos.c          |  2 +-
 4 files changed, 31 insertions(+), 19 deletions(-)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index da49a8383dc3..f6f616ac5cc2 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -426,14 +426,10 @@ static int genpd_power_on(struct generic_pm_domain *genpd, unsigned int depth)
 	return ret;
 }

-static int genpd_dev_pm_qos_notifier(struct notifier_block *nb,
-				     unsigned long val, void *ptr)
+static int genpd_latency_notifier(struct generic_pm_domain_data *gpd_data,
+				  unsigned long val)
 {
-	struct generic_pm_domain_data *gpd_data;
-	struct device *dev;
-
-	gpd_data = container_of(nb, struct generic_pm_domain_data, nb);
-	dev = gpd_data->base.dev;
+	struct device *dev = gpd_data->base.dev;

 	for (;;) {
 		struct generic_pm_domain *genpd;
@@ -466,6 +462,22 @@ static int genpd_dev_pm_qos_notifier(struct notifier_block *nb,
 	return NOTIFY_DONE;
 }

+static int genpd_dev_pm_qos_notifier(struct notifier_block *nb,
+				     unsigned long val, void *ptr)
+{
+	struct generic_pm_domain_data *gpd_data;
+	struct device *dev;
+
+	gpd_data = container_of(nb, struct generic_pm_domain_data, nb);
+	dev = gpd_data->base.dev;
+
+	if (dev_pm_qos_is_resume_latency(dev, ptr))
+		return genpd_latency_notifier(gpd_data, val);
+
+	dev_err(dev, "%s: Unexpected notifier call\n", __func__);
+	return NOTIFY_BAD;
+}
+
 /**
  * genpd_power_off_work_fn - Power off PM domain whose subdomain count is 0.
  * @work: Work structure used for scheduling the execution of this function.
diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c
index f850daeffba4..654d8a12c2e7 100644
--- a/drivers/base/power/qos.c
+++ b/drivers/base/power/qos.c
@@ -172,18 +172,12 @@ static int dev_pm_qos_constraints_allocate(struct device *dev)
 {
 	struct dev_pm_qos *qos;
 	struct pm_qos_constraints *c;
-	struct blocking_notifier_head *n;

 	qos = kzalloc(sizeof(*qos), GFP_KERNEL);
 	if (!qos)
 		return -ENOMEM;

-	n = kzalloc(sizeof(*n), GFP_KERNEL);
-	if (!n) {
-		kfree(qos);
-		return -ENOMEM;
-	}
-	BLOCKING_INIT_NOTIFIER_HEAD(n);
+	BLOCKING_INIT_NOTIFIER_HEAD(&qos->notifiers);

 	c = &qos->resume_latency;
 	plist_head_init(&c->list);
@@ -191,7 +185,7 @@ static int dev_pm_qos_constraints_allocate(struct device *dev)
 	c->default_value = PM_QOS_RESUME_LATENCY_DEFAULT_VALUE;
 	c->no_constraint_value = PM_QOS_RESUME_LATENCY_DEFAULT_VALUE;
 	c->type = PM_QOS_MIN;
-	c->notifiers = n;
+	c->notifiers = &qos->notifiers;

 	c = &qos->latency_tolerance;
 	plist_head_init(&c->list);
@@ -268,7 +262,6 @@ void dev_pm_qos_constraints_destroy(struct device *dev)
 	dev->power.qos = ERR_PTR(-ENODEV);
 	spin_unlock_irq(&dev->power.lock);

-	kfree(qos->resume_latency.notifiers);
 	kfree(qos);

 out:
@@ -487,7 +480,7 @@ int dev_pm_qos_add_notifier(struct device *dev, struct notifier_block *notifier)
 		ret = dev_pm_qos_constraints_allocate(dev);

 	if (!ret)
-		ret = blocking_notifier_chain_register(dev->power.qos->resume_latency.notifiers,
+		ret = blocking_notifier_chain_register(&dev->power.qos->notifiers,
 						       notifier);

 	mutex_unlock(&dev_pm_qos_mtx);
@@ -514,7 +507,7 @@ int dev_pm_qos_remove_notifier(struct device *dev,

 	/* Silently return if the constraints object is not present. */
 	if (!IS_ERR_OR_NULL(dev->power.qos))
-		retval = blocking_notifier_chain_unregister(dev->power.qos->resume_latency.notifiers,
+		retval = blocking_notifier_chain_unregister(&dev->power.qos->notifiers,
 							    notifier);

 	mutex_unlock(&dev_pm_qos_mtx);
diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h
index 032b55909145..e546d1a2f237 100644
--- a/include/linux/pm_qos.h
+++ b/include/linux/pm_qos.h
@@ -100,6 +100,7 @@ struct dev_pm_qos {
 	struct dev_pm_qos_request *resume_latency_req;
 	struct dev_pm_qos_request *latency_tolerance_req;
 	struct dev_pm_qos_request *flags_req;
+	struct blocking_notifier_head notifiers; /* common for all constraints */
 };

 /* Action requested to pm_qos_update_target */
@@ -114,6 +115,12 @@ static inline int dev_pm_qos_request_active(struct dev_pm_qos_request *req)
 	return req->dev != NULL;
 }

+static inline bool dev_pm_qos_is_resume_latency(struct device *dev,
+						struct pm_qos_constraints *c)
+{
+	return &dev->power.qos->resume_latency == c;
+}
+
 int pm_qos_update_target(struct pm_qos_constraints *c, struct plist_node *node,
 			 enum pm_qos_req_action action, int value);
 bool pm_qos_update_flags(struct pm_qos_flags *pqf,
diff --git a/kernel/power/qos.c b/kernel/power/qos.c
index 97b0df71303e..073324e0c3c8 100644
--- a/kernel/power/qos.c
+++ b/kernel/power/qos.c
@@ -315,7 +315,7 @@ int pm_qos_update_target(struct pm_qos_constraints *c, struct plist_node *node,
 		if (c->notifiers)
 			blocking_notifier_call_chain(c->notifiers,
 						     (unsigned long)curr_value,
-						     NULL);
+						     c);
 	} else {
 		ret = 0;
 	}
Some platforms have the capability to configure the performance state of their Power Domains. The performance levels are identified by positive integer values, a lower value represents lower performance state. The power domain driver should be able to retrieve all information required to configure the performance state of the power domain, with the help of the performance constraint's target value.
This patch adds a new QOS request type: DEV_PM_QOS_PERFORMANCE to support runtime performance constraints for the devices. Also allow notifiers to be registered against it, which will be used by frameworks like genpd.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
Acked-by: Ulf Hansson ulf.hansson@linaro.org
---
 Documentation/power/pm_qos_interface.txt |  2 +-
 drivers/base/power/qos.c                 | 21 +++++++++++++++++++++
 include/linux/pm_qos.h                   |  9 +++++++++
 3 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/Documentation/power/pm_qos_interface.txt b/Documentation/power/pm_qos_interface.txt
index 21d2d48f87a2..42870d28fc3c 100644
--- a/Documentation/power/pm_qos_interface.txt
+++ b/Documentation/power/pm_qos_interface.txt
@@ -168,7 +168,7 @@ The per-device PM QoS framework has a per-device notification tree.
 int dev_pm_qos_add_notifier(device, notifier):
 Adds a notification callback function for the device.
 The callback is called when the aggregated value of the device constraints list
-is changed (for resume latency device PM QoS only).
+is changed (for resume latency and performance device PM QoS).

 int dev_pm_qos_remove_notifier(device, notifier):
 Removes the notification callback function for the device.
diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c
index 654d8a12c2e7..084d26960dae 100644
--- a/drivers/base/power/qos.c
+++ b/drivers/base/power/qos.c
@@ -150,6 +150,10 @@ static int apply_constraint(struct dev_pm_qos_request *req,
 			req->dev->power.set_latency_tolerance(req->dev, value);
 		}
 		break;
+	case DEV_PM_QOS_PERFORMANCE:
+		ret = pm_qos_update_target(&qos->performance, &req->data.pnode,
+					   action, value);
+		break;
 	case DEV_PM_QOS_FLAGS:
 		ret = pm_qos_update_flags(&qos->flags, &req->data.flr,
 					  action, value);
@@ -194,6 +198,14 @@ static int dev_pm_qos_constraints_allocate(struct device *dev)
 	c->no_constraint_value = PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT;
 	c->type = PM_QOS_MIN;

+	c = &qos->performance;
+	plist_head_init(&c->list);
+	c->target_value = PM_QOS_PERFORMANCE_DEFAULT_VALUE;
+	c->default_value = PM_QOS_PERFORMANCE_DEFAULT_VALUE;
+	c->no_constraint_value = PM_QOS_PERFORMANCE_DEFAULT_VALUE;
+	c->type = PM_QOS_MAX;
+	c->notifiers = &qos->notifiers;
+
 	INIT_LIST_HEAD(&qos->flags.list);

 	spin_lock_irq(&dev->power.lock);
@@ -252,6 +264,11 @@ void dev_pm_qos_constraints_destroy(struct device *dev)
 		apply_constraint(req, PM_QOS_REMOVE_REQ, PM_QOS_DEFAULT_VALUE);
 		memset(req, 0, sizeof(*req));
 	}
+	c = &qos->performance;
+	plist_for_each_entry_safe(req, tmp, &c->list, data.pnode) {
+		apply_constraint(req, PM_QOS_REMOVE_REQ, PM_QOS_DEFAULT_VALUE);
+		memset(req, 0, sizeof(*req));
+	}
 	f = &qos->flags;
 	list_for_each_entry_safe(req, tmp, &f->list, data.flr.node) {
 		apply_constraint(req, PM_QOS_REMOVE_REQ, PM_QOS_DEFAULT_VALUE);
@@ -362,6 +379,7 @@ static int __dev_pm_qos_update_request(struct dev_pm_qos_request *req,
 	switch(req->type) {
 	case DEV_PM_QOS_RESUME_LATENCY:
 	case DEV_PM_QOS_LATENCY_TOLERANCE:
+	case DEV_PM_QOS_PERFORMANCE:
 		curr_value = req->data.pnode.prio;
 		break;
 	case DEV_PM_QOS_FLAGS:
@@ -571,6 +589,9 @@ static void __dev_pm_qos_drop_user_request(struct device *dev,
 		req = dev->power.qos->flags_req;
 		dev->power.qos->flags_req = NULL;
 		break;
+	case DEV_PM_QOS_PERFORMANCE:
+		dev_err(dev, "Invalid user request (performance)\n");
+		return;
 	}
 	__dev_pm_qos_remove_request(req);
 	kfree(req);
diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h
index e546d1a2f237..665f90face40 100644
--- a/include/linux/pm_qos.h
+++ b/include/linux/pm_qos.h
@@ -36,6 +36,7 @@ enum pm_qos_flags_status {
 #define PM_QOS_RESUME_LATENCY_DEFAULT_VALUE	0
 #define PM_QOS_LATENCY_TOLERANCE_DEFAULT_VALUE	0
 #define PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT	(-1)
+#define PM_QOS_PERFORMANCE_DEFAULT_VALUE	0
 #define PM_QOS_LATENCY_ANY			((s32)(~(__u32)0 >> 1))

 #define PM_QOS_FLAG_NO_POWER_OFF	(1 << 0)
@@ -55,6 +56,7 @@ struct pm_qos_flags_request {
 enum dev_pm_qos_req_type {
 	DEV_PM_QOS_RESUME_LATENCY = 1,
 	DEV_PM_QOS_LATENCY_TOLERANCE,
+	DEV_PM_QOS_PERFORMANCE,
 	DEV_PM_QOS_FLAGS,
 };

@@ -96,6 +98,7 @@ struct pm_qos_flags {
 struct dev_pm_qos {
 	struct pm_qos_constraints resume_latency;
 	struct pm_qos_constraints latency_tolerance;
+	struct pm_qos_constraints performance;
 	struct pm_qos_flags flags;
 	struct dev_pm_qos_request *resume_latency_req;
 	struct dev_pm_qos_request *latency_tolerance_req;
@@ -121,6 +124,12 @@ static inline bool dev_pm_qos_is_resume_latency(struct device *dev,
 	return &dev->power.qos->resume_latency == c;
 }

+static inline bool dev_pm_qos_is_performance(struct device *dev,
+					     struct pm_qos_constraints *c)
+{
+	return &dev->power.qos->performance == c;
+}
+
 int pm_qos_update_target(struct pm_qos_constraints *c, struct plist_node *node,
 			 enum pm_qos_req_action action, int value);
 bool pm_qos_update_flags(struct pm_qos_flags *pqf,
The devices can specify phandle to their power-domain's OPP node in their OPP nodes under the "power-domain-opp" property. This patch updates the OPP core to parse it.
The OPP nodes are allowed to have the "power-domain-opp" property, only if the device node contains the "power-domains" property. The OPP nodes aren't allowed to contain this property partially, i.e. Either all OPP nodes in the OPP table have the "power-domain-opp" property or none of them have it.
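The all-or-none rule can be checked with a tri-state flag, as in the user-space sketch below. It assumes the flag starts at -1 ("not decided yet") before the first node is parsed; that initial value, struct table_state and check_node() are assumptions made for this illustration.

```c
/* Tri-state: -1 = not decided yet, 0 = no node has it, 1 = every node has it. */
struct table_state {
	int has_domain_opp;
};

/*
 * Check one OPP node against the nodes seen so far. 'node_has_prop' is
 * non-zero when the node carries "power-domain-opp". Returns 0 when the node
 * is consistent with the rest of the table, -1 when the table is mixed.
 */
static int check_node(struct table_state *t, int node_has_prop)
{
	int want = node_has_prop ? 1 : 0;

	if (t->has_domain_opp == -1) {
		t->has_domain_opp = want;	/* the first node decides */
		return 0;
	}

	return t->has_domain_opp == want ? 0 : -1;
}
```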
The QoS framework represents the request values by s32 type variables and so we are forced to convert the 64 bit values read from DT (from the "opp-hz" property) into s32. It shouldn't be a problem unless someone uses real frequency values in "opp-hz" property for the power domains. A comment is added in the code to take a note of that. We can fix that later once we have real platforms that want it.
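For reference, a range-checked narrowing could look like the sketch below. This is not what the patch does (it uses a plain cast); opp_hz_to_s32() is a hypothetical helper shown only to illustrate where real frequency values would stop fitting.

```c
#include <stdint.h>

/*
 * Convert a 64-bit "opp-hz" value to the s32 range used by the QoS requests.
 * Returns 0 and stores the result in *out, or -1 if the value does not fit.
 */
static int opp_hz_to_s32(uint64_t rate, int32_t *out)
{
	if (rate > (uint64_t)INT32_MAX)
		return -1;

	*out = (int32_t)rate;
	return 0;
}
```

Small index values such as 1 or 2 pass unchanged, while a real frequency such as 3 GHz expressed in Hz is rejected because it exceeds INT32_MAX.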
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org --- drivers/base/power/opp/core.c | 72 +++++++++++++++++++++++++++++++++++++ drivers/base/power/opp/debugfs.c | 3 ++ drivers/base/power/opp/of.c | 77 ++++++++++++++++++++++++++++++++++++++++ drivers/base/power/opp/opp.h | 12 +++++++ 4 files changed, 164 insertions(+)
diff --git a/drivers/base/power/opp/core.c b/drivers/base/power/opp/core.c
index dae61720b314..dc8b7bc0061a 100644
--- a/drivers/base/power/opp/core.c
+++ b/drivers/base/power/opp/core.c
@@ -543,6 +543,62 @@ _generic_set_opp_clk_only(struct device *dev, struct clk *clk,
 	return ret;
 }
 
+static int _update_pm_qos_request(struct device *dev,
+				  struct dev_pm_qos_request *req, int perf)
+{
+	int ret;
+
+	if (likely(dev_pm_qos_request_active(req)))
+		ret = dev_pm_qos_update_request(req, perf);
+	else
+		ret = dev_pm_qos_add_request(dev, req, DEV_PM_QOS_PERFORMANCE,
+					     perf);
+
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+static int _generic_set_opp_domain(struct device *dev, struct clk *clk,
+				   struct dev_pm_qos_request *req,
+				   unsigned long old_freq, unsigned long freq,
+				   int old_dfreq, int new_dfreq)
+{
+	int ret;
+
+	/* Scaling up? Scale voltage before frequency */
+	if (freq > old_freq) {
+		ret = _update_pm_qos_request(dev, req, new_dfreq);
+		if (ret)
+			return ret;
+	}
+
+	/* Change frequency */
+	ret = _generic_set_opp_clk_only(dev, clk, old_freq, freq);
+	if (ret)
+		goto restore_dfreq;
+
+	/* Scaling down? Scale voltage after frequency */
+	if (freq < old_freq) {
+		ret = _update_pm_qos_request(dev, req, new_dfreq);
+		if (ret)
+			goto restore_freq;
+	}
+
+	return 0;
+
+restore_freq:
+	if (_generic_set_opp_clk_only(dev, clk, freq, old_freq))
+		dev_err(dev, "%s: failed to restore old-freq (%lu Hz)\n",
+			__func__, old_freq);
+restore_dfreq:
+	if (old_dfreq != -1)
+		_update_pm_qos_request(dev, req, old_dfreq);
+
+	return ret;
+}
+
 static int _generic_set_opp(struct dev_pm_set_opp_data *data)
 {
 	struct dev_pm_opp_supply *old_supply = data->old_opp.supplies;
@@ -663,6 +719,19 @@ int dev_pm_opp_set_rate(struct device *dev, unsigned long target_freq)
 
 	regulators = opp_table->regulators;
 
+	/* Need to configure power domain performance state */
+	if (opp_table->has_domain_opp) {
+		int old_dfreq = -1, new_dfreq;
+		struct dev_pm_qos_request *req = &opp_table->qos_request;
+
+		new_dfreq = opp->domain_rate;
+		if (!IS_ERR(old_opp))
+			old_dfreq = old_opp->domain_rate;
+
+		return _generic_set_opp_domain(dev, clk, req, old_freq, freq,
+					       old_dfreq, new_dfreq);
+	}
+
 	/* Only frequency scaling */
 	if (!regulators) {
 		ret = _generic_set_opp_clk_only(dev, clk, old_freq, freq);
@@ -808,6 +877,9 @@ static void _opp_table_kref_release(struct kref *kref)
 	struct opp_table *opp_table = container_of(kref, struct opp_table, kref);
 	struct opp_device *opp_dev;
 
+	if (dev_pm_qos_request_active(&opp_table->qos_request))
+		dev_pm_qos_remove_request(&opp_table->qos_request);
+
 	/* Release clk */
 	if (!IS_ERR(opp_table->clk))
 		clk_put(opp_table->clk);
diff --git a/drivers/base/power/opp/debugfs.c b/drivers/base/power/opp/debugfs.c
index 95f433db4ac7..4b7eb379c84f 100644
--- a/drivers/base/power/opp/debugfs.c
+++ b/drivers/base/power/opp/debugfs.c
@@ -104,6 +104,9 @@ int opp_debug_create_one(struct dev_pm_opp *opp, struct opp_table *opp_table)
 	if (!debugfs_create_ulong("rate_hz", S_IRUGO, d, &opp->rate))
 		return -ENOMEM;
 
+	if (!debugfs_create_u32("domain_rate", S_IRUGO, d, &opp->domain_rate))
+		return -ENOMEM;
+
 	if (!opp_debug_create_supplies(opp, opp_table, d))
 		return -ENOMEM;
 
diff --git a/drivers/base/power/opp/of.c b/drivers/base/power/opp/of.c
index 779428676f63..77693ba3ed55 100644
--- a/drivers/base/power/opp/of.c
+++ b/drivers/base/power/opp/of.c
@@ -254,6 +254,70 @@ struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_of_get_opp_desc_node);
 
+static int _parse_domain_opp(struct dev_pm_opp *opp,
+			     struct opp_table *opp_table, struct device *dev,
+			     struct device_node *np)
+{
+	struct device_node *dnp;
+	u64 rate;
+	int ret;
+
+	if (!of_find_property(np, "power-domain-opp", NULL)) {
+		if (unlikely(opp_table->has_domain_opp == 1)) {
+			dev_err(dev, "%s: Not all OPP nodes have power-domain-opp\n",
+				__func__);
+			return -EINVAL;
+		}
+
+		/* overwrite to avoid conditional statement */
+		opp_table->has_domain_opp = 0;
+		return 0;
+	}
+
+	if (unlikely(!opp_table->has_domain)) {
+		dev_err(dev, "%s: OPP node can't have power-domain-opp property without power domain\n",
+			__func__);
+		return -EINVAL;
+	}
+
+	if (unlikely(!opp_table->has_domain_opp)) {
+		dev_err(dev, "%s: Not all OPP nodes have power-domain-opp\n",
+			__func__);
+		return -EINVAL;
+	}
+
+	dnp = of_parse_phandle(np, "power-domain-opp", 0);
+	if (unlikely(!dnp)) {
+		dev_err(dev, "%s: Unable to parse phandle of power-domain-opp\n",
+			__func__);
+		return -EINVAL;
+	}
+
+	/* Read opp-hz from domain's OPP table */
+	ret = of_property_read_u64(dnp, "opp-hz", &rate);
+	if (ret < 0) {
+		dev_err(dev, "%s: opp-hz not found in domain's node\n",
+			__func__);
+		goto put_node;
+	}
+
+	/*
+	 * The "domain_rate" field is directly passed to the QoS APIs and they
+	 * accept s32 values only. Will check this again once we have platforms
+	 * that really keep u64 values for power domains.
+	 */
+	opp->domain_rate = (int)rate;
+
+	/* overwrite to avoid conditional statement */
+	opp_table->has_domain_opp = 1;
+
+	ret = 0;
+
+put_node:
+	of_node_put(dnp);
+	return ret;
+}
+
 /**
  * _opp_add_static_v2() - Allocate static OPPs (As per 'v2' DT bindings)
  * @opp_table: OPP table
@@ -296,6 +360,10 @@ static int _opp_add_static_v2(struct opp_table *opp_table, struct device *dev,
 		goto free_opp;
 	}
 
+	ret = _parse_domain_opp(new_opp, opp_table, dev, np);
+	if (ret)
+		goto free_opp;
+
 	/*
 	 * Rate is defined as an unsigned long in clk API, and so casting
 	 * explicitly to its type. Must be fixed once rate is 64 bit
@@ -375,6 +443,15 @@ static int _of_add_opp_table_v2(struct device *dev, struct device_node *opp_np)
 	if (!opp_table)
 		return -ENOMEM;
 
+	/*
+	 * Only devices with parent power-domains can have "power-domain-opp"
+	 * property.
+	 */
+	if (of_find_property(dev->of_node, "power-domains", NULL)) {
+		opp_table->has_domain = true;
+		opp_table->has_domain_opp = -1;
+	}
+
 	/* We have opp-table node now, iterate over it and add OPPs */
 	for_each_available_child_of_node(opp_np, np) {
 		count++;
diff --git a/drivers/base/power/opp/opp.h b/drivers/base/power/opp/opp.h
index 166eef990599..5350eb4eedd0 100644
--- a/drivers/base/power/opp/opp.h
+++ b/drivers/base/power/opp/opp.h
@@ -20,6 +20,7 @@
 #include <linux/list.h>
 #include <linux/limits.h>
 #include <linux/pm_opp.h>
+#include <linux/pm_qos.h>
 #include <linux/notifier.h>
 
 struct clk;
@@ -59,6 +60,7 @@ extern struct list_head opp_tables;
  * @turbo: true if turbo (boost) OPP
  * @suspend: true if suspend OPP
  * @rate: Frequency in hertz
+ * @domain_rate: Copy of domain's rate
  * @supplies: Power supplies voltage/current values
  * @clock_latency_ns: Latency (in nanoseconds) of switching to this OPP's
  * frequency from any other OPP's frequency.
@@ -77,6 +79,7 @@ struct dev_pm_opp {
 	bool turbo;
 	bool suspend;
 	unsigned long rate;
+	int domain_rate;
 
 	struct dev_pm_opp_supply *supplies;
 
@@ -137,6 +140,11 @@ enum opp_table_access {
  * @regulator_count: Number of power supply regulators
  * @set_opp: Platform specific set_opp callback
  * @set_opp_data: Data to be passed to set_opp callback
+ * @has_domain: True if the device node contains "power-domain" property
+ * @has_domain_opp: Can have value of 0, 1 or -1. -1 means uninitialized state,
+ * 0 means that OPP nodes don't have "power-domain-opp" property and 1 means
+ * that OPP nodes have it.
+ * @qos_request: Qos request.
  * @dentry: debugfs dentry pointer of the real device directory (not links).
  * @dentry_name: Name of the real dentry.
  *
@@ -174,6 +182,10 @@ struct opp_table {
 	int (*set_opp)(struct dev_pm_set_opp_data *data);
 	struct dev_pm_set_opp_data *set_opp_data;
 
+	bool has_domain;
+	int has_domain_opp;
+	struct dev_pm_qos_request qos_request;
+
 #ifdef CONFIG_DEBUG_FS
 	struct dentry *dentry;
 	char dentry_name[NAME_MAX];
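For review purposes, the ordering that _generic_set_opp_domain() enforces in the patch above can be modelled in plain userspace C. This is only a sketch under stated assumptions: set_domain_level(), set_clk_rate(), op_log, fail_clk, cur_level and cur_freq are hypothetical stand-ins for the QoS request update and the clk API, not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical recorders standing in for the QoS and clk calls. */
static char op_log[16];		/* 'D' = domain level change, 'C' = clock change */
static size_t op_count;
static int fail_clk;		/* non-zero makes clock changes fail */
static int cur_level = -1;
static unsigned long cur_freq;

static void reset_log(void)
{
	memset(op_log, 0, sizeof(op_log));
	op_count = 0;
	fail_clk = 0;
}

static void log_op(char op)
{
	assert(op_count < sizeof(op_log) - 1);
	op_log[op_count++] = op;
}

static int set_domain_level(int level)
{
	log_op('D');
	cur_level = level;
	return 0;
}

static int set_clk_rate(unsigned long freq)
{
	log_op('C');
	if (fail_clk)
		return -1;
	cur_freq = freq;
	return 0;
}

/* Same control flow as _generic_set_opp_domain() in the patch. */
static int set_opp_domain(unsigned long old_freq, unsigned long freq,
			  int old_level, int new_level)
{
	int ret;

	/* Scaling up: raise the domain level before the clock. */
	if (freq > old_freq) {
		ret = set_domain_level(new_level);
		if (ret)
			return ret;
	}

	ret = set_clk_rate(freq);
	if (ret)
		goto restore_level;

	/* Scaling down: lower the domain level after the clock. */
	if (freq < old_freq) {
		ret = set_domain_level(new_level);
		if (ret)
			goto restore_freq;
	}

	return 0;

restore_freq:
	set_clk_rate(old_freq);
restore_level:
	if (old_level != -1)
		set_domain_level(old_level);

	return ret;
}
```

In this model, scaling up records the domain change before the clock change, scaling down records them in the opposite order, and a failed clock change on the way up is followed by a second domain change that restores the old level.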
The "operating-points-v2" property can now contain a list of phandles. This is needed for power domain providers that provide multiple domains, where each domain has its own OPP table.
Add support for parsing such a list by index.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/opp/of.c | 50 +++++++++++++++++++++++++++++++++++++++------
 include/linux/pm_opp.h      |  6 ++++++
 2 files changed, 50 insertions(+), 6 deletions(-)
diff --git a/drivers/base/power/opp/of.c b/drivers/base/power/opp/of.c
index 77693ba3ed55..9cdf3a848e69 100644
--- a/drivers/base/power/opp/of.c
+++ b/drivers/base/power/opp/of.c
@@ -243,14 +243,17 @@ void dev_pm_opp_of_remove_table(struct device *dev)
 EXPORT_SYMBOL_GPL(dev_pm_opp_of_remove_table);
 
 /* Returns opp descriptor node for a device, caller must do of_node_put() */
-struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev)
+static struct device_node *_of_get_opp_desc_node_indexed(struct device *dev,
+							  int index)
 {
-	/*
-	 * There should be only ONE phandle present in "operating-points-v2"
-	 * property.
-	 */
+	/* "operating-points-v2" can be an array for power domain providers */
+	return of_parse_phandle(dev->of_node, "operating-points-v2", index);
+}
 
-	return of_parse_phandle(dev->of_node, "operating-points-v2", 0);
+/* Returns opp descriptor node for a device, caller must do of_node_put() */
+struct device_node *dev_pm_opp_of_get_opp_desc_node(struct device *dev)
+{
+	return _of_get_opp_desc_node_indexed(dev, 0);
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_of_get_opp_desc_node);
 
@@ -572,6 +575,41 @@ int dev_pm_opp_of_add_table(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(dev_pm_opp_of_add_table);
 
+/**
+ * dev_pm_opp_of_add_table_indexed() - Initialize indexed opp table from device tree
+ * @dev:	device pointer used to lookup OPP table.
+ * @index:	Index number.
+ *
+ * Register the initial OPP table with the OPP library for given device only
+ * using the "operating-points-v2" property.
+ *
+ * Return:
+ * 0		On success OR
+ *		Duplicate OPPs (both freq and volt are same) and opp->available
+ * -EEXIST	Freq are same and volt are different OR
+ *		Duplicate OPPs (both freq and volt are same) and !opp->available
+ * -ENOMEM	Memory allocation failure
+ * -ENODEV	when 'operating-points' property is not found or is invalid data
+ *		in device node.
+ * -ENODATA	when empty 'operating-points' property is found
+ * -EINVAL	when invalid entries are found in opp-v2 table
+ */
+int dev_pm_opp_of_add_table_indexed(struct device *dev, int index)
+{
+	struct device_node *opp_np;
+	int ret;
+
+	opp_np = _of_get_opp_desc_node_indexed(dev, index);
+	if (!opp_np)
+		return -ENODEV;
+
+	ret = _of_add_opp_table_v2(dev, opp_np);
+	of_node_put(opp_np);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(dev_pm_opp_of_add_table_indexed);
+
 /* CPU device specific helpers */
 
 /**
diff --git a/include/linux/pm_opp.h b/include/linux/pm_opp.h
index a6685b3dde26..8263d831715c 100644
--- a/include/linux/pm_opp.h
+++ b/include/linux/pm_opp.h
@@ -284,6 +284,7 @@ static inline void dev_pm_opp_cpumask_remove_table(const struct cpumask *cpumask
 
 #if defined(CONFIG_PM_OPP) && defined(CONFIG_OF)
 int dev_pm_opp_of_add_table(struct device *dev);
+int dev_pm_opp_of_add_table_indexed(struct device *dev, int index);
 void dev_pm_opp_of_remove_table(struct device *dev);
 int dev_pm_opp_of_cpumask_add_table(const struct cpumask *cpumask);
 void dev_pm_opp_of_cpumask_remove_table(const struct cpumask *cpumask);
@@ -295,6 +296,11 @@ static inline int dev_pm_opp_of_add_table(struct device *dev)
 	return -ENOTSUPP;
 }
 
+static inline int dev_pm_opp_of_add_table_indexed(struct device *dev, int index)
+{
+	return -ENOTSUPP;
+}
+
 static inline void dev_pm_opp_of_remove_table(struct device *dev)
 {
 }
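The indexed lookup added by this patch can be illustrated with a small userspace model. It is a sketch only: struct table_node, struct provider_node, get_opp_desc_node_indexed() and add_table_indexed() are hypothetical names that mimic a phandle list and the two new helpers; they are not part of the OPP library.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an OPP table node referenced by phandle. */
struct table_node {
	const char *name;
};

/* Hypothetical stand-in for a provider's "operating-points-v2" list. */
struct provider_node {
	const struct table_node *const *tables;
	size_t num_tables;
};

/* Models _of_get_opp_desc_node_indexed(): NULL when the index has no entry. */
static const struct table_node *
get_opp_desc_node_indexed(const struct provider_node *np, int index)
{
	assert(np != NULL);

	if (index < 0 || (size_t)index >= np->num_tables)
		return NULL;

	return np->tables[index];
}

/* Models dev_pm_opp_of_add_table_indexed(): -ENODEV when nothing is found. */
static int add_table_indexed(const struct provider_node *np, int index,
			     const struct table_node **out)
{
	const struct table_node *node = get_opp_desc_node_indexed(np, index);

	if (!node)
		return -ENODEV;

	*out = node;
	return 0;
}
```

A provider with two domains would carry two list entries, with domain i using entry i; asking for an index beyond the list fails with -ENODEV, which matches the error path of the real helper.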
Some platforms have the capability to configure the performance state of their power domains. The performance levels are identified by positive integer values; a lower value represents a lower performance state. The power domain driver should be able to retrieve all the information required to configure the performance state of the power domain from the performance constraint's target value.
This patch implements performance state management in the PM domain core. The performance QoS uses the common QoS notifier list, and __performance_notifier() is called when the notifier fires for a performance constraint.
This also allows power domain drivers to implement a ->set_performance_state() callback, which the power domain core calls from within the notifier routine. If a domain doesn't implement the ->set_performance_state() callback, it is assumed that its parents are responsible for performance state configuration. Both devices and sub-domains are accounted for while finding the highest performance state requested.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/domain.c | 77 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pm_domain.h   |  4 +++
 2 files changed, 81 insertions(+)
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index f6f616ac5cc2..7d35dafe8c97 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -462,6 +462,79 @@ static int genpd_latency_notifier(struct generic_pm_domain_data *gpd_data,
 	return NOTIFY_DONE;
 }
 
+static void __update_domain_performance_state(struct generic_pm_domain *genpd,
+					      int depth)
+{
+	struct generic_pm_domain_data *pd_data;
+	struct generic_pm_domain *subdomain;
+	struct pm_domain_data *pdd;
+	unsigned int state = 0;
+	struct gpd_link *link;
+
+	/* Traverse all devices within the domain */
+	list_for_each_entry(pdd, &genpd->dev_list, list_node) {
+		pd_data = to_gpd_data(pdd);
+
+		if (pd_data->performance_state > state)
+			state = pd_data->performance_state;
+	}
+
+	/* Traverse all subdomains within the domain */
+	list_for_each_entry(link, &genpd->master_links, master_node) {
+		subdomain = link->slave;
+
+		if (subdomain->performance_state > state)
+			state = subdomain->performance_state;
+	}
+
+	if (genpd->performance_state == state)
+		return;
+
+	genpd->performance_state = state;
+
+	if (genpd->set_performance_state) {
+		genpd->set_performance_state(genpd, state);
+		return;
+	}
+
+	/* Propagate to parent power domains */
+	list_for_each_entry(link, &genpd->slave_links, slave_node) {
+		struct generic_pm_domain *master = link->master;
+
+		genpd_lock_nested(master, depth + 1);
+		__update_domain_performance_state(master, depth + 1);
+		genpd_unlock(master);
+	}
+}
+
+static int __performance_notifier(struct generic_pm_domain_data *gpd_data,
+				  unsigned long val)
+{
+	struct generic_pm_domain *genpd = ERR_PTR(-ENODATA);
+	struct device *dev = gpd_data->base.dev;
+	struct pm_domain_data *pdd;
+
+	spin_lock_irq(&dev->power.lock);
+
+	pdd = dev->power.subsys_data ?
+			dev->power.subsys_data->domain_data : NULL;
+
+	if (pdd && pdd->dev)
+		genpd = dev_to_genpd(dev);
+
+	spin_unlock_irq(&dev->power.lock);
+
+	if (IS_ERR(genpd))
+		return NOTIFY_DONE;
+
+	genpd_lock(genpd);
+	gpd_data->performance_state = val;
+	__update_domain_performance_state(genpd, 0);
+	genpd_unlock(genpd);
+
+	return NOTIFY_DONE;
+}
+
 static int genpd_dev_pm_qos_notifier(struct notifier_block *nb,
 				     unsigned long val, void *ptr)
 {
@@ -474,6 +547,9 @@ static int genpd_dev_pm_qos_notifier(struct notifier_block *nb,
 	if (dev_pm_qos_is_resume_latency(dev, ptr))
 		return genpd_latency_notifier(gpd_data, val);
 
+	if (dev_pm_qos_is_performance(dev, ptr))
+		return __performance_notifier(gpd_data, val);
+
 	dev_err(dev, "%s: Unexpected notifier call\n", __func__);
 	return NOTIFY_BAD;
 }
@@ -1168,6 +1244,7 @@ static struct generic_pm_domain_data *genpd_alloc_dev_data(struct device *dev,
 	gpd_data->td.constraint_changed = true;
 	gpd_data->td.effective_constraint_ns = -1;
 	gpd_data->nb.notifier_call = genpd_dev_pm_qos_notifier;
+	gpd_data->performance_state = 0;
 
 	spin_lock_irq(&dev->power.lock);
 
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index b7803a251044..84ee474e66d0 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -63,8 +63,11 @@ struct generic_pm_domain {
 	unsigned int device_count;	/* Number of devices */
 	unsigned int suspended_count;	/* System suspend device counter */
 	unsigned int prepared_count;	/* Suspend counter of prepared devices */
+	unsigned int performance_state;	/* Max requested performance state */
 	int (*power_off)(struct generic_pm_domain *domain);
 	int (*power_on)(struct generic_pm_domain *domain);
+	int (*set_performance_state)(struct generic_pm_domain *domain,
+				     unsigned int state);
 	struct gpd_dev_ops dev_ops;
 	s64 max_off_time_ns;	/* Maximum allowed "suspended" time. */
 	bool max_off_time_changed;
@@ -118,6 +121,7 @@ struct generic_pm_domain_data {
 	struct pm_domain_data base;
 	struct gpd_timing_data td;
 	struct notifier_block nb;
+	unsigned int performance_state;
 	void *data;
 };
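The aggregation and propagation rule implemented by __update_domain_performance_state() in this patch can be sketched in userspace as follows. The model is simplified and rests on assumptions: struct domain, record_state() and update_domain_performance_state() are hypothetical names, each domain has at most one parent, and no locking is modelled.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical userspace stand-in for struct generic_pm_domain. */
struct domain {
	unsigned int performance_state;		/* current aggregated state */
	unsigned int dev_states[MAX_CHILDREN];	/* requests from devices */
	size_t num_devs;
	struct domain *subdomains[MAX_CHILDREN];
	size_t num_subdomains;
	struct domain *parent;			/* single parent for brevity */
	/* Optional callback; NULL means the parent is responsible. */
	void (*set_performance_state)(struct domain *d, unsigned int state);
	unsigned int programmed_state;		/* last value given to "hardware" */
};

/* Example callback that just remembers what it was asked to program. */
static void record_state(struct domain *d, unsigned int state)
{
	d->programmed_state = state;
}

static void update_domain_performance_state(struct domain *d)
{
	unsigned int state = 0;
	size_t i;

	assert(d != NULL);

	/* Highest request among the devices in this domain. */
	for (i = 0; i < d->num_devs; i++)
		if (d->dev_states[i] > state)
			state = d->dev_states[i];

	/* Highest aggregated state among the subdomains. */
	for (i = 0; i < d->num_subdomains; i++)
		if (d->subdomains[i]->performance_state > state)
			state = d->subdomains[i]->performance_state;

	if (d->performance_state == state)
		return;

	d->performance_state = state;

	/* A domain with a callback programs the state itself. */
	if (d->set_performance_state) {
		d->set_performance_state(d, state);
		return;
	}

	/* Otherwise the parent recomputes its own maximum. */
	if (d->parent)
		update_domain_performance_state(d->parent);
}
```

The point of the model is that a domain without a callback never programs anything itself; it only updates its aggregated value and asks its parent to recompute, so the first ancestor with a callback ends up programming the maximum over everything below it.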
The power-domain core will use the OPP core going forward, and the OPP core needs a struct device to work with.
Add a struct device to the genpd structure and also add a genpd bus type for the devices.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/domain.c | 37 +++++++++++++++++++++++++++++++++++++
 include/linux/pm_domain.h   |  1 +
 2 files changed, 38 insertions(+)
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 7d35dafe8c97..85365611b258 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -1546,6 +1546,10 @@ static void genpd_lock_init(struct generic_pm_domain *genpd)
 	}
 }
 
+static struct bus_type genpd_bus_type = {
+	.name = "genpd",
+};
+
 /**
  * pm_genpd_init - Initialize a generic I/O PM domain object.
  * @genpd: PM domain object to initialize.
@@ -1602,6 +1606,18 @@ int pm_genpd_init(struct generic_pm_domain *genpd,
 		return ret;
 	}
 
+	genpd->dev.bus = &genpd_bus_type;
+	device_initialize(&genpd->dev);
+	dev_set_name(&genpd->dev, "%s", genpd->name);
+
+	ret = device_add(&genpd->dev);
+	if (ret) {
+		dev_err(&genpd->dev, "failed to add device: %d\n", ret);
+		put_device(&genpd->dev);
+		kfree(genpd->free);
+		return ret;
+	}
+
 	mutex_lock(&gpd_list_lock);
 	list_add(&genpd->gpd_list_node, &gpd_list);
 	mutex_unlock(&gpd_list_lock);
@@ -1639,6 +1655,7 @@ static int genpd_remove(struct generic_pm_domain *genpd)
 
 	list_del(&genpd->gpd_list_node);
 	genpd_unlock(genpd);
+	device_del(&genpd->dev);
 	cancel_work_sync(&genpd->power_off_work);
 	kfree(genpd->free);
 	pr_debug("%s: removed %s\n", __func__, genpd->name);
@@ -1806,6 +1823,7 @@ int of_genpd_add_provider_simple(struct device_node *np,
 		if (!ret) {
 			genpd->provider = &np->fwnode;
 			genpd->has_provider = true;
+			genpd->dev.of_node = np;
 		}
 	}
 
@@ -1839,6 +1857,7 @@ int of_genpd_add_provider_onecell(struct device_node *np,
 
 		data->domains[i]->provider = &np->fwnode;
 		data->domains[i]->has_provider = true;
+		data->domains[i]->dev.of_node = np;
 	}
 
 	ret = genpd_add_provider(np, genpd_xlate_onecell, data);
@@ -2421,3 +2440,21 @@ static void __exit pm_genpd_debug_exit(void)
 }
 __exitcall(pm_genpd_debug_exit);
 #endif /* CONFIG_DEBUG_FS */
+
+static int __init pm_genpd_core_init(void)
+{
+	int ret;
+
+	ret = bus_register(&genpd_bus_type);
+	if (ret)
+		pr_err("bus_register failed (%d)\n", ret);
+
+	return ret;
+}
+pure_initcall(pm_genpd_core_init);
+
+static void __exit pm_genpd_core_exit(void)
+{
+	bus_unregister(&genpd_bus_type);
+}
+__exitcall(pm_genpd_core_exit);
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 84ee474e66d0..c01f12b370d2 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -48,6 +48,7 @@ struct genpd_power_state {
 struct genpd_lock_ops;
 
 struct generic_pm_domain {
+	struct device dev;
 	struct dev_pm_domain domain;	/* PM domain operations */
 	struct list_head gpd_list_node;	/* Node in the global PM domains list */
 	struct list_head master_links;	/* Links with PM domain as a master */
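Parse the OPP table of a power domain when the domain has its ->set_performance_state() callback set. Providers with multiple domains use the indexed variant, so that domain i picks the i-th entry of "operating-points-v2".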
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/domain.c | 86 ++++++++++++++++++++++++++++++++++++---------
 include/linux/pm_domain.h   |  1 +
 2 files changed, 71 insertions(+), 16 deletions(-)
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 85365611b258..17ba541ec3c8 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -10,6 +10,7 @@
 #include <linux/kernel.h>
 #include <linux/io.h>
 #include <linux/platform_device.h>
+#include <linux/pm_opp.h>
 #include <linux/pm_runtime.h>
 #include <linux/pm_domain.h>
 #include <linux/pm_qos.h>
@@ -1818,15 +1819,37 @@ int of_genpd_add_provider_simple(struct device_node *np,
 
 	mutex_lock(&gpd_list_lock);
 
-	if (pm_genpd_present(genpd)) {
-		ret = genpd_add_provider(np, genpd_xlate_simple, genpd);
-		if (!ret) {
-			genpd->provider = &np->fwnode;
-			genpd->has_provider = true;
-			genpd->dev.of_node = np;
+	if (!pm_genpd_present(genpd))
+		goto unlock;
+
+	genpd->dev.of_node = np;
+
+	/* Parse genpd OPP table */
+	if (genpd->set_performance_state) {
+		ret = dev_pm_opp_of_add_table(&genpd->dev);
+		if (ret) {
+			dev_err(&genpd->dev, "Failed to add OPP table: %d\n",
+				ret);
+			goto unlock;
+		}
+
+		genpd->has_opp_table = true;
+	}
+
+	ret = genpd_add_provider(np, genpd_xlate_simple, genpd);
+	if (ret) {
+		if (genpd->has_opp_table) {
+			genpd->has_opp_table = false;
+			dev_pm_opp_of_remove_table(&genpd->dev);
 		}
+
+		goto unlock;
 	}
 
+	genpd->provider = &np->fwnode;
+	genpd->has_provider = true;
+
+unlock:
 	mutex_unlock(&gpd_list_lock);
 
 	return ret;
@@ -1841,6 +1864,7 @@ EXPORT_SYMBOL_GPL(of_genpd_add_provider_simple);
 int of_genpd_add_provider_onecell(struct device_node *np,
 				  struct genpd_onecell_data *data)
 {
+	struct generic_pm_domain *genpd;
 	unsigned int i;
 	int ret = -EINVAL;
 
@@ -1850,14 +1874,29 @@ int of_genpd_add_provider_onecell(struct device_node *np,
 	mutex_lock(&gpd_list_lock);
 
 	for (i = 0; i < data->num_domains; i++) {
-		if (!data->domains[i])
+		genpd = data->domains[i];
+
+		if (!genpd)
 			continue;
-		if (!pm_genpd_present(data->domains[i]))
+		if (!pm_genpd_present(genpd))
 			goto error;
 
-		data->domains[i]->provider = &np->fwnode;
-		data->domains[i]->has_provider = true;
-		data->domains[i]->dev.of_node = np;
+		genpd->dev.of_node = np;
+
+		/* Parse genpd OPP table */
+		if (genpd->set_performance_state) {
+			ret = dev_pm_opp_of_add_table_indexed(&genpd->dev, i);
+			if (ret) {
+				dev_err(&genpd->dev, "Failed to add OPP table for index %d: %d\n",
+					i, ret);
+				goto error;
+			}
+
+			genpd->has_opp_table = true;
+		}
+
+		genpd->provider = &np->fwnode;
+		genpd->has_provider = true;
 	}
 
 	ret = genpd_add_provider(np, genpd_xlate_onecell, data);
@@ -1870,10 +1909,18 @@ int of_genpd_add_provider_onecell(struct device_node *np,
 
 error:
 	while (i--) {
-		if (!data->domains[i])
+		genpd = data->domains[i];
+
+		if (!genpd)
 			continue;
-		data->domains[i]->provider = NULL;
-		data->domains[i]->has_provider = false;
+
+		genpd->provider = NULL;
+		genpd->has_provider = false;
+
+		if (genpd->has_opp_table) {
+			genpd->has_opp_table = false;
+			dev_pm_opp_of_remove_table(&genpd->dev);
+		}
 	}
 
 	mutex_unlock(&gpd_list_lock);
@@ -1900,10 +1947,17 @@ void of_genpd_del_provider(struct device_node *np)
 			 * provider, set the 'has_provider' to false
 			 * so that the PM domain can be safely removed.
 			 */
-			list_for_each_entry(gpd, &gpd_list, gpd_list_node)
-				if (gpd->provider == &np->fwnode)
+			list_for_each_entry(gpd, &gpd_list, gpd_list_node) {
+				if (gpd->provider == &np->fwnode) {
 					gpd->has_provider = false;
 
+					if (!gpd->has_opp_table)
+						continue;
+
+					dev_pm_opp_of_remove_table(&gpd->dev);
+				}
+			}
+
 			list_del(&cp->link);
 			of_node_put(cp->node);
 			kfree(cp);
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index c01f12b370d2..60fc5b165b91 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -58,6 +58,7 @@ struct generic_pm_domain {
 	struct work_struct power_off_work;
 	struct fwnode_handle *provider;	/* Identity of the domain provider */
 	bool has_provider;
+	bool has_opp_table;
 	const char *name;
 	atomic_t sd_count;	/* Number of subdomains with power "on" */
 	enum gpd_status status;	/* Current state of the domain */
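The unwind logic of of_genpd_add_provider_onecell() in this patch (register domains in order, and on failure undo only the ones already handled) can be modelled as below. This is a hedged userspace sketch: struct domain, fake_add_table() and add_provider_onecell() are hypothetical names, and removing the OPP table is reduced to clearing a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct generic_pm_domain. */
struct domain {
	bool has_set_performance_state;	/* models a non-NULL callback */
	bool table_parse_fails;		/* makes the fake OPP parse fail */
	bool has_opp_table;
	bool has_provider;
};

/* Fake replacement for dev_pm_opp_of_add_table_indexed(). */
static int fake_add_table(const struct domain *d)
{
	return d->table_parse_fails ? -EINVAL : 0;
}

static int add_provider_onecell(struct domain **domains, size_t num)
{
	struct domain *d;
	size_t i;
	int ret = 0;

	assert(domains != NULL);

	for (i = 0; i < num; i++) {
		d = domains[i];
		if (!d)
			continue;

		/* Only domains with the callback get an OPP table. */
		if (d->has_set_performance_state) {
			ret = fake_add_table(d);
			if (ret)
				goto error;
			d->has_opp_table = true;
		}

		d->has_provider = true;
	}

	return 0;

error:
	/* Undo only the entries handled before the failing index. */
	while (i--) {
		d = domains[i];
		if (!d)
			continue;

		d->has_provider = false;

		/* The real code also removes the OPP table here. */
		if (d->has_opp_table)
			d->has_opp_table = false;
	}

	return ret;
}
```

Because the loop index still points at the failing entry when the error label is reached, the while (i--) walk covers exactly the earlier entries, skipping NULL slots just as the forward loop does.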