hi Nico & all,
We are doing some profiling of the low power modes on the TC2 board, and found some long latencies in the core/cluster power-on sequence, so we would like to confirm the questions below:
1. From our profiling results, we found that when core_A sends an IPI to core_B, it takes about 954us before core_B reaches the function bL_entry_point (or the function mcpm_entry_point in your later patches for mainline). That is a really long interval.
We are currently using the 13.01 firmware version (which supports the BX_ADDRx registers), so the cluster-level power-on sequence should be: a) the DCC detects nIRQOUT/nFIQOUT asserting; b) the DCC powers on the corresponding cluster; c) the core runs into the boot monitor code and finally uses the BX_ADDRx register to jump to the function *bL_entry_point*.
Since the above flow is a black box to us, we suspect the time is consumed by one of these steps; could you or the ARM folks help confirm this?
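For reference, a latency like this can be timestamped from the kernel roughly as follows; this is only a minimal sketch with hypothetical function names, not part of MCPM or the TC2 code, and it assumes arch_send_wakeup_ipi_mask() is what kicks the target core:

#include <linux/kernel.h>
#include <linux/ktime.h>
#include <linux/smp.h>

static ktime_t ipi_sent;

/* core_A: take a timestamp right before kicking core_B */
static void profile_kick(unsigned int target_cpu)
{
        ipi_sent = ktime_get();
        arch_send_wakeup_ipi_mask(cpumask_of(target_cpu));
}

/* core_B: call this as early as possible after bL_entry_point /
 * mcpm_entry_point has handed control back to C code */
static void profile_woken(void)
{
        pr_info("power-on latency: %lld us\n",
                ktime_us_delta(ktime_get(), ipi_sent));
}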
2. From reading the spec DAI0318D_v2p_ca15_a7_power_management.pdf and from confirmation by ARM support, we know there is only cluster-level power down, controlled by the CA15_PWRDN_EN/CA7_PWRDN_EN bits.
At the core level, we can NOT independently power off a core while other cores in the same cluster are still powered on. But this seems to conflict with TC2's power management code in tc2_pm.c.
We can see that the function *tc2_pm_down()* calls gic_cpu_if_down() to disable the GIC's CPU interface; that means the core can no longer receive interrupts and will run into WFI. After the core enters WFI, if the DCC/SPC detects an interrupt on the GIC's nIRQOUT/nFIQOUT pins, it will power on the core (or reset the core) to let it resume, after which software must re-enable the GIC's CPU interface for itself.
The questions here are: a) in *tc2_pm_down()*, after the core enters the WFI state, even though the DCC/SPC cannot power off the core when it is NOT the last man of the cluster, the DCC/SPC will still reset the core, right? b) how does the DCC/SPC decide whether the core wants to enter the C1 state or only the "WFI" state? Does it use the WAKE_INT_MASK bits as the flag?
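To make the question concrete, the non-last-man path we are asking about looks roughly like the sketch below. This is NOT the actual tc2_pm.c code; spc_set_cpu_wakeup_irq() is a placeholder for however the SPC wake-up source (e.g. the WAKE_INT_MASK bits) gets programmed, and only gic_cpu_if_down() is taken from the real code:

#include <linux/irqchip/arm-gic.h>      /* gic_cpu_if_down() */
#include <linux/types.h>

/* placeholder: program the SPC to watch this core's wake-up IRQs,
 * e.g. via the WAKE_INT_MASK bits */
static void spc_set_cpu_wakeup_irq(unsigned int cpu, unsigned int cluster,
                                   bool enable)
{
}

static void core_down_sketch(unsigned int cpu, unsigned int cluster)
{
        /* ask the DCC/SPC to watch this core's nIRQOUT/nFIQOUT lines */
        spc_set_cpu_wakeup_irq(cpu, cluster, true);

        /* from here on the GIC no longer delivers interrupts to the core;
         * a pending IRQ is only visible to the power controller */
        gic_cpu_if_down();

        dsb();
        wfi();  /* the DCC/SPC resets the core out of here when an IRQ arrives */
}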
[ Looping in Achin and Lorenzo ]
On Fri, 19 Apr 2013, Leo Yan wrote:
hi Nico & all,
We are doing some profiling of the low power modes on the TC2 board, and found some long latencies in the core/cluster power-on sequence, so we would like to confirm the questions below:
- From our profiling results, we found that when core_A sends an IPI to core_B, it takes about 954us before core_B reaches the function bL_entry_point (or the function mcpm_entry_point in your later patches for mainline). That is a really long interval.
Is core_B in a different cluster than core_A? It is a known fact that powering up a cluster has far greater latency than simply pulling a core out of reset.
We are currently using the 13.01 firmware version (which supports the BX_ADDRx registers), so the cluster-level power-on sequence should be: a) the DCC detects nIRQOUT/nFIQOUT asserting; b) the DCC powers on the corresponding cluster; c) the core runs into the boot monitor code and finally uses the BX_ADDRx register to jump to the function *bL_entry_point*.
Since the above flow is a black box to us, we suspect the time is consumed by one of these steps; could you or the ARM folks help confirm this?
Those steps are a black box to me as well. I'll let the ARM guys answer your questions.
- From reading the spec DAI0318D_v2p_ca15_a7_power_management.pdf and from confirmation by ARM support, we know there is only cluster-level power down, controlled by the CA15_PWRDN_EN/CA7_PWRDN_EN bits.
At the core level, we can NOT independently power off a core while other cores in the same cluster are still powered on. But this seems to conflict with TC2's power management code in tc2_pm.c.
We can see that the function *tc2_pm_down()* calls gic_cpu_if_down() to disable the GIC's CPU interface; that means the core can no longer receive interrupts and will run into WFI. After the core enters WFI, if the DCC/SPC detects an interrupt on the GIC's nIRQOUT/nFIQOUT pins, it will power on the core (or reset the core) to let it resume, after which software must re-enable the GIC's CPU interface for itself.
The questions here are: a) in *tc2_pm_down()*, after the core enters the WFI state, even though the DCC/SPC cannot power off the core when it is NOT the last man of the cluster, the DCC/SPC will still reset the core, right? b) how does the DCC/SPC decide whether the core wants to enter the C1 state or only the "WFI" state? Does it use the WAKE_INT_MASK bits as the flag?
-- Thx, Leo Yan
On 04/20/2013 05:34 AM, Nicolas Pitre wrote:
[ Looping in Achin and Lorenzo ]
On Fri, 19 Apr 2013, Leo Yan wrote:
hi Nico & all,
We are doing some profiling of the low power modes on the TC2 board, and found some long latencies in the core/cluster power-on sequence, so we would like to confirm the questions below:
- From our profiling results, we found that when core_A sends an IPI to core_B, it takes about 954us before core_B reaches the function bL_entry_point (or the function mcpm_entry_point in your later patches for mainline). That is a really long interval.
Is core_B in a different cluster than core_A? It is a known fact that powering up a cluster has far greater latency than simply pulling a core out of reset.
Yes, core_B is in a different cluster than core_A.
If other cores in core_B's cluster are already powered on, the power-on interval is about 300us ~ 600us, but in the worst case, when core_B is the first man of the cluster, it usually takes about 900+us (occasionally even more than 1ms).
We are currently using the 13.01 firmware version (which supports the BX_ADDRx registers), so the cluster-level power-on sequence should be: a) the DCC detects nIRQOUT/nFIQOUT asserting; b) the DCC powers on the corresponding cluster; c) the core runs into the boot monitor code and finally uses the BX_ADDRx register to jump to the function *bL_entry_point*.
Since the above flow is a black box to us, we suspect the time is consumed by one of these steps; could you or the ARM folks help confirm this?
Those steps are a black box to me as well. I'll let the ARM guys answer your questions.
- From reading the spec DAI0318D_v2p_ca15_a7_power_management.pdf and from confirmation by ARM support, we know there is only cluster-level power down, controlled by the CA15_PWRDN_EN/CA7_PWRDN_EN bits.
At the core level, we can NOT independently power off a core while other cores in the same cluster are still powered on. But this seems to conflict with TC2's power management code in tc2_pm.c.
We can see that the function *tc2_pm_down()* calls gic_cpu_if_down() to disable the GIC's CPU interface; that means the core can no longer receive interrupts and will run into WFI. After the core enters WFI, if the DCC/SPC detects an interrupt on the GIC's nIRQOUT/nFIQOUT pins, it will power on the core (or reset the core) to let it resume, after which software must re-enable the GIC's CPU interface for itself.
The questions here are: a) in *tc2_pm_down()*, after the core enters the WFI state, even though the DCC/SPC cannot power off the core when it is NOT the last man of the cluster, the DCC/SPC will still reset the core, right? b) how does the DCC/SPC decide whether the core wants to enter the C1 state or only the "WFI" state? Does it use the WAKE_INT_MASK bits as the flag?
On Mon, 22 Apr 2013, Leo Yan wrote:
On 04/20/2013 05:34 AM, Nicolas Pitre wrote:
[ Looping in Achin and Lorenzo ]
On Fri, 19 Apr 2013, Leo Yan wrote:
hi Nico & all,
We are doing some profiling of the low power modes on the TC2 board, and found some long latencies in the core/cluster power-on sequence, so we would like to confirm the questions below:
- From our profiling results, we found that when core_A sends an IPI to core_B, it takes about 954us before core_B reaches the function bL_entry_point (or the function mcpm_entry_point in your later patches for mainline). That is a really long interval.
Is core_B in a different cluster than core_A? It is a known fact that powering up a cluster has far greater latency than simply pulling a core out of reset.
Yes, core_B is in a different cluster than core_A.
If other cores in core_B's cluster are already powered on, the power-on interval is about 300us ~ 600us, but in the worst case, when core_B is the first man of the cluster, it usually takes about 900+us (occasionally even more than 1ms).
Yes, that is known. Stabilizing the voltage when powering up a whole cluster on TC2 apparently takes that long. That is a characteristic that depends on the hardware implementation.
The switcher code on top of MCPM implements a notification mechanism to let the outbound CPU continue processing work until the inbound is fully operational and ready to take over in order to mitigate this latency.
However when a cluster is turned off resulting from cpuidle decisions, then waking up from idle mode does take that long for the system to resume. This has caused issues with the serial console losing characters when large strings are pasted for example, overflowing the 32 byte FIFO before the UART interrupt is serviced.
Nicolas
On 04/23/2013 12:37 AM, Nicolas Pitre wrote:
On Mon, 22 Apr 2013, Leo Yan wrote:
On 04/20/2013 05:34 AM, Nicolas Pitre wrote:
[ Looping in Achin and Lorenzo ]
On Fri, 19 Apr 2013, Leo Yan wrote:
hi Nico & all,
We are doing some profiling of the low power modes on the TC2 board, and found some long latencies in the core/cluster power-on sequence, so we would like to confirm the questions below:
- From our profiling results, we found that when core_A sends an IPI to core_B, it takes about 954us before core_B reaches the function bL_entry_point (or the function mcpm_entry_point in your later patches for mainline). That is a really long interval.
Is core_B in a different cluster than core_A? It is a known fact that powering up a cluster has far greater latency than simply pulling a core out of reset.
Yes, core_B is in a different cluster than core_A.
If other cores in core_B's cluster are already powered on, the power-on interval is about 300us ~ 600us, but in the worst case, when core_B is the first man of the cluster, it usually takes about 900+us (occasionally even more than 1ms).
Yes, that is known. Stabilizing the voltage when powering up a whole cluster on TC2 apparently takes that long. That is a characteristic that depends on the hardware implementation.
The switcher code on top of MCPM implements a notification mechanism to let the outbound CPU continue processing work until the inbound is fully operational and ready to take over in order to mitigate this latency.
Do you mean the outbound core remains in cache coherency so that the inbound core can snoop the outbound core's cache, and after a while the inbound core notifies the outbound core to flush its cache, clear the SMP bit and really enter the low power mode?
However when a cluster is turned off resulting from cpuidle decisions, then waking up from idle mode does take that long for the system to resume. This has caused issues with the serial console losing characters when large strings are pasted for example, overflowing the 32 byte FIFO before the UART interrupt is serviced.
Nicolas
On Tue, 23 Apr 2013, Leo Yan wrote:
On 04/23/2013 12:37 AM, Nicolas Pitre wrote:
On Mon, 22 Apr 2013, Leo Yan wrote:
On 04/20/2013 05:34 AM, Nicolas Pitre wrote:
[ Looping in Achin and Lorenzo ]
On Fri, 19 Apr 2013, Leo Yan wrote:
hi Nico & all,
We are doing some profiling of the low power modes on the TC2 board, and found some long latencies in the core/cluster power-on sequence, so we would like to confirm the questions below:
- From our profiling results, we found that when core_A sends an IPI to core_B, it takes about 954us before core_B reaches the function bL_entry_point (or the function mcpm_entry_point in your later patches for mainline). That is a really long interval.
Is core_B in a different cluster than core_A? It is a known fact that powering up a cluster has far greater latency than simply pulling a core out of reset.
Yes, core_B is in a different cluster than core_A.
If other cores in core_B's cluster are already powered on, the power-on interval is about 300us ~ 600us, but in the worst case, when core_B is the first man of the cluster, it usually takes about 900+us (occasionally even more than 1ms).
Yes, that is known. Stabilizing the voltage when powering up a whole cluster on TC2 apparently takes that long. That is a characteristic that depends on the hardware implementation.
The switcher code on top of MCPM implements a notification mechanism to let the outbound CPU continue processing work until the inbound is fully operational and ready to take over in order to mitigate this latency.
Do you mean the outbound core remains in cache coherency so that the inbound core can snoop the outbound core's cache, and after a while the inbound core notifies the outbound core to flush its cache, clear the SMP bit and really enter the low power mode?
Exact.
Nicolas
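For anyone following the thread, the outbound-side ordering being confirmed here is roughly the following. This is a sketch rather than the actual switcher/MCPM code: wait_for_inbound_handover() stands in for the real notification mechanism, and the flush/SMP-bit step assumes the v7_exit_coherency_flush() helper from asm/cacheflush.h:

#include <asm/cacheflush.h>     /* v7_exit_coherency_flush() */

/* placeholder: block until the inbound CPU signals it has taken over */
static void wait_for_inbound_handover(void)
{
}

static void outbound_teardown_sketch(void)
{
        /* keep running and stay coherent, so the inbound can snoop
         * the outbound's caches while it warms up */
        wait_for_inbound_handover();

        /* clear SCTLR.C, flush the data cache, clear ACTLR.SMP */
        v7_exit_coherency_flush(louis);

        wfi();  /* only now is it safe to be powered down */
}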
On 04/23/2013 10:52 AM, Nicolas Pitre wrote:
On Tue, 23 Apr 2013, Leo Yan wrote:
On 04/23/2013 12:37 AM, Nicolas Pitre wrote:
On Mon, 22 Apr 2013, Leo Yan wrote:
On 04/20/2013 05:34 AM, Nicolas Pitre wrote:
[ Looping in Achin and Lorenzo ]
On Fri, 19 Apr 2013, Leo Yan wrote:
hi Nico & all,
We are doing some profiling of the low power modes on the TC2 board, and found some long latencies in the core/cluster power-on sequence, so we would like to confirm the questions below:
- From our profiling results, we found that when core_A sends an IPI to core_B, it takes about 954us before core_B reaches the function bL_entry_point (or the function mcpm_entry_point in your later patches for mainline). That is a really long interval.
Is core_B in a different cluster than core_A? It is a known fact that powering up a cluster has far greater latency than simply pulling a core out of reset.
Yes, core_B is in a different cluster than core_A.
If other cores in core_B's cluster are already powered on, the power-on interval is about 300us ~ 600us, but in the worst case, when core_B is the first man of the cluster, it usually takes about 900+us (occasionally even more than 1ms).
Yes, that is known. Stabilizing the voltage when powering up a whole cluster on TC2 apparently takes that long. That is a characteristic that depends on the hardware implementation.
The switcher code on top of MCPM implements a notification mechanism to let the outbound CPU continue processing work until the inbound is fully operational and ready to take over in order to mitigate this latency.
Do you mean the outbound core remains in cache coherency so that the inbound core can snoop the outbound core's cache, and after a while the inbound core notifies the outbound core to flush its cache, clear the SMP bit and really enter the low power mode?
Exact.
Thanks a lot for the kind answers ;-)
Thx, Leo Yan
On Fri, Apr 19, 2013 at 10:34:39PM +0100, Nicolas Pitre wrote:
[ Looping in Achin and Lorenzo ]
On Fri, 19 Apr 2013, Leo Yan wrote:
hi Nico & all,
We are doing some profiling of the low power modes on the TC2 board, and found some long latencies in the core/cluster power-on sequence, so we would like to confirm the questions below:
- From our profiling results, we found that when core_A sends an IPI to core_B, it takes about 954us before core_B reaches the function bL_entry_point (or the function mcpm_entry_point in your later patches for mainline). That is a really long interval.
Is core_B in a different cluster than core_A? It is a known fact that powering up a cluster has far greater latency than simply pulling a core out of reset.
We are currently using the 13.01 firmware version (which supports the BX_ADDRx registers), so the cluster-level power-on sequence should be: a) the DCC detects nIRQOUT/nFIQOUT asserting; b) the DCC powers on the corresponding cluster; c) the core runs into the boot monitor code and finally uses the BX_ADDRx register to jump to the function *bL_entry_point*.
Since the above flow is a black box to us, we suspect the time is consumed by one of these steps; could you or the ARM folks help confirm this?
Those steps are a black box to me as well. I'll let the ARM guys answer your questions.
b) is what takes time to allow voltage to become stable.
Also remember that there might be an ongoing DVFS request for the other cluster and DVFS/power up are inherently serialized by the power controller, so you should expect a degree of variability on that.
- From reading the spec DAI0318D_v2p_ca15_a7_power_management.pdf and from confirmation by ARM support, we know there is only cluster-level power down, controlled by the CA15_PWRDN_EN/CA7_PWRDN_EN bits.
At the core level, we can NOT independently power off a core while other cores in the same cluster are still powered on. But this seems to conflict with TC2's power management code in tc2_pm.c.
We can see that the function *tc2_pm_down()* calls gic_cpu_if_down() to disable the GIC's CPU interface; that means the core can no longer receive interrupts and will run into WFI. After the core enters WFI, if the DCC/SPC detects an interrupt on the GIC's nIRQOUT/nFIQOUT pins, it will power on the core (or reset the core) to let it resume, after which software must re-enable the GIC's CPU interface for itself.
The questions here are: a) in *tc2_pm_down()*, after the core enters the WFI state, even though the DCC/SPC cannot power off the core when it is NOT the last man of the cluster, the DCC/SPC will still reset the core, right? b) how does the DCC/SPC decide whether the core wants to enter the C1 state or only the "WFI" state? Does it use the WAKE_INT_MASK bits as the flag?
b) is correct.
To be precise, the core will be reset upon IRQ. As long as there is no pending IRQ, the core stays in wfi and is stuck there.
Lorenzo
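In other words, "stuck there" amounts to something like the loop below (a sketch, not the actual tc2_pm code): once its teardown is done the core simply parks in WFI and only leaves via a reset from the DCC/SPC once an IRQ becomes pending.

static void park_until_reset(void)
{
        while (1)
                wfi();  /* a spurious WFI exit just loops back in;
                         * the real exit is the reset issued by the DCC/SPC */
}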
On 04/22/2013 11:48 PM, Lorenzo Pieralisi wrote:
On Fri, Apr 19, 2013 at 10:34:39PM +0100, Nicolas Pitre wrote:
[ Looping in Achin and Lorenzo ]
On Fri, 19 Apr 2013, Leo Yan wrote:
hi Nico & all,
We are doing some profiling of the low power modes on the TC2 board, and found some long latencies in the core/cluster power-on sequence, so we would like to confirm the questions below:
- From our profiling results, we found that when core_A sends an IPI to core_B, it takes about 954us before core_B reaches the function bL_entry_point (or the function mcpm_entry_point in your later patches for mainline). That is a really long interval.
Is core_B in a different cluster than core_A? It is a known fact that powering up a cluster has far greater latency than simply pulling a core out of reset.
We are currently using the 13.01 firmware version (which supports the BX_ADDRx registers), so the cluster-level power-on sequence should be: a) the DCC detects nIRQOUT/nFIQOUT asserting; b) the DCC powers on the corresponding cluster; c) the core runs into the boot monitor code and finally uses the BX_ADDRx register to jump to the function *bL_entry_point*.
Since the above flow is a black box to us, we suspect the time is consumed by one of these steps; could you or the ARM folks help confirm this?
Those steps are a black box to me as well. I'll let the ARM guys answer your questions.
b) is what takes time to allow voltage to become stable.
Also remember that there might be an ongoing DVFS request for the other cluster and DVFS/power up are inherently serialized by the power controller, so you should expect a degree of variability on that.
Fair enough, thanks for the explanation.
- From reading the spec DAI0318D_v2p_ca15_a7_power_management.pdf and from confirmation by ARM support, we know there is only cluster-level power down, controlled by the CA15_PWRDN_EN/CA7_PWRDN_EN bits.
At the core level, we can NOT independently power off a core while other cores in the same cluster are still powered on. But this seems to conflict with TC2's power management code in tc2_pm.c.
We can see that the function *tc2_pm_down()* calls gic_cpu_if_down() to disable the GIC's CPU interface; that means the core can no longer receive interrupts and will run into WFI. After the core enters WFI, if the DCC/SPC detects an interrupt on the GIC's nIRQOUT/nFIQOUT pins, it will power on the core (or reset the core) to let it resume, after which software must re-enable the GIC's CPU interface for itself.
The questions here are: a) in *tc2_pm_down()*, after the core enters the WFI state, even though the DCC/SPC cannot power off the core when it is NOT the last man of the cluster, the DCC/SPC will still reset the core, right? b) how does the DCC/SPC decide whether the core wants to enter the C1 state or only the "WFI" state? Does it use the WAKE_INT_MASK bits as the flag?
b) is correct.
To be precise, the core will be reset upon IRQ. As long as there is no pending IRQ, the core stays in wfi and is stuck there.
So this also explains why we can see the core exit from WFI with DSTREAM, even though in the software flow the core is usually reset.
Here is another question; let's use the A15 cluster with two cores (a15_core_0/1) as an example: if the cluster is powered off and there is a pending IRQ for a15_core_0, the power controller will power on the A15 cluster and a15_core_0. At this point, will a15_core_1 stay in the reset state until there is a pending IRQ for it?
Thx, Leo Yan
Hi Leo,
On Tue, Apr 23, 2013 at 02:25:46AM +0100, Leo Yan wrote:
On 04/22/2013 11:48 PM, Lorenzo Pieralisi wrote:
On Fri, Apr 19, 2013 at 10:34:39PM +0100, Nicolas Pitre wrote:
[ Looping in Achin and Lorenzo ]
On Fri, 19 Apr 2013, Leo Yan wrote:
hi Nico & all,
We are doing some profiling of the low power modes on the TC2 board, and found some long latencies in the core/cluster power-on sequence, so we would like to confirm the questions below:
- From our profiling results, we found that when core_A sends an IPI to core_B, it takes about 954us before core_B reaches the function bL_entry_point (or the function mcpm_entry_point in your later patches for mainline). That is a really long interval.
Is core_B in a different cluster than core_A? It is a known fact that powering up a cluster has far greater latency than simply pulling a core out of reset.
We are currently using the 13.01 firmware version (which supports the BX_ADDRx registers), so the cluster-level power-on sequence should be: a) the DCC detects nIRQOUT/nFIQOUT asserting; b) the DCC powers on the corresponding cluster; c) the core runs into the boot monitor code and finally uses the BX_ADDRx register to jump to the function *bL_entry_point*.
Since the above flow is a black box to us, we suspect the time is consumed by one of these steps; could you or the ARM folks help confirm this?
Those steps are a black box to me as well. I'll let the ARM guys answer your questions.
b) is what takes time to allow voltage to become stable.
Also remember that there might be an ongoing DVFS request for the other cluster and DVFS/power up are inherently serialized by the power controller, so you should expect a degree of variability on that.
Fair enough, thanks for the explanation.
- From reading the spec DAI0318D_v2p_ca15_a7_power_management.pdf and from confirmation by ARM support, we know there is only cluster-level power down, controlled by the CA15_PWRDN_EN/CA7_PWRDN_EN bits.
At the core level, we can NOT independently power off a core while other cores in the same cluster are still powered on. But this seems to conflict with TC2's power management code in tc2_pm.c.
We can see that the function *tc2_pm_down()* calls gic_cpu_if_down() to disable the GIC's CPU interface; that means the core can no longer receive interrupts and will run into WFI. After the core enters WFI, if the DCC/SPC detects an interrupt on the GIC's nIRQOUT/nFIQOUT pins, it will power on the core (or reset the core) to let it resume, after which software must re-enable the GIC's CPU interface for itself.
The questions here are: a) in *tc2_pm_down()*, after the core enters the WFI state, even though the DCC/SPC cannot power off the core when it is NOT the last man of the cluster, the DCC/SPC will still reset the core, right? b) how does the DCC/SPC decide whether the core wants to enter the C1 state or only the "WFI" state? Does it use the WAKE_INT_MASK bits as the flag?
b) is correct.
To be precise, the core will be reset upon IRQ. As long as there is no pending IRQ, the core stays in wfi and is stuck there.
So this also explains why we can see the core exit from WFI with DSTREAM, even though in the software flow the core is usually reset.
Here is another question; let's use the A15 cluster with two cores (a15_core_0/1) as an example: if the cluster is powered off and there is a pending IRQ for a15_core_0, the power controller will power on the A15 cluster and a15_core_0. At this point, will a15_core_1 stay in the reset state until there is a pending IRQ for it?
The power controller brings all A15 cores out of reset when the A15 cluster is powered up. a15_core_0 will resume execution in Linux after picking up the entry point from its BX_ADDR register. a15_core_1 will:
1. Enter wfi in the bootloader if its BX_ADDR register is empty. This will happen if it had been hotplugged out earlier.
2. Enter Linux like a15_core_0 if there is an address in its BX_ADDR register. This will happen if it had been suspended through cpuidle earlier.
hth, Achin
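Expressed as C for illustration only (the real boot monitor is firmware and not public, and whether it re-polls BX_ADDRx after each wake-up event is an assumption here), the decision Achin describes is roughly:

#include <stdint.h>

static void boot_monitor_secondary(volatile uint32_t *bx_addr)
{
        void (*entry)(void);

        /* hotplugged-out case: BX_ADDRx is empty, so park in wfi and
         * look again after each wake-up event */
        while ((entry = (void (*)(void))(uintptr_t)*bx_addr) == 0)
                __asm__ volatile("wfi");

        /* cpuidle-resume case: BX_ADDRx already holds the kernel entry
         * point (bL_entry_point / mcpm_entry_point), so jump to it */
        entry();
}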