Hi Andrey,
Please pull big-LITTLE-MP-master-v12 with the following updates:
- Based on v3.7-rc5
- Stats:
  - Total Patches: 62
  - New Patches: 1
    - genirq: Add default affinity mask command line option in misc-patches branch
    - top 3 patches in: sched-pack-small-tasks-v1
    - top 2 patches in: task-placement-v2
    - additional patch in: config-fragments
  - Dropped patches/branches (as they are managed in experimental merge branch): 20
    - patches in per-entity-load-tracking-with-core-sched-v1: 15
  - Updated Patches: 0
---------------------x--------------------------x-----------------------
The following changes since commit 77b67063bb6bce6d475e910d3b886a606d0d91f7:
Linux 3.7-rc5 (2012-11-11 13:44:33 +0100)
are available in the git repository at:
git://git.linaro.org/arm/big.LITTLE/mp.git big-LITTLE-MP-master-v12
for you to fetch changes up to f942092bd1008de7379b4a52d38dc03de5949fc8:
Merge branches 'arm-multi_pmu_v2', 'hw-bkp-v7.1-debug-v1', 'task-placement-v2', 'misc-patches', 'config-fragments' and 'sched-pack-small-tasks-v1' into big-LITTLE-MP-master-v12-v2 (2012-11-17 09:29:41 +0530)
----------------------------------------------------------------
Ben Segall (1):
      sched: Maintain per-rq runnable averages

Chris Redpath (1):
      ARM: Experimental Frequency-Invariant Load Scaling Patch

Dietmar Eggemann (1):
      ARM: hw_breakpoint: v7.1 self-hosted debug powerdown support

Jon Medhurst (1):
      ARM: sched: Avoid empty 'slow' HMP domain

Liviu Dudau (2):
      Revert "sched: secure access to other CPU statistics"
      linaro/configs: big-LITTLE-MP: Enable the new tunable sysfs interface by default.

Lorenzo Pieralisi (1):
      ARM: kernel: provide cluster to logical cpu mask mapping API

Marc Zyngier (1):
      ARM: perf: add guest vs host discrimination

Mark Rutland (1):
      ARM: perf: register cpu_notifier at driver init

Morten Rasmussen (15):
      sched: entity load-tracking load_avg_ratio
      sched: Task placement for heterogeneous systems based on task load-tracking
      sched: Forced task migration on heterogeneous systems
      sched: Introduce priority-based task migration filter
      ARM: Add HMP scheduling support for ARM architecture
      ARM: sched: Use device-tree to provide fast/slow CPU list for HMP
      ARM: sched: Setup SCHED_HMP domains
      sched: Add ftrace events for entity load-tracking
      sched: Add HMP task migration ftrace event
      sched: SCHED_HMP multi-domain task migration control
      sched: Enable HMP priority filter by default
      sched: Only down migrate low priority tasks if allowed by affinity mask
      linaro/configs: Enable HMP priority filter by default
      sched: SD_SHARE_POWERLINE buddy selection fix
      ARM: TC2: Re-enable SD_SHARE_POWERLINE

Olivier Cozette (1):
      ARM: Change load tracking scale using sysfs

Paul Turner (15):
      sched: Track the runnable average on a per-task entity basis
      sched: Aggregate load contributed by task entities on parenting cfs_rq
      sched: Maintain the load contribution of blocked entities
      sched: Add an rq migration call-back to sched_class
      sched: Account for blocked load waking back up
      sched: Aggregate total task_group load
      sched: Compute load contribution by a group entity
      sched: Normalize tg load contributions against runnable time
      sched: Maintain runnable averages across throttled periods
      sched: Replace update_shares weight distribution with per-entity computation
      sched: Refactor update_shares_cpu() -> update_blocked_avgs()
      sched: Update_cfs_shares at period edge
      sched: Make __update_entity_runnable_avg() fast
      sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking
      sched: implement usage tracking

Peter Zijlstra (1):
      sched: Describe CFS load-balancer

Sudeep KarkadaNagesha (9):
      ARM: perf: allocate CPU PMU dynamically at probe time
      ARM: perf: consistently use struct perf_event in arm_pmu functions
      ARM: perf: check ARMv7 counter validity on a per-pmu basis
      ARM: perf: replace global CPU PMU pointer with per-cpu pointers
      ARM: perf: register CPU PMUs with idr types
      ARM: perf: set cpu affinity to support multiple PMUs
      ARM: perf: set cpu affinity for the irqs correctly
      ARM: perf: remove spaces in CPU PMU names
      ARM: perf: save/restore pmu registers in pm notifier

Thomas Gleixner (1):
      genirq: Add default affinity mask command line option

Vincent Guittot (5):
      sched: add a new SD SHARE_POWERLINE flag for sched_domain
      sched: pack small tasks
      sched: secure access to other CPU statistics
      sched: pack the idle load balance
      ARM: sched: clear SD_SHARE_POWERLINE

Viresh Kumar (5):
      Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking"
      configs: Add config fragments for big LITTLE MP
      linaro/configs: Update big LITTLE MP fragment for task placement work
      config-frag/big-LITTLE: Use device-tree to provide fast/slow CPU list for HMP
      Merge branches 'arm-multi_pmu_v2', 'hw-bkp-v7.1-debug-v1', 'task-placement-v2', 'misc-patches', 'config-fragments' and 'sched-pack-small-tasks-v1' into big-LITTLE-MP-master-v12-v2

Will Deacon (2):
      ARM: perf: return NOTIFY_DONE from cpu notifier when no available PMU
      ARM: perf: consistently use arm_pmu->name for PMU name
 Documentation/devicetree/bindings/arm/pmu.txt |    3 +
 Documentation/kernel-parameters.txt           |    9 +
 arch/arm/Kconfig                              |   85 ++
 arch/arm/include/asm/perf_event.h             |    5 +
 arch/arm/include/asm/pmu.h                    |   40 +-
 arch/arm/include/asm/topology.h               |   34 +
 arch/arm/kernel/hw_breakpoint.c               |   57 +
 arch/arm/kernel/perf_event.c                  |  103 +-
 arch/arm/kernel/perf_event_cpu.c              |  169 ++-
 arch/arm/kernel/perf_event_v6.c               |  130 +-
 arch/arm/kernel/perf_event_v7.c               |  295 ++--
 arch/arm/kernel/perf_event_xscale.c           |  161 +-
 arch/arm/kernel/topology.c                    |  125 ++
 arch/ia64/include/asm/topology.h              |    1 +
 arch/tile/include/asm/topology.h              |    1 +
 include/linux/sched.h                         |   29 +
 include/linux/topology.h                      |    3 +
 include/trace/events/sched.h                  |  153 ++
 kernel/irq/irqdesc.c                          |   21 +-
 kernel/sched/core.c                           |   16 +
 kernel/sched/debug.c                          |   39 +-
 kernel/sched/fair.c                           | 1942 ++++++++++++++++++++++---
 kernel/sched/sched.h                          |   65 +-
 linaro/configs/big-LITTLE-MP.conf             |   13 +
 24 files changed, 2943 insertions(+), 556 deletions(-)
 create mode 100644 linaro/configs/big-LITTLE-MP.conf
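For anyone replaying this request locally, a minimal shell sketch of fetching the branch and checking that its tip is the commit named above might look like the following. The function name verify_pull is made up for illustration; the URL, branch and commit ID are the ones quoted in the request, and the repository may no longer be reachable at that address.

```shell
# verify_pull URL REF EXPECTED_SHA
# Hypothetical helper (not part of the thread): fetch REF from URL into
# FETCH_HEAD and check that it resolves to EXPECTED_SHA.
# Must be run inside an existing git work tree.
verify_pull() {
    url=$1
    ref=$2
    expected=$3

    # Fetch only the requested ref; the result is recorded in FETCH_HEAD.
    git fetch "$url" "$ref" || return 1

    # ^{commit} peels an annotated tag down to the commit it points at.
    actual=$(git rev-parse --verify 'FETCH_HEAD^{commit}') || return 1

    if [ "$actual" = "$expected" ]; then
        echo "tip matches: $actual"
    else
        echo "tip mismatch: got $actual, expected $expected" >&2
        return 1
    fi
}
```

It would be called as `verify_pull git://git.linaro.org/arm/big.LITTLE/mp.git big-LITTLE-MP-master-v12 f942092bd1008de7379b4a52d38dc03de5949fc8`. After a successful check, `git shortlog` and `git diff --stat` over the range `77b67063bb6bce6d475e910d3b886a606d0d91f7..FETCH_HEAD` should give a summary close to the one above.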
On Sun, 2012-11-18 at 10:40 +0530, Viresh Kumar wrote:
> Hi Andrey,
>
> Please pull big-LITTLE-MP-master-v12 with following updates:
> [...]
This version increases Android boot time by almost a factor of 3, from 91 seconds to 257 seconds, compared with the version of the master-v12 branch created on Nov 15th. Looking at the differences in the code, one obvious thing which stands out is that big-LITTLE-MP.conf now has:
  CONFIG_HMP_VARIABLE_SCALE=y
  CONFIG_HMP_FREQUENCY_INVARIANT_SCALE=y
If I remove this then boot time goes back to 90 seconds.
Also, if I build without big-LITTLE-MP.conf then I get a build error:
  kernel/sched/fair.c: In function 'update_entity_load_avg':
  kernel/sched/fair.c:1469:26: error: 'struct sched_entity' has no member named 'cfs_rq'
-- Tixy
Hi Tixy,
These patches are mine and Olivier's, can you tell me what your config is to see even a 90s boot? I haven't noticed any boot-time extension on my system, but I'm booting from the A15 cluster. I'd like to reproduce your system here.
The build error is my mistake, I need to know which CPU a task is on and it looks like I have missed a dependency when I've changed the calling code. I will sort out a patch for that asap.
Best Regards, Chris
-----Original Message-----
From: Jon Medhurst (Tixy) [mailto:tixy@linaro.org]
Sent: 19 November 2012 15:41
To: Viresh Kumar
Cc: Andrey Konovalov; PDSW-power-team; Lists linaro-dev
Subject: Re: [GIT PULL]; big LITTLE MP master v12
linaro-dev mailing list linaro-dev@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-dev
-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
On Mon, 2012-11-19 at 15:57 +0000, Chris Redpath wrote:
> These patches are mine and Olivier's, can you tell me what your config is to see even a 90s boot? I haven't noticed any boot-time extension on my system, but I'm booting from the A15 cluster. I'd like to reproduce your system here.
The config Linaro uses for Android on vexpress is generated by the command:
  ARCH=arm scripts/kconfig/merge_config.sh \
      linaro/configs/linaro-base.conf \
      linaro/configs/android.conf \
      linaro/configs/big-LITTLE-MP.conf \
      linaro/configs/vexpress.conf
I've pushed the kernel tree I had to my personal git area:
http://git.linaro.org/gitweb?p=people/tixy/kernel.git%3Ba=shortlog%3Bh=refs/heads/integration-android-vexpress
For the Android userside I'm using the latest daily build:
https://android-build.linaro.org/builds/~linaro-android/vexpress-jb-gcc47-armlt-tracking-open/#build=104
An Android image always takes a lot longer to boot the first time, so before doing any timing, boot a fresh image once and wait for it to settle down (the power/freq status LEDs on the TC2 coretile are a good indicator of when the system finally goes mostly idle).
> The build error is my mistake, I need to know which CPU a task is on and it looks like I have missed a dependency when I've changed the calling code. I will sort out a patch for that asap.
Even the 90 second boot seemed very long from what I remember, which is why I was also trying to build without any big.LITTLE MP configured; I was going to try different configs to see if I can narrow the slowness down.
-- Tixy
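The comparison described earlier in the thread (boot time with and without the HMP scaling options) could be repeated without editing big-LITTLE-MP.conf, by passing one extra fragment last. This is only a sketch: the fragment name hmp-scale-off.conf is made up, it disables both options (the thread does not establish which of them matters), and it assumes merge_config.sh lets a later fragment override a value set by an earlier one, including the "# CONFIG_... is not set" form.

```shell
# Hypothetical override fragment (file name is an assumption, not from the thread).
cat > hmp-scale-off.conf <<'EOF'
# CONFIG_HMP_VARIABLE_SCALE is not set
# CONFIG_HMP_FREQUENCY_INVARIANT_SCALE is not set
EOF

# Only attempt the merge when run from the top of a kernel tree.
if [ -x scripts/kconfig/merge_config.sh ]; then
    # Same fragment list as in the message above, with the override appended last.
    ARCH=arm scripts/kconfig/merge_config.sh \
        linaro/configs/linaro-base.conf \
        linaro/configs/android.conf \
        linaro/configs/big-LITTLE-MP.conf \
        linaro/configs/vexpress.conf \
        hmp-scale-off.conf
else
    echo "not in a kernel tree; wrote hmp-scale-off.conf only"
fi
```

Running `grep HMP_ .config` afterwards shows whether the override took effect before the kernel is rebuilt and the boot is timed again.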
Hi Tixy,
I've got your kernel and the JB filesystem modified for USB booting here. There is no boot delay for me - it takes just over 70s from boot monitor to mouse pointer.
Catch you in the morning :)
Chris
-----Original Message-----
From: Jon Medhurst (Tixy) [mailto:tixy@linaro.org]
Sent: 19 November 2012 16:16
To: Chris Redpath
Cc: Viresh Kumar; Andrey Konovalov; PDSW-power-team; Lists linaro-dev
Subject: Re: [GIT PULL]; big LITTLE MP master v12
Viresh,
I won't pull the big-LITTLE-MP-master-v12 into the linux-linaro-core-tracking tree today due to the issues found by Tixy.
Tomorrow evening I am going to pull this topic anyway - whether these issues are resolved, or not. If the build error is not fixed by Thursday morning UTC, I'll move llct back to v11. Would it work for the Landing Teams? Tixy?
Thanks, Andrey
On 11/19/2012 07:57 PM, Chris Redpath wrote:
Hi Tixy,
These patches are mine and Olivier's, can you tell me what your config is to see even a 90s boot? I haven't noticed any boot-time extension on my system, but I'm booting from the A15 cluster. I'd like to reproduce your system here.
The build error is my mistake, I need to know which CPU a task is on and it looks like I have missed a dependency when I've changed the calling code. I will sort out a patch for that asap.
Best Regards, Chris
-----Original Message----- From: Jon Medhurst (Tixy) [mailto:tixy@linaro.org] Sent: 19 November 2012 15:41 To: Viresh Kumar Cc: Andrey Konovalov; PDSW-power-team; Lists linaro-dev Subject: Re: [GIT PULL]; big LITTLE MP master v12
On Sun, 2012-11-18 at 10:40 +0530, Viresh Kumar wrote:
Hi Andrey,
Please pull big-LITTLE-MP-master-v12 with following updates:
- Based on v3.7-rc5 - Stats: - Total Patches: 62 - New Patches: 1 - genirq: Add default affinity mask command line option in
misc-patches branch - top 3 patches in: sched-pack-small-tasks-v1 - top 2 patches in: task-placement-v2 - additional patch in: config-fragments - Dropped patches/branches (as they are managed in experimental merge branch): 20 - patches in per-entity-load-tracking-with-core-sched-v1: 15 - Updated Patches: 0
This version increases Android boot time by a factor of 3, from 91 seconds to 257 seconds, this is comparing it with the version of the master-v12 branch created on Nov 15th. Looking at the differences in the code, one obvious thing which stands out is big-LITTLE-MP.conf now has:
CONFIG_HMP_VARIABLE_SCALE=y CONFIG_HMP_FREQUENCY_INVARIANT_SCALE=y
If I remove this then boot time goes back to 90 seconds.
Also, if I build without big-LITTLE-MP.conf the I get a build error:
kernel/sched/fair.c: In function 'update_entity_load_avg': kernel/sched/fair.c:1469:26: error: 'struct sched_entity' has no member named 'cfs_rq'
-- Tixy
---------------------x--------------------------x--------------------
The following changes since commit
77b67063bb6bce6d475e910d3b886a606d0d91f7:
Linux 3.7-rc5 (2012-11-11 13:44:33 +0100)
are available in the git repository at:
git://git.linaro.org/arm/big.LITTLE/mp.git big-LITTLE-MP-master-v12
for you to fetch changes up to
f942092bd1008de7379b4a52d38dc03de5949fc8:
Merge branches 'arm-multi_pmu_v2', 'hw-bkp-v7.1-debug-v1', 'task-placement-v2', 'misc-patches', 'config-fragments' and 'sched-pack-small-tasks-v1' into big-LITTLE-MP-master-v12-v2 (2012-11-17 09:29:41 +0530)
Ben Segall (1): sched: Maintain per-rq runnable averages
Chris Redpath (1): ARM: Experimental Frequency-Invariant Load Scaling Patch
Dietmar Eggemann (1): ARM: hw_breakpoint: v7.1 self-hosted debug powerdown support
Jon Medhurst (1): ARM: sched: Avoid empty 'slow' HMP domain
Liviu Dudau (2): Revert "sched: secure access to other CPU statistics" linaro/configs: big-LITTLE-MP: Enable the new tunable sysfs interface by default.
Lorenzo Pieralisi (1): ARM: kernel: provide cluster to logical cpu mask mapping API
Marc Zyngier (1): ARM: perf: add guest vs host discrimination
Mark Rutland (1): ARM: perf: register cpu_notifier at driver init
Morten Rasmussen (15): sched: entity load-tracking load_avg_ratio sched: Task placement for heterogeneous systems based on task load-tracking sched: Forced task migration on heterogeneous systems sched: Introduce priority-based task migration filter ARM: Add HMP scheduling support for ARM architecture ARM: sched: Use device-tree to provide fast/slow CPU list for
HMP
      ARM: sched: Setup SCHED_HMP domains
      sched: Add ftrace events for entity load-tracking
      sched: Add HMP task migration ftrace event
      sched: SCHED_HMP multi-domain task migration control
      sched: Enable HMP priority filter by default
      sched: Only down migrate low priority tasks if allowed by affinity mask
      linaro/configs: Enable HMP priority filter by default
      sched: SD_SHARE_POWERLINE buddy selection fix
      ARM: TC2: Re-enable SD_SHARE_POWERLINE

Olivier Cozette (1):
      ARM: Change load tracking scale using sysfs

Paul Turner (15):
      sched: Track the runnable average on a per-task entity basis
      sched: Aggregate load contributed by task entities on parenting cfs_rq
      sched: Maintain the load contribution of blocked entities
      sched: Add an rq migration call-back to sched_class
      sched: Account for blocked load waking back up
      sched: Aggregate total task_group load
      sched: Compute load contribution by a group entity
      sched: Normalize tg load contributions against runnable time
      sched: Maintain runnable averages across throttled periods
      sched: Replace update_shares weight distribution with per-entity computation
      sched: Refactor update_shares_cpu() -> update_blocked_avgs()
      sched: Update_cfs_shares at period edge
      sched: Make __update_entity_runnable_avg() fast
      sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking
      sched: implement usage tracking

Peter Zijlstra (1):
      sched: Describe CFS load-balancer

Sudeep KarkadaNagesha (9):
      ARM: perf: allocate CPU PMU dynamically at probe time
      ARM: perf: consistently use struct perf_event in arm_pmu functions
      ARM: perf: check ARMv7 counter validity on a per-pmu basis
      ARM: perf: replace global CPU PMU pointer with per-cpu pointers
      ARM: perf: register CPU PMUs with idr types
      ARM: perf: set cpu affinity to support multiple PMUs
      ARM: perf: set cpu affinity for the irqs correctly
      ARM: perf: remove spaces in CPU PMU names
      ARM: perf: save/restore pmu registers in pm notifier

Thomas Gleixner (1):
      genirq: Add default affinity mask command line option

Vincent Guittot (5):
      sched: add a new SD SHARE_POWERLINE flag for sched_domain
      sched: pack small tasks
      sched: secure access to other CPU statistics
      sched: pack the idle load balance
      ARM: sched: clear SD_SHARE_POWERLINE

Viresh Kumar (5):
      Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking"
      configs: Add config fragments for big LITTLE MP
      linaro/configs: Update big LITTLE MP fragment for task placement work
      config-frag/big-LITTLE: Use device-tree to provide fast/slow CPU list for HMP
      Merge branches 'arm-multi_pmu_v2', 'hw-bkp-v7.1-debug-v1', 'task-placement-v2', 'misc-patches', 'config-fragments' and 'sched-pack-small-tasks-v1' into big-LITTLE-MP-master-v12-v2

Will Deacon (2):
      ARM: perf: return NOTIFY_DONE from cpu notifier when no available PMU
      ARM: perf: consistently use arm_pmu->name for PMU name
 Documentation/devicetree/bindings/arm/pmu.txt |    3 +
 Documentation/kernel-parameters.txt           |    9 +
 arch/arm/Kconfig                              |   85 ++
 arch/arm/include/asm/perf_event.h             |    5 +
 arch/arm/include/asm/pmu.h                    |   40 +-
 arch/arm/include/asm/topology.h               |   34 +
 arch/arm/kernel/hw_breakpoint.c               |   57 +
 arch/arm/kernel/perf_event.c                  |  103 +-
 arch/arm/kernel/perf_event_cpu.c              |  169 ++-
 arch/arm/kernel/perf_event_v6.c               |  130 +-
 arch/arm/kernel/perf_event_v7.c               |  295 ++--
 arch/arm/kernel/perf_event_xscale.c           |  161 +-
 arch/arm/kernel/topology.c                    |  125 ++
 arch/ia64/include/asm/topology.h              |    1 +
 arch/tile/include/asm/topology.h              |    1 +
 include/linux/sched.h                         |   29 +
 include/linux/topology.h                      |    3 +
 include/trace/events/sched.h                  |  153 ++
 kernel/irq/irqdesc.c                          |   21 +-
 kernel/sched/core.c                           |   16 +
 kernel/sched/debug.c                          |   39 +-
 kernel/sched/fair.c                           | 1942 ++++++++++++++++++++++---
 kernel/sched/sched.h                          |   65 +-
 linaro/configs/big-LITTLE-MP.conf             |   13 +
 24 files changed, 2943 insertions(+), 556 deletions(-)
 create mode 100644 linaro/configs/big-LITTLE-MP.conf
linaro-dev mailing list linaro-dev@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-dev
-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
On Mon, 2012-11-19 at 21:14 +0400, Andrey Konovalov wrote:
Viresh,
I won't pull the big-LITTLE-MP-master-v12 into the linux-linaro-core-tracking tree today due to the issues found by Tixy.
Tomorrow evening I am going to pull this topic anyway - whether these issues are resolved, or not. If the build error is not fixed by Thursday morning UTC, I'll move llct back to v11. Would it work for the Landing Teams? Tixy?
The timescales seem a bit wrong for that, working backwards...
- The monthly release is made from linux-linaro's state at end of Thursday.
- You need to merge Landing Team's topics in before then.
- Landing Teams need to prepare their topics based on a given llct.
- To prepare their topics, Landing Teams need to be able to compile their kernels.
So I would say that the LTs really need a final working llct tomorrow. I could manage OK getting this on Wednesday, but I don't know about other teams.
That would then mean the monthly release candidate build comes from a tree whose contents have never been built together before that day, so it's trusting to luck somewhat.
Last month llct was created by the Monday morning so LTs could base their branches on that and have them merged into linux-linaro by the end of Tuesday. We then had two days to fix problems before Thursday's code cutoff.
On 11/19/2012 10:17 PM, Jon Medhurst (Tixy) wrote:
The timescales seem a bit wrong for that, working backwards...
That's correct..
I'll push an updated llct tree with the v11 big-LITTLE-MP topic tonight. If there is a fix for the build error by tomorrow, I can push one more llct update using the updated master v12 version. Is the boot delay issue a show-stopper? If yes, we could just stick to v11 for this cycle.
Thanks, Andrey
On Mon, 2012-11-19 at 22:24 +0400, Andrey Konovalov wrote:
I'll push an updated llct tree with the v11 big-LITTLE-MP topic tonight. If there is a fix for the build error by tomorrow, I can push one more llct update using the updated master v12 version.
That sounds like a sensible plan :-)
Is the boot delay issue a show-stopper?
It's not a show-stopper. It only manifests with a new config option in the big-LITTLE-MP config, so it doesn't impact any board other than vexpress, and if required I could override it in my tree or we could add a simple patch to linux-linaro later.
Someone else can't reproduce the problem so the slowness could be user error on my part. (The build failure problem is definitely real though :-)
On 11/19/2012 10:42 PM, Jon Medhurst (Tixy) wrote:
That sounds like a sensible plan :-)
llct-20121120.0 has been pushed to g.l.o:
- v3.7-rc6 based,
- the same v11 big-LITTLE-MP topic,
- configs topic renamed to core-configs,
- basic-board-configs topic added,
- devfreq topic added,
- "KBuild: Allow scripts/* to be cross compiled" patch added to llct-v3.7-misc-fixes topic
Is the boot delay issue a show-stopper?
It's not a show-stopper. It only manifests with a new config option in the big-LITTLE-MP config, so it doesn't impact any board other than vexpress, and if required I could override it in my tree or we could add a simple patch to linux-linaro later.
Someone else can't reproduce the problem so the slowness could be user error on my part. (The build failure problem is definitely real though :-)
OK :)
Thanks, Andrey
On 19 November 2012 22:44, Andrey Konovalov andrey.konovalov@linaro.org wrote:
I won't pull the big-LITTLE-MP-master-v12 into the linux-linaro-core-tracking tree today due to the issues found by Tixy.
Tomorrow evening I am going to pull this topic anyway - whether these issues are resolved, or not. If the build error is not fixed by Thursday morning UTC, I'll move llct back to v11. Would it work for the Landing Teams? Tixy?
Hi Andrey,
I have updated master-v12 branch with fixes from tixy and ARM. You can PULL it now :)
-- viresh
On Tue, 2012-11-20 at 13:00 +0530, Viresh Kumar wrote:
I have updated master-v12 branch with fixes from tixy and ARM. You can PULL it now :)
I've redone my vexpress test integration branch [1] based on the new big-LITTLE-MP-master-v12 and can confirm that an Android kernel builds and boots on TC2 and A9 CoreTiles, both with and without the big-LITTLE-MP.conf.
What shall we do about the boot time and possibly other performance regressions? I can disable CONFIG_HMP_VARIABLE_SCALE in vexpress builds...
On 20 November 2012 15:40, Jon Medhurst (Tixy) tixy@linaro.org wrote:
I've redone my vexpress test integration branch [1] based on the new big-LITTLE-MP-master-v12 and can confirm that an Android kernel builds and boots on TC2 and A9 CoreTiles, both with and without the big-LITTLE-MP.conf.
Thanks.
What shall we do about the boot time and possibly other performance regressions? I can disable CONFIG_HMP_VARIABLE_SCALE in vexpress builds...
I haven't done anything for this, as Chris reported he isn't facing this issue. @Chris?
-- viresh
Well, I didn't find any problem with USB boot. I'm just setting up u-boot to go for a completely standard MMC boot and I'll continue to try to reproduce.
Tixy, could you try temporarily changing the default option to off for the frequency-invariant load?
I think it'll be at line 3631 in your fair.c
hmp_data.freqinvar_load_scale_enabled = 1;
Change it to 0 to change the default.
Basically, there are two feature changes which are both disabled when you disable CONFIG_HMP_VARIABLE_SCALE and I'd like to find out which is responsible.
The CONFIG_HMP_VARIABLE_SCALE control is responsible for a change which lets us 'stretch time' during load accumulation so that we can retain the optimised math routines in Paul's patch but also control the rate of accumulation of load.
On top of that, using the same sysfs code, is CONFIG_HMP_FREQUENCY_INVARIANT_SCALE, which uses a PM notifier to change a scale factor so that the recorded loads are expressed in terms of the potential CPU capacity rather than the instantaneous capacity. This has the effect of stopping HMP up-migrations from happening while the frequency of the A7 cluster is still low.
I can't think of any reason why either of these changes should result in increased boot time other than some odd interaction between DVFS and IO wait time which we haven't seen anywhere yet. Even then, it seems strange that we could get such a slowdown. Presumably the IO itself wouldn't get slower, and over the course of a whole boot the CPUs are not very busy anyway.
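For readers following along, the effect of the frequency-invariant scale factor can be sketched as below. This is a toy model in Python, not the kernel code: it assumes the scale factor is simply current frequency divided by maximum frequency, and the load range, the threshold and the example frequencies are hypothetical values chosen for illustration.

```python
# Illustrative model only. LOAD_MAX, UP_THRESHOLD and the function names are
# assumptions for this sketch, not values or symbols from the HMP patches.
LOAD_MAX = 1023       # hypothetical full-scale load of an always-busy task
UP_THRESHOLD = 700    # hypothetical load above which a task migrates up


def freq_invariant_load(raw_load, cur_khz, max_khz):
    """Express a load measured at cur_khz in terms of capacity at max_khz."""
    if not 0 < cur_khz <= max_khz:
        raise ValueError("expected 0 < cur_khz <= max_khz")
    return raw_load * cur_khz // max_khz


def wants_up_migration(raw_load, cur_khz, max_khz, invariant=True):
    """Decide on up-migration with or without the frequency-invariant scale."""
    if invariant:
        load = freq_invariant_load(raw_load, cur_khz, max_khz)
    else:
        load = raw_load
    return load > UP_THRESHOLD


# A task that keeps a little core fully busy while the cluster sits at a
# low operating point (example: 350 MHz out of a 1 GHz maximum).
print(freq_invariant_load(LOAD_MAX, 350_000, 1_000_000))                # 358
print(wants_up_migration(LOAD_MAX, 350_000, 1_000_000, invariant=False))  # True
print(wants_up_migration(LOAD_MAX, 350_000, 1_000_000, invariant=True))   # False
```

In this model a fully busy task is only allowed to migrate up once the little cluster's frequency has been raised far enough, which is the coupling between migration and DVFS described above.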
Regards, Chris
Hi Tixy,
It seems that at least on my board, the presence of so many MMC timeouts causes the A7s to remain at the lowest frequency, presumably because the system load average is so low. With the frequency-invariant patch, this has the side effect that no tasks ever get to run elsewhere.
Do you get lots of these?
[ 529.490615] mmcblk0: retrying using single block transfer
[ 529.591009] mmcblk0: error -5 transferring data, sector 845208, nr 24, cmd response 0x0, card status 0x900
Regards, Chris
On Tue, 2012-11-20 at 11:31 +0000, Chris Redpath wrote:
Do you get lots of these?
[ 529.490615] mmcblk0: retrying using single block transfer
[ 529.591009] mmcblk0: error -5 transferring data, sector 845208, nr 24, cmd response 0x0, card status 0x900
No, because we have fixed IOFPGA firmware. The 'Firmware Update' tab on http://releases.linaro.org/12.10/android/vexpress links to ARM's site to get this.
Our kernel commandline also has "mmci.fmax=12000000" to select the fastest clock speed possible.
On Tue, 2012-11-20 at 10:23 +0000, Chris Redpath wrote:
Well, I didn't find any problem with USB boot. I'm just setting up u-boot to go for a completely standard MMC boot and I'll continue to try to reproduce.
I use ARM's bootmonitor for booting. My setup should have the same firmware and boot process as results from the instructions at: http://releases.linaro.org/12.10/android/vexpress
Tixy, could you try temporarily changing the default option to off for the frequency-invariant load?
I think it'll be at line 3631 in your fair.c
hmp_data.freqinvar_load_scale_enabled = 1;
Change it to 0 to change the default.
That change restores boot time to 90 seconds down from 5-ish minutes.
I'll try booting from a USB stick instead of MMC...
On Tue, 2012-11-20 at 11:32 +0000, Jon Medhurst (Tixy) wrote:
That change restores boot time to 90 seconds down from 5-ish minutes.
I'll try booting from a USB stick instead of MMC...
I cloned my MMC card onto a USB stick and updated init.partitions.rc in my initrd to mount sdaX rather than mmcblk0pX. Here are the boot times I get:
With hmp_data.freqinvar_load_scale_enabled = 1
49s, 46s, 46s
With hmp_data.freqinvar_load_scale_enabled = 0
39s, 40s, 37s
So whilst the difference is nothing like what I see with MMC, there does still seem to be a significant effect.
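To put a number on the USB-boot difference, the three runs quoted above can be averaged. This is only arithmetic on the figures reported in this message; three samples per configuration is too few to say much about variance.

```python
from statistics import mean

# Boot times in seconds, copied from the measurements above.
enabled = [49, 46, 46]     # hmp_data.freqinvar_load_scale_enabled = 1
disabled = [39, 40, 37]    # hmp_data.freqinvar_load_scale_enabled = 0

mean_on = mean(enabled)
mean_off = mean(disabled)
slowdown = (mean_on - mean_off) / mean_off

print(f"enabled:  {mean_on:.1f}s")
print(f"disabled: {mean_off:.1f}s")
print(f"slowdown: {slowdown:.0%}")
```

That works out to roughly 47s against roughly 38.7s, so booting from USB is about a fifth slower with the frequency-invariant scaling enabled.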
On the slow-booting config, the 'Geiger counter' sound from the power circuitry does sound a bit harsher; perhaps the pattern of cpuidling is different. That could be my imagination, or a symptom rather than a cause...
Hi Tixy,
I can now reproduce it with MMC too, and the times are pretty much the same for me :(
There is definitely some interaction with CPUFreq: the A7s stay at a low enough OPP that the busy threads during boot are never able to accumulate enough load to migrate up to an A15. Normally, it makes no sense for a thread classified as 'heavy' at 350MHz to move to an A15, since it may very well not be a heavy thread at 1GHz. This is the expected behaviour of the patch: to bring HMP migration in line with cpufreq DVFS changes.
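The "never able to accumulate enough load" behaviour can be illustrated with a small simulation. It is a sketch under stated assumptions, not the scheduler's arithmetic: it uses the geometric decay from the per-entity load-tracking series (a factor y per roughly 1ms period, with y^32 = 0.5), starts from a task with an idle history, scales each period's contribution by a frequency ratio, and compares against a hypothetical up-migration threshold of 0.7.

```python
# Decay factor per period, chosen so that Y**32 == 0.5.
Y = 0.5 ** (1 / 32)


def periods_to_reach(threshold, freq_scale, max_periods=1000):
    """Count busy periods until the normalised load exceeds threshold.

    freq_scale is cur_freq / max_freq (an assumed form of the scale factor).
    Returns None if the threshold is not exceeded within max_periods.
    """
    load = 0.0  # normalised so that an always-busy task at full speed tends to 1.0
    for n in range(1, max_periods + 1):
        # Old history decays by Y; the new busy period adds a scaled share.
        load = load * Y + freq_scale * (1 - Y)
        if load > threshold:
            return n
    return None


print(periods_to_reach(0.7, 1.0))    # 56 periods without frequency scaling
print(periods_to_reach(0.7, 0.35))   # None: load converges to 0.35
```

With the contribution scaled by 0.35 (350MHz relative to 1GHz) the load converges to 0.35 and can never cross a 0.7 threshold, however long the task runs, unless the governor raises the frequency first.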
Do you always use the ondemand governor in Linaro? I'm wondering if one of the differences is that the default ondemand config bases its load calculation on a 1s sliding window, while we use the interactive governor, which is much more sensitive.
I'm going to try changing the ondemand config at boot and see what happens.
Regards, Chris
On Tue, 2012-11-20 at 12:39 +0000, Chris Redpath wrote:
Do you always use the ondemand governor in Linaro?
That is how I was told by people at ARM to configure vexpress for big.LITTLE MP: A15s as 'performance', A7s as 'ondemand'.
On 11/20/2012 11:30 AM, Viresh Kumar wrote:
I have updated master-v12 branch with fixes from tixy and ARM. You can PULL it now :)
Done. Thanks!
Andrey
Hi Viresh,
I tested your updated branch and found some errors:
------------------------------------------------------------------------------------------------
# ARCH=arm scripts/kconfig/merge_config.sh arch/arm/configs/vexpress_defconfig linaro/configs/big-LITTLE-MP.conf
Using arch/arm/configs/vexpress_defconfig as base
Merging linaro/configs/big-LITTLE-MP.conf
  HOSTCC  scripts/basic/fixdep
  HOSTCC  scripts/kconfig/conf.o
  SHIPPED scripts/kconfig/zconf.tab.c
  SHIPPED scripts/kconfig/zconf.lex.c
  SHIPPED scripts/kconfig/zconf.hash.c
  HOSTCC  scripts/kconfig/zconf.tab.o
  HOSTLD  scripts/kconfig/conf
scripts/kconfig/conf --alldefconfig Kconfig
#
# configuration written to .config
#
Value requested for CONFIG_RCU_CPU_STALL_DETECTOR not in final .config
Requested value:  # CONFIG_RCU_CPU_STALL_DETECTOR is not set
Actual value:

Value requested for CONFIG_DEBUG_ERRORS not in final .config
Requested value:  CONFIG_DEBUG_ERRORS=y
Actual value:

Value requested for CONFIG_HMP_FREQUENCY_INVARIANT_SCALE not in final .config
Requested value:  CONFIG_HMP_FREQUENCY_INVARIANT_SCALE=y
Actual value:
------------------------------------------------------------------------------------------------
Any suggestion for this?
Also, when launching the MP system, how can I tell that both clusters are already running? I only see one CPU when checking cpuinfo. The system I am running is actually A15x1-A7x1 on the fastmodel...
------------------------------------------------------------------
/ # cat /proc/cpuinfo
Processor       : ARMv7 Processor rev 0 (v7l)
processor       : 0
BogoMIPS        : 99.73

Features        : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0xc0f
CPU revision    : 0

Hardware        : ARM-Versatile Express
Revision        : 0000
Serial          : 0000000000000000
Thanks, Lei
On 20 November 2012 20:06, Lei Wen adrian.wenl@gmail.com wrote:
I tested your updated branch and found some errors:
# ARCH=arm scripts/kconfig/merge_config.sh arch/arm/configs/vexpress_defconfig linaro/configs/big-LITTLE-MP.conf
I don't know what's in vexpress_defconfig. The way everybody else uses this is to merge Tixy's integration branch with my branch (it may already contain my branch).
Then merge the linaro base config, vexpress.conf, and ubuntu.conf or android.conf. All of these are in linaro/configs/*
Using arch/arm/configs/vexpress_defconfig as base
Merging linaro/configs/big-LITTLE-MP.conf
  HOSTCC  scripts/basic/fixdep
  HOSTCC  scripts/kconfig/conf.o
  SHIPPED scripts/kconfig/zconf.tab.c
  SHIPPED scripts/kconfig/zconf.lex.c
  SHIPPED scripts/kconfig/zconf.hash.c
  HOSTCC  scripts/kconfig/zconf.tab.o
  HOSTLD  scripts/kconfig/conf
scripts/kconfig/conf --alldefconfig Kconfig
#
# configuration written to .config
#
Value requested for CONFIG_RCU_CPU_STALL_DETECTOR not in final .config
Requested value:  # CONFIG_RCU_CPU_STALL_DETECTOR is not set
Actual value:

Value requested for CONFIG_DEBUG_ERRORS not in final .config
Requested value:  CONFIG_DEBUG_ERRORS=y
Actual value:

Value requested for CONFIG_HMP_FREQUENCY_INVARIANT_SCALE not in final .config
Requested value:  CONFIG_HMP_FREQUENCY_INVARIANT_SCALE=y
Actual value:
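This can happen due to mismatched dependencies: a requested symbol is dropped from the final .config when the options it depends on are not enabled. You just need to check the dependencies. Even I get a lot of these errors when I merge the configs mentioned earlier.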
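A quick way to list which requested symbols did not survive the merge is to compare the fragment against the final .config, mirroring the "Value requested for ... not in final .config" check in the output above. The following Python sketch is not part of the kernel tree; the function names are made up for illustration, and the sample strings are invented stand-ins for the real files.

```python
import re

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each CONFIG_ symbol to its value; 'n' for '# ... is not set'."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            values[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            values[match.group(1)] = "n"
    return values


def dropped_symbols(fragment_text, final_text):
    """Return fragment symbols whose requested value is not in the final config."""
    requested = parse_config(fragment_text)
    actual = parse_config(final_text)
    return sorted(s for s, v in requested.items() if actual.get(s) != v)


# Invented sample contents, for illustration only.
fragment = """\
# CONFIG_RCU_CPU_STALL_DETECTOR is not set
CONFIG_DEBUG_ERRORS=y
CONFIG_HMP_FREQUENCY_INVARIANT_SCALE=y
CONFIG_SCHED_HMP=y
"""
final = "CONFIG_SCHED_HMP=y\n"

for symbol in dropped_symbols(fragment, final):
    print(symbol)
```

For each symbol reported, the next step is to look up its `depends on` line in the relevant Kconfig file and check whether the base config enables those options.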
Any suggestion for this?
Also, when launching the MP system, how can I tell that both clusters are already running? I only see one CPU when checking cpuinfo. The system I am running is actually A15x1-A7x1 on the fastmodel...
Check the CPU domain information: pass sched_debug in the cmdline params and check the boot prints.
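As a complement to the sched_debug boot prints, the number of CPUs the kernel has brought up can be checked from userspace. The sketch below is only an illustration with made-up helper names: it counts the lowercase `processor` entries in /proc/cpuinfo text and parses the CPU list format used by /sys/devices/system/cpu/online (for example "0-1"). It says nothing about which cluster a CPU belongs to.

```python
def count_processors(cpuinfo_text):
    """Count 'processor : N' entries (one per CPU listed in /proc/cpuinfo)."""
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


def parse_cpu_list(cpu_list):
    """Expand a sysfs CPU list such as '0-1' or '0,2-3' into CPU numbers."""
    cpus = []
    for part in cpu_list.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


# The cpuinfo output quoted earlier in the thread lists a single CPU.
sample = """\
Processor       : ARMv7 Processor rev 0 (v7l)
processor       : 0
BogoMIPS        : 99.73
"""
print(count_processors(sample))
print(parse_cpu_list("0-1"))
```

On a running system the inputs would come from reading /proc/cpuinfo and /sys/devices/system/cpu/online; a count of one on an A15x1-A7x1 model suggests the second CPU was never brought online.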
-- viresh