linaro-kernel April 2013

linaro-kernel@lists.linaro.org

91 participants
94 discussions

[RFC PATCH v3 0/6] sched: packing small tasks

by Vincent Guittot

Hi,

This patchset takes advantage of the new per-task load tracking that is available in the kernel to pack small tasks onto as few CPUs/clusters/cores as possible. The main goal of packing small tasks is to reduce power consumption in low-load use cases by minimizing the number of power domains that are enabled. The packing is done in 2 steps:

The 1st step looks for the best place to pack tasks in a system according to its topology, and it defines a pack buddy CPU for each CPU if one is available. We define the best CPU during the build of the sched_domain instead of evaluating it at runtime because it can be difficult to define a stable buddy CPU in a low CPU load situation. The policy for defining a buddy CPU is that we pack at all levels inside a node where a group of CPUs can be power gated independently from the others. To describe this capability, a new flag, SD_SHARE_POWERDOMAIN, has been introduced; it indicates whether the groups of CPUs of a scheduling domain share their power state. By default, this flag is set in all sched_domains in order to keep the current behavior of the scheduler unchanged, and only the ARM platform clears the SD_SHARE_POWERDOMAIN flag for the MC and CPU levels.

In a 2nd step, the scheduler checks the load average of a task that wakes up as well as the load average of the buddy CPU, and it can decide to migrate a light task to a buddy that is not busy. This check is done during wake up because small tasks tend to wake up between periodic load balances and asynchronously to each other, which prevents the default mechanism from catching and migrating them efficiently. A light task is defined by a runnable_avg_sum that is less than 20% of the runnable_avg_period. In fact, this condition encloses 2 others: the average CPU load of the task must be less than 20%, and the task must have been runnable for less than 10ms when it last woke up, in order to be eligible for the packing migration.
So, a task that runs 1ms every 5ms will be considered a small task, but a task that runs 50ms with a period of 500ms will not.

Then, the busyness of the buddy CPU depends on the load average of the rq and the number of running tasks. A CPU with a load average greater than 50% will be considered busy whatever the number of running tasks is, and this threshold will be reduced by the number of running tasks in order not to increase the wake up latency of a task too much. When the buddy CPU is busy, the scheduler falls back to the default CFS policy.

Change since V2:
- Migrate only a task that wakes up
- Change the light tasks threshold to 20%
- Change the loaded CPU threshold to not pull tasks if the current number of running tasks is zero but the load average is already greater than 50%
- Fix the algorithm for selecting the buddy CPU.

Change since V1:
Patch 2/6
- Change the flag name which was not clear. The new name is SD_SHARE_POWERDOMAIN.
- Create an architecture dependent function to tune the sched_domain flags
Patch 3/6
- Fix issues in the algorithm that looks for the best buddy CPU
- Use pr_debug instead of pr_info
- Fix for uniprocessor
Patch 4/6
- Remove the use of usage_avg_sum which has not been merged
Patch 5/6
- Change the way the coherency of runnable_avg_sum and runnable_avg_period is ensured
Patch 6/6
- Use the arch dependent function to set/clear SD_SHARE_POWERDOMAIN for ARM platform

New results for v3:

This series has been tested with hackbench on ARM platform and the results don't show any performance regression.

Hackbench               3.9-rc2   +patches
Mean Time (10 tests):   2.048     2.015
stdev :                 0.047     0.068

Previous results for V2:

This series has been tested with MP3 playback on ARM platform: TC2 HMP (dual CA-15 and 3xCA-7 cluster). The measurements have been done on an Ubuntu image during 60 seconds of playback and the result has been normalized to 100.
        | CA15 | CA7  | total |
-------------------------------------
default |  81  |  97  |  178  |
pack    |  13  | 100  |  113  |
-------------------------------------

Previous results for V1:

The patch-set has been tested on ARM platforms: quad CA-9 SMP and TC2 HMP (dual CA-15 and 3xCA-7 cluster). For ARM platform, the results have demonstrated that it's worth packing small tasks at all topology levels.

The performance tests have been done on both platforms with sysbench. The results don't show any performance regressions. These results are aligned with the policy which uses the normal behavior with heavy use cases.

test: sysbench --test=cpu --num-threads=N --max-requests=R run

The results below are the average duration of 3 tests on the quad CA-9.
default is the current scheduler behavior (pack buddy CPU is -1)
pack is the scheduler with the pack mechanism

             | default | pack    |
-----------------------------------
N=8;  R=200  |  3.1999 |  3.1921 |
N=8;  R=2000 | 31.4939 | 31.4844 |
N=12; R=200  |  3.2043 |  3.2084 |
N=12; R=2000 | 31.4897 | 31.4831 |
N=16; R=200  |  3.1774 |  3.1824 |
N=16; R=2000 | 31.4899 | 31.4897 |
-----------------------------------

The power consumption tests have been done only on the TC2 platform, which has accessible power lines, and I have used cyclictest to simulate small tasks. The tests show some power consumption improvements.
test: cyclictest -t 8 -q -e 1000000 -D 20 & cyclictest -t 8 -q -e 1000000 -D 20

The measurements have been done during 16 seconds and the result has been normalized to 100.

        | CA15 | CA7  | total |
-------------------------------------
default | 100  |  40  |  140  |
pack    |  <1  |  45  |  <46  |
-------------------------------------

The A15 cluster is less power efficient than the A7 cluster, but if we assume that the tasks are well spread on both clusters, we can guesstimate that the power consumption on a dual cluster of CA7 would have been, for a default kernel:

        | CA7  | CA7  | total |
-------------------------------------
default |  40  |  40  |  80   |
-------------------------------------

Vincent Guittot (6):
  Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking"
  sched: add a new SD_SHARE_POWERDOMAIN flag for sched_domain
  sched: pack small tasks
  sched: secure access to other CPU statistics
  sched: pack the idle load balance
  ARM: sched: clear SD_SHARE_POWERDOMAIN

 arch/arm/kernel/topology.c       |    9 +++
 arch/ia64/include/asm/topology.h |    1 +
 arch/tile/include/asm/topology.h |    1 +
 include/linux/sched.h            |    9 +--
 include/linux/topology.h         |    4 +
 kernel/sched/core.c              |   14 ++--
 kernel/sched/fair.c              |  149 +++++++++++++++++++++++++++++++++++---
 kernel/sched/sched.h             |   14 ++--
 8 files changed, 169 insertions(+), 32 deletions(-)

-- 
1.7.9.5
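The two thresholds described in the cover letter above can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the code from the series: the function names is_light_task() and is_buddy_busy(), their signatures, and the rule that lowers the 50% threshold by dividing the period by (nr_running + 2) are all hypothetical choices made to match the prose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: a task is "light" when its runnable_avg_sum is
 * below 20% of its runnable_avg_period.
 */
bool is_light_task(uint32_t runnable_avg_sum, uint32_t runnable_avg_period)
{
	/* A task cannot have been runnable for longer than the tracked period. */
	assert(runnable_avg_sum <= runnable_avg_period);

	/* sum / period < 20%  <=>  sum * 5 < period (64-bit to avoid overflow). */
	return (uint64_t)runnable_avg_sum * 5 < runnable_avg_period;
}

/*
 * Hypothetical helper: the buddy CPU is busy when the load average of its
 * runqueue exceeds a threshold that starts at 50% and shrinks as more tasks
 * are running.  ASSUMPTION: threshold = period / (nr_running + 2), which
 * gives 50% for 0 running tasks, 33% for 1, 25% for 2, and so on.
 */
bool is_buddy_busy(uint32_t rq_sum, uint32_t rq_period, unsigned int nr_running)
{
	return rq_sum > rq_period / (nr_running + 2);
}
```

With these definitions a task at 10% load is packed, a task at exactly 20% is not, and a buddy at 40% load is considered free when nothing is running on it but busy as soon as one task is already running.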

12 years, 2 months

[PATCH] cpuidle: add maintainer entry

by Daniel Lezcano

Currently the cpuidle drivers are spread across the different archs.

Patch submissions for cpuidle follow different paths: the cpuidle core code goes to linux-pm, the ARM drivers go to arm-soc or the SoC-specific tree, sh goes through the sh arch tree, pseries goes through PowerPC and finally intel goes through Len's tree while acpi_idle goes under linux-pm.

That makes it difficult to consolidate the code and to propagate modifications from the cpuidle core to the different drivers.

Hopefully, a movement has been initiated to put the cpuidle drivers into the drivers/cpuidle directory, like cpuidle-calxeda.c and cpuidle-kirkwood.c.

Add an explicit maintainer entry in the MAINTAINERS file to clarify the situation and prevent new cpuidle drivers from going to an arch directory.

The upstreaming process is unchanged: Rafael takes the patches to merge them into his tree but with the acked-by from the driver's maintainer. So the header must contain the name of the maintainer.

This organization will be the same as for cpufreq.

Signed-off-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
---
 MAINTAINERS                        |    7 +++++++
 drivers/cpuidle/cpuidle-calxeda.c  |    4 +++-
 drivers/cpuidle/cpuidle-kirkwood.c |    5 +++--
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 61677c3..effa0f3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2206,6 +2206,13 @@ S:	Maintained
 F:	drivers/cpufreq/
 F:	include/linux/cpufreq.h
 
+CPUIDLE DRIVERS
+M:	Rafael J. Wysocki <rjw(a)sisk.pl>
+L:	linux-pm(a)vger.kernel.org
+S:	Maintained
+F:	drivers/cpuidle/*
+F:	include/linux/cpuidle.h
+
 CPU FREQUENCY DRIVERS - ARM BIG LITTLE
 M:	Viresh Kumar <viresh.kumar(a)linaro.org>
 M:	Sudeep KarkadaNagesha <sudeep.karkadanagesha(a)arm.com>
diff --git a/drivers/cpuidle/cpuidle-calxeda.c b/drivers/cpuidle/cpuidle-calxeda.c
index e344b56..2378c39 100644
--- a/drivers/cpuidle/cpuidle-calxeda.c
+++ b/drivers/cpuidle/cpuidle-calxeda.c
@@ -1,7 +1,6 @@
 /*
  * Copyright 2012 Calxeda, Inc.
  *
- * Based on arch/arm/plat-mxc/cpuidle.c:
  * Copyright 2012 Freescale Semiconductor, Inc.
  * Copyright 2012 Linaro Ltd.
  *
@@ -16,6 +15,9 @@
  *
  * You should have received a copy of the GNU General Public License along with
  * this program. If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author : Rob Herring <rob.herring(a)calxeda.com>
+ * Maintainer: Rob Herring <rob.herring(a)calxeda.com>
  */
 
 #include <linux/cpuidle.h>
diff --git a/drivers/cpuidle/cpuidle-kirkwood.c b/drivers/cpuidle/cpuidle-kirkwood.c
index 53290e1..521b0a7 100644
--- a/drivers/cpuidle/cpuidle-kirkwood.c
+++ b/drivers/cpuidle/cpuidle-kirkwood.c
@@ -1,6 +1,4 @@
 /*
- * arch/arm/mach-kirkwood/cpuidle.c
- *
  * CPU idle Marvell Kirkwood SoCs
  *
  * This file is licensed under the terms of the GNU General Public
@@ -11,6 +9,9 @@
  * to implement two idle states -
  * #1 wait-for-interrupt
  * #2 wait-for-interrupt and DDR self refresh
+ *
+ * Maintainer: Jason Cooper <jason(a)lakedaemon.net>
+ * Maintainer: Andrew Lunn <andrew(a)lunn.ch>
  */
 
 #include <linux/kernel.h>
-- 
1.7.9.5

12 years, 2 months

[PATCH V2] ARM: KVM: Allow host virtual timer irq number to be different from guest virtual timer irq number

by Anup Patel

The arch_timer irq numbers (or PPI numbers) are implementation dependent, so the host virtual timer irq number can be different from the guest virtual timer irq number.

This patch ensures that the host virtual timer irq number is read from the DTB and the guest virtual timer irq is determined based on the guest vcpu target type.

Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar(a)linaro.org>
---
 arch/arm/include/asm/kvm_host.h |    1 +
 arch/arm/kvm/arch_timer.c       |   25 ++++++++++++++++++-------
 arch/arm/kvm/guest.c            |   15 +++++++++++++++
 3 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 57cb786..cdc0551 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -43,6 +43,7 @@
 struct kvm_vcpu;
 u32 *kvm_vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num, u32 mode);
 int kvm_target_cpu(void);
+struct kvm_irq_level *kvm_target_timer_irq(struct kvm_vcpu *vcpu);
 int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
 void kvm_reset_coprocs(struct kvm_vcpu *vcpu);
 
diff --git a/arch/arm/kvm/arch_timer.c b/arch/arm/kvm/arch_timer.c
index 49a7516..521cdb9 100644
--- a/arch/arm/kvm/arch_timer.c
+++ b/arch/arm/kvm/arch_timer.c
@@ -30,7 +30,7 @@
 
 static struct timecounter *timecounter;
 static struct workqueue_struct *wqueue;
-static struct kvm_irq_level timer_irq = {
+static struct kvm_irq_level host_timer_irq = {
 	.level	= 1,
 };
 
@@ -65,10 +65,21 @@ static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu)
 {
 	struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
 
+	/*
+	 * The vcpu timer irq number cannont be determined in
+	 * kvm_timer_vcpu_init() because it is called much before
+	 * kvm_vcpu_set_target(). To handle this, we determin
+	 * vcpu timer irq number when we inject the vcpu timer irq
+	 * first time.
+	 */
+	if (!timer->irq) {
+		timer->irq = kvm_target_timer_irq(vcpu);
+	}
+
 	timer->cntv_ctl |= ARCH_TIMER_CTRL_IT_MASK;
 	kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
-			    vcpu->arch.timer_cpu.irq->irq,
-			    vcpu->arch.timer_cpu.irq->level);
+			    timer->irq->irq,
+			    timer->irq->level);
 }
 
 static irqreturn_t kvm_arch_timer_handler(int irq, void *dev_id)
@@ -163,12 +174,12 @@ void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
 	INIT_WORK(&timer->expired, kvm_timer_inject_irq_work);
 	hrtimer_init(&timer->timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
 	timer->timer.function = kvm_timer_expire;
-	timer->irq = &timer_irq;
+	timer->irq = NULL;
 }
 
 static void kvm_timer_init_interrupt(void *info)
 {
-	enable_percpu_irq(timer_irq.irq, 0);
+	enable_percpu_irq(host_timer_irq.irq, 0);
 }
 
 
@@ -182,7 +193,7 @@ static int kvm_timer_cpu_notify(struct notifier_block *self,
 		break;
 	case CPU_DYING:
 	case CPU_DYING_FROZEN:
-		disable_percpu_irq(timer_irq.irq);
+		disable_percpu_irq(host_timer_irq.irq);
 		break;
 	}
 
@@ -230,7 +241,7 @@ int kvm_timer_hyp_init(void)
 		goto out;
 	}
 
-	timer_irq.irq = ppi;
+	host_timer_irq.irq = ppi;
 
 	err = register_cpu_notifier(&kvm_timer_cpu_nb);
 	if (err) {
diff --git a/arch/arm/kvm/guest.c b/arch/arm/kvm/guest.c
index 152d036..d87b05d 100644
--- a/arch/arm/kvm/guest.c
+++ b/arch/arm/kvm/guest.c
@@ -36,6 +36,11 @@ struct kvm_stats_debugfs_item debugfs_entries[] = {
 	{ NULL }
 };
 
+struct kvm_irq_level target_default_timer_irq = {
+	.irq = 27,
+	.level = 1,
+};
+
 int kvm_arch_vcpu_setup(struct kvm_vcpu *vcpu)
 {
 	return 0;
@@ -197,6 +202,16 @@ int __attribute_const__ kvm_target_cpu(void)
 	}
 }
 
+struct kvm_irq_level *kvm_target_timer_irq(struct kvm_vcpu *vcpu)
+{
+	switch (vcpu->arch.target) {
+	case KVM_ARM_TARGET_CORTEX_A15:
+		return &target_default_timer_irq;
+	default:
+		return NULL;
+	};
+}
+
 int kvm_vcpu_set_target(struct kvm_vcpu *vcpu,
 			const struct kvm_vcpu_init *init)
 {
-- 
1.7.9.5

12 years, 2 months

Re: [PATCH] cpufreq: Fix the driver can not be unloaded issue

by Viresh Kumar

On 25 April 2013 08:16, Tang Yuantian-B29983 <B29983(a)freescale.com> wrote:
> It happened when policy->cpus contains *MORE THEN ONE CPU*.
> Taking my board T4240 for example, it has 3 cluster, 8 CPUs for each cluster.
> The log is:
> # insmod ppc-corenet-cpufreq.ko
> ppc_corenet_cpufreq: Freescale PowerPC corenet CPU frequency scaling driver
> # rmmod ppc-corenet-cpufreq.ko
> ERROR: Module ppc_corenet_cpufreq is in use
> # lsmod
> Module Size Used by
> ppc_corenet_cpufreq 6542 9
> # uname -a
> Linux T4240 3.9.0-rc1-11081-g34642bb-dirty #44 SMP Thu Apr 25 08:58:26 CST 2013 ppc64 unknown
>
> I am not using the newest kernel (since new t4240 board has not included yet),
> but the issue is still there.
> The reason is just like what I said in patch.

I believed what you said is correct and went on testing this on my platform: 2 clusters with 2 and 3 cpus. So I have multiple cpus per cluster or policy structure.

insmod/rmmod worked as expected without any issues.

So, for me there are no such issues. BTW, I tested this on the latest rc from Linus and also on the latest code from linux-next.

I am sure the counts are very well balanced and there are no issues in the latest code at least.

12 years, 2 months

[PATCH] ARM: KVM: Allow host virtual timer irq number to be different from guest virtual timer irq number

by Anup Patel

The arch_timer irq numbers (or PPI numbers) are implementation dependent, so the host virtual timer irq number can be different from the guest virtual timer irq number.

Currently, we only have Cortex-A15 guest (for KVM ARMv7) and Cortex-A57 guest (for KVM ARMv8) supported. These guests have virtual timer irq number 27.

This patch ensures that the host virtual timer irq number is read from the DTB and the guest virtual timer irq is always 27.

Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar(a)linaro.org>
---
 arch/arm/kvm/arch_timer.c |   18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/arm/kvm/arch_timer.c b/arch/arm/kvm/arch_timer.c
index 49a7516..376abf0 100644
--- a/arch/arm/kvm/arch_timer.c
+++ b/arch/arm/kvm/arch_timer.c
@@ -30,10 +30,18 @@
 
 static struct timecounter *timecounter;
 static struct workqueue_struct *wqueue;
-static struct kvm_irq_level timer_irq = {
+static struct kvm_irq_level host_timer_irq = {
 	.level	= 1,
 };
 
+/* Guest virtual timer irq number will be based on type of guest we emulate.
+ * For Cortex-A15 & Cortex-A57 guest, virtual timer irq is 27
+ */
+static struct kvm_irq_level guest_timer_irq = {
+	.irq	= 27,
+	.level	= 1,
+};
+
 static cycle_t kvm_phys_timer_read(void)
 {
 	return timecounter->cc->read(timecounter->cc);
@@ -163,12 +171,12 @@ void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
 	INIT_WORK(&timer->expired, kvm_timer_inject_irq_work);
 	hrtimer_init(&timer->timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
 	timer->timer.function = kvm_timer_expire;
-	timer->irq = &timer_irq;
+	timer->irq = &guest_timer_irq;
 }
 
 static void kvm_timer_init_interrupt(void *info)
 {
-	enable_percpu_irq(timer_irq.irq, 0);
+	enable_percpu_irq(host_timer_irq.irq, 0);
 }
 
 
@@ -182,7 +190,7 @@ static int kvm_timer_cpu_notify(struct notifier_block *self,
 		break;
 	case CPU_DYING:
 	case CPU_DYING_FROZEN:
-		disable_percpu_irq(timer_irq.irq);
+		disable_percpu_irq(host_timer_irq.irq);
 		break;
 	}
 
@@ -230,7 +238,7 @@ int kvm_timer_hyp_init(void)
 		goto out;
 	}
 
-	timer_irq.irq = ppi;
+	host_timer_irq.irq = ppi;
 
 	err = register_cpu_notifier(&kvm_timer_cpu_nb);
 	if (err) {
-- 
1.7.9.5

12 years, 2 months

[PATCH v8] sched: fix init NOHZ_IDLE flag

by Vincent Guittot

On my smp platform, which is made of 5 cores in 2 clusters, the nr_busy_cpus field of the sched_group_power struct is not zero when the platform is fully idle. The root cause is: during the boot sequence, some CPUs reach the idle loop and set their NOHZ_IDLE flag while waiting for other CPUs to boot. But the nr_busy_cpus field is initialized later with the assumption that all CPUs are in the busy state, whereas some CPUs have already set their NOHZ_IDLE flag.

More generally, the NOHZ_IDLE flag must be initialized when new sched_domains are created in order to ensure that NOHZ_IDLE and nr_busy_cpus are aligned.

This condition can be ensured by adding a synchronize_rcu between the destruction of old sched_domains and the creation of new ones, so the NOHZ_IDLE flag will not be updated with an old sched_domain once it has been initialized. But this solution introduces an additional latency in the rebuild sequence that is called during cpu hotplug.

As suggested by Frederic Weisbecker, another solution is to have the same rcu lifecycle for both NOHZ_IDLE and the sched_domain struct. A new nohz_idle field is added to sched_domain so both status and sched_domain will share the same RCU lifecycle and will always be synchronized. In addition, there is no more need to protect nohz_idle against concurrent access as it is only modified by 2 exclusive functions called by the local cpu.

This solution has been preferred to the creation of a new struct with an extra pointer indirection for sched_domain.

The synchronization is done at the cost of:
- An additional indirection and a rcu_dereference for accessing nohz_idle.
- We use only the nohz_idle field of the top sched_domain.

Change since v7:
- remove atomic access which is useless now.
- refactor the sequence that updates nohz_idle status and nr_busy_cpus.

Change since v6:
- Add the flags in struct sched_domain instead of creating a sched_domain_rq.

Change since v5:
- minor variable and function name change.
- remove a useless null check before kfree
- fix a compilation error when NO_HZ is not set.

Change since v4:
- link both sched_domain and NOHZ_IDLE flag in one RCU object so their states are always synchronized.

Change since V3:
- NOHZ flag is not cleared if a NULL domain is attached to the CPU
- Remove patch 2/2 which becomes useless with latest modifications

Change since V2:
- change the initialization to idle state instead of busy state so a CPU that enters idle during the build of the sched_domain will not corrupt the initialization state

Change since V1:
- remove the patch for SCHED softirq on an idle core use case as it was a side effect of the other use cases.

Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
---
 include/linux/sched.h |    3 +++
 kernel/sched/fair.c   |   26 ++++++++++++++++----------
 kernel/sched/sched.h  |    1 -
 3 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index d35d2b6..22bcbe8 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -899,6 +899,9 @@ struct sched_domain {
 	unsigned int wake_idx;
 	unsigned int forkexec_idx;
 	unsigned int smt_gain;
+#ifdef CONFIG_NO_HZ
+	int nohz_idle;			/* NOHZ IDLE status */
+#endif
 	int flags;			/* See SD_* */
 	int level;
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7a33e59..5db1817 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5395,13 +5395,16 @@ static inline void set_cpu_sd_state_busy(void)
 	struct sched_domain *sd;
 	int cpu = smp_processor_id();
 
-	if (!test_bit(NOHZ_IDLE, nohz_flags(cpu)))
-		return;
-	clear_bit(NOHZ_IDLE, nohz_flags(cpu));
-
 	rcu_read_lock();
-	for_each_domain(cpu, sd)
+	sd = rcu_dereference_check_sched_domain(cpu_rq(cpu)->sd);
+
+	if (!sd || !sd->nohz_idle)
+		goto unlock;
+	sd->nohz_idle = 0;
+
+	for (; sd; sd = sd->parent)
 		atomic_inc(&sd->groups->sgp->nr_busy_cpus);
+unlock:
 	rcu_read_unlock();
 }
 
@@ -5410,13 +5413,16 @@ void set_cpu_sd_state_idle(void)
 	struct sched_domain *sd;
 	int cpu = smp_processor_id();
 
-	if (test_bit(NOHZ_IDLE, nohz_flags(cpu)))
-		return;
-	set_bit(NOHZ_IDLE, nohz_flags(cpu));
-
 	rcu_read_lock();
-	for_each_domain(cpu, sd)
+	sd = rcu_dereference_check_sched_domain(cpu_rq(cpu)->sd);
+
+	if (!sd || sd->nohz_idle)
+		goto unlock;
+	sd->nohz_idle = 1;
+
+	for (; sd; sd = sd->parent)
 		atomic_dec(&sd->groups->sgp->nr_busy_cpus);
+unlock:
 	rcu_read_unlock();
 }
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index cc03cfd..03b13c8 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1187,7 +1187,6 @@ extern void account_cfs_bandwidth_used(int enabled, int was_enabled);
 enum rq_nohz_flag_bits {
 	NOHZ_TICK_STOPPED,
 	NOHZ_BALANCE_KICK,
-	NOHZ_IDLE,
 };
 
 #define nohz_flags(cpu)	(&cpu_rq(cpu)->nohz_flags)
-- 
1.7.9.5
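The accounting that the patch above keeps consistent can be modelled in user space. The sketch below is an illustration only: struct domain, set_state_idle() and set_state_busy() are hypothetical stand-ins, RCU and atomics are left out, and nr_busy_cpus is stored directly in the domain rather than in a sched_group_power. What it shows is the role of the nohz_idle flag: it makes the idle and busy transitions idempotent, so the counters move by exactly one per real state change.

```c
#include <stddef.h>

/* Simplified stand-in for a sched_domain level; not the kernel definition. */
struct domain {
	struct domain *parent;	/* next level up, NULL at the top */
	int nohz_idle;		/* checked on the domain the walk starts from */
	int nr_busy_cpus;	/* plain int here; the kernel uses an atomic_t */
};

/* Mark the CPU idle: decrement every level once, unless already idle. */
void set_state_idle(struct domain *sd)
{
	if (!sd || sd->nohz_idle)
		return;
	sd->nohz_idle = 1;

	for (; sd; sd = sd->parent)
		sd->nr_busy_cpus--;
}

/* Mark the CPU busy: increment every level once, unless already busy. */
void set_state_busy(struct domain *sd)
{
	if (!sd || !sd->nohz_idle)
		return;
	sd->nohz_idle = 0;

	for (; sd; sd = sd->parent)
		sd->nr_busy_cpus++;
}
```

Calling set_state_idle() twice in a row on the same domain changes the counters only once, which is the property that was lost when the flag and the counters were initialized at different times.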

12 years, 2 months

panda board suspend/resume broken

by zoran markovic

Hi, Working as a newbie in the PMWG, I noticed I'm not able to resume my pandaboard-es with the latest 3.9 kernel from Linus (configuration file omap2plus_defconfig). Suspend/resume appears to work with the Linaro 12.11 release; I managed to wake it up with a USB keyboard. There is also launchpad bug 989547 that is still open. Any updates on this issue? Thanks, Zoran

12 years, 2 months

[PATCH V4 0/4] Queue work on UNBOUND wq

by Viresh Kumar

This patchset was called: "Create sched_select_cpu() and use it for workqueues" for the first three versions.

Earlier discussions over v3, v2 and v1 can be found here:
https://lkml.org/lkml/2013/3/18/364
http://lists.linaro.org/pipermail/linaro-dev/2012-November/014344.html
http://www.mail-archive.com/linaro-dev@lists.linaro.org/msg13342.html

For power saving it is better to schedule work on cpus that aren't idle, as bringing a cpu/cluster out of idle state can be very costly (both performance and power wise). Earlier we tried to use the timer infrastructure to take this decision, but we found out later that the scheduler gives even better results, and so we should use the scheduler for choosing the cpu for scheduling work.

In the workqueue subsystem, workqueues with the WQ_UNBOUND flag are the ones which use the scheduler to select the target cpu.

Here we are migrating a few users of workqueues to WQ_UNBOUND. These drivers are found to be very much active on an idle or lightly busy system, and using WQ_UNBOUND for these gave impressive results.

Setup:
-----
- ARM Vexpress TC2 - big.LITTLE CPU
- Core 0-1: A15, 2-4: A7
- rootfs: linaro-ubuntu-devel

This patchset has been tested on a big LITTLE system (heterogeneous) but is useful for all other homogeneous systems as well. During these tests audio was played in the background using aplay.

Results:
-------

          Cluster A15 Energy   Cluster A7 Energy   Total
          ------------------   -----------------   ---------

Without this patchset (Energy in Joules):
---------------------------------------------------
          0.151162             2.183545            2.334707
          0.223730             2.687067            2.910797
          0.289687             2.732702            3.022389
          0.454198             2.745908            3.200106
          0.495552             2.746465            3.242017
Average:  0.322866             2.619137            2.942003

With this patchset (Energy in Joules):
-----------------------------------------------
          0.226421             2.283658            2.510079
          0.151361             2.236656            2.388017
          0.197726             2.249849            2.447575
          0.221915             2.229446            2.451361
          0.347098             2.257707            2.604805
Average:  0.2289042            2.2514632           2.4803674

The above tests were repeated multiple times and events were tracked using trace-cmd and analysed using kernelshark. It was easily noticeable that idle time for many cpus has increased considerably, which eventually saved some power.

PS: All the earlier Acks we got for drivers are reverted here as the patches have been updated significantly.

V3->V4:
-------
- Dropped changes to kernel/sched directory and hence sched_select_non_idle_cpu().
- Dropped queue_work_on_any_cpu()
- Created system_freezable_unbound_wq
- Changed all patches accordingly.

V2->V3:
-------
- Dropped changes into core queue_work() API, rather create *_on_any_cpu() APIs
- Dropped running timers migration patch as that was broken
- Migrated few users of workqueues to use *_on_any_cpu() APIs.

Viresh Kumar (4):
  workqueue: Add system wide system_freezable_unbound_wq
  PHYLIB: queue work on unbound wq
  block: queue work on unbound wq
  fbcon: queue work on unbound wq

 block/blk-core.c              |    3 ++-
 block/blk-ioc.c               |    2 +-
 block/genhd.c                 |   10 ++++++----
 drivers/net/phy/phy.c         |    9 +++++----
 drivers/video/console/fbcon.c |    2 +-
 include/linux/workqueue.h     |    4 ++++
 kernel/workqueue.c            |    7 ++++++-
 7 files changed, 25 insertions(+), 12 deletions(-)

-- 
1.7.12.rc2.18.g61b472e
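To put a number on the results above, the relative saving can be computed from the per-run totals. The snippet below is a small worked example, not part of the patchset: the helper names mean() and saving_percent() are made up for illustration, and the arrays simply copy the "Total" column of the two tables.

```c
#include <stddef.h>

/* Total energy per run, in Joules, copied from the "Total" column above. */
const double total_without[5] = {
	2.334707, 2.910797, 3.022389, 3.200106, 3.242017
};
const double total_with[5] = {
	2.510079, 2.388017, 2.447575, 2.451361, 2.604805
};

/* Arithmetic mean of n samples. */
double mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Relative saving, in percent, when going from `before` to `after`. */
double saving_percent(double before, double after)
{
	return (before - after) / before * 100.0;
}
```

Feeding the two means (about 2.942 J and 2.480 J) into saving_percent() gives a reduction of roughly 15.7% in total energy for this workload.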

12 years, 2 months

[V3 patch 00/19] cpuidle: code consolidation

by Daniel Lezcano

This patchset series provides some code consolidation across the different cpuidle drivers. It contains two parts: the first one is the removal of the time keeping flag and the second one is a common initialization routine.

All the drivers use the en_core_tk_irqen flag, which means it is not necessary to make the time computation optional. We can remove this flag and assume the cpuidle framework always manages this operation.

The cpuidle code initialization is duplicated across the different drivers in the same manner.

The repeating pattern is:

SMP:

	cpuidle_register_driver(drv);
	for_each_possible_cpu(cpu) {
		dev = per_cpu(cpuidle_device, cpu);
		cpuidle_register_device(dev);
	}

UP:

	cpuidle_register_driver(drv);
	cpuidle_register_device(dev);

As on a UP machine the macro 'for_each_cpu' is a one iteration loop, using the initialization loop from SMP on UP works.

The patchset does some cleanup for different drivers in order to make the init code the same. Then it introduces a generic function:

	cpuidle_register(struct cpuidle_driver *drv, struct cpumask *cpumask)

The cpumask is for the coupled idle states.

The drivers are then modified to take this new function into account and to remove the duplicated code.

The benefit is observable in the diffstat: 332 lines of code removed.

Changelog:

- V3:
  * folded patch 5/19 into 19/19, they were:
    * ARM: imx: cpuidle: use init/exit common routine
    * ARM: imx: cpuidle: create separate drivers for imx5/imx6
  * removed rule to make cpuidle.o in the imx's Makefile
  * split patch 1/19 into two, they are:
    * [V3 patch 01/19] ARM: shmobile: cpuidle: remove shmobile_enter_wfi
    * [V3 patch 02/19] ARM: shmobile: cpuidle: remove shmobile_enter_wfi prototype

- V2:
  * fixed cpumask NULL test for coupled state in cpuidle_register
  * added comment about structure copy
  * changed printk by pr_err
  * folded split message
  * fixed return code in cpuidle_register
  * updated Documentation/cpuidle/drivers.txt
  * added in the changelog dev->state_count is filled by cpuidle_enable_device
  * fixed tag for tegra in the first line patch description
  * fixed tegra2 removed tegra_tear_down_cpu = tegra20_tear_down_cpu;

- V1: Initial post

Tested-on: u8500
Tested-on: at91
Tested-on: intel i5
Tested-on: OMAP4

Compiled with and without CPU_IDLE for:
u8500, at91, davinci, exynos, imx5, imx6, kirkwood, multi_v7 (for calxeda), omap2plus, s3c64, tegra1, tegra2, tegra3

Daniel Lezcano (19):
  ARM: shmobile: cpuidle: remove shmobile_enter_wfi function
  ARM: shmobile: cpuidle: remove shmobile_enter_wfi prototype
  ARM: OMAP3: remove cpuidle_wrap_enter
  cpuidle: remove en_core_tk_irqen flag
  ARM: ux500: cpuidle: replace for_each_online_cpu by for_each_possible_cpu
  cpuidle: make a single register function for all
  ARM: ux500: cpuidle: use init/exit common routine
  ARM: at91: cpuidle: use init/exit common routine
  ARM: OMAP3: cpuidle: use init/exit common routine
  ARM: s3c64xx: cpuidle: use init/exit common routine
  ARM: tegra: cpuidle: use init/exit common routine
  ARM: shmobile: cpuidle: use init/exit common routine
  ARM: OMAP4: cpuidle: use init/exit common routine
  ARM: tegra: cpuidle: use init/exit common routine for tegra2
  ARM: tegra: cpuidle: use init/exit common routine for tegra3
  ARM: calxeda: cpuidle: use init/exit common routine
  ARM: kirkwood: cpuidle: use init/exit common routine
  ARM: davinci: cpuidle: use init/exit common routine
  ARM: imx: cpuidle: use init/exit common routine

 Documentation/cpuidle/driver.txt                |    6 +
 arch/arm/mach-at91/cpuidle.c                    |   18 +--
 arch/arm/mach-davinci/cpuidle.c                 |   21 +---
 arch/arm/mach-exynos/cpuidle.c                  |    1 -
 arch/arm/mach-imx/Makefile                      |    2 +-
 arch/arm/mach-imx/cpuidle-imx5.c                |   40 +++++++
 arch/arm/mach-imx/cpuidle-imx6q.c               |    3 +-
 arch/arm/mach-imx/cpuidle.c                     |   80 -------------
 arch/arm/mach-imx/cpuidle.h                     |   10 +-
 arch/arm/mach-imx/pm-imx5.c                     |   30 +----
 arch/arm/mach-omap2/cpuidle34xx.c               |   49 ++------
 arch/arm/mach-omap2/cpuidle44xx.c               |   23 +---
 arch/arm/mach-s3c64xx/cpuidle.c                 |   15 +--
 arch/arm/mach-shmobile/cpuidle.c                |   11 +-
 arch/arm/mach-shmobile/include/mach/common.h    |    3 -
 arch/arm/mach-shmobile/pm-sh7372.c              |    2 -
 arch/arm/mach-tegra/cpuidle-tegra114.c          |   27 +----
 arch/arm/mach-tegra/cpuidle-tegra20.c           |   31 +----
 arch/arm/mach-tegra/cpuidle-tegra30.c           |   28 +----
 arch/arm/mach-ux500/cpuidle.c                   |   33 +-----
 arch/powerpc/platforms/pseries/processor_idle.c |    1 -
 arch/sh/kernel/cpu/shmobile/cpuidle.c           |    1 -
 arch/x86/kernel/apm_32.c                        |    1 -
 drivers/acpi/processor_idle.c                   |    1 -
 drivers/cpuidle/cpuidle-calxeda.c               |   53 +--------
 drivers/cpuidle/cpuidle-kirkwood.c              |   18 +--
 drivers/cpuidle/cpuidle.c                       |  144 ++++++++++++++---------
 drivers/idle/intel_idle.c                       |    1 -
 include/linux/cpuidle.h                         |   20 ++--
 29 files changed, 175 insertions(+), 498 deletions(-)
 create mode 100644 arch/arm/mach-imx/cpuidle-imx5.c
 delete mode 100644 arch/arm/mach-imx/cpuidle.c

-- 
1.7.9.5

12 years, 2 months

[PATCH v4] drm/exynos: prepare FIMD clocks

by Vikas Sajjan

While migrating to the common clock framework (CCF), I found that the FIMD clocks were pulled down by the CCF. If the CCF finds any clock(s) which have NOT been claimed by any of the drivers, then such clock(s) are PULLed low by the CCF.

Calling clk_prepare() for the FIMD clocks fixes the issue.

This patch also replaces clk_disable() with clk_unprepare() during exit, since clk_prepare() is called in fimd_probe().

Signed-off-by: Vikas Sajjan <vikas.sajjan(a)linaro.org>
---
Changes since v3:
- added clk_prepare() in fimd_probe() and clk_unprepare() in fimd_remove() as suggested by Viresh Kumar <viresh.kumar(a)linaro.org>
Changes since v2:
- moved clk_prepare_enable() and clk_disable_unprepare() from fimd_probe() to fimd_clock() as suggested by Inki Dae <inki.dae(a)samsung.com>
Changes since v1:
- added error checking for clk_prepare_enable() and also replaced clk_disable() with clk_disable_unprepare() during exit.
---
 drivers/gpu/drm/exynos/exynos_drm_fimd.c |   14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
index 9537761..aa22370 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
@@ -934,6 +934,16 @@ static int fimd_probe(struct platform_device *pdev)
 		return ret;
 	}
 
+	ret = clk_prepare(ctx->bus_clk);
+	if (ret < 0)
+		return ret;
+
+	ret = clk_prepare(ctx->lcd_clk);
+	if (ret < 0) {
+		clk_unprepare(ctx->bus_clk);
+		return ret;
+	}
+
 	ctx->vidcon0 = pdata->vidcon0;
 	ctx->vidcon1 = pdata->vidcon1;
 	ctx->default_win = pdata->default_win;
@@ -981,8 +991,8 @@ static int fimd_remove(struct platform_device *pdev)
 	if (ctx->suspended)
 		goto out;
 
-	clk_disable(ctx->lcd_clk);
-	clk_disable(ctx->bus_clk);
+	clk_unprepare(ctx->lcd_clk);
+	clk_unprepare(ctx->bus_clk);
 
 	pm_runtime_set_suspended(dev);
 	pm_runtime_put_sync(dev);
-- 
1.7.9.5
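The prepare-then-unwind ordering used in fimd_probe() above can be exercised outside the kernel with a mock. Everything in the sketch below is hypothetical: struct mock_clk, mock_clk_prepare(), mock_clk_unprepare() and mock_probe() only imitate the shape of the clk API with a counter, so that the error path (second prepare fails, first clock is released again) can be checked.

```c
/* Mock clock: a stand-in for struct clk that only tracks a prepare count. */
struct mock_clk {
	int prepare_count;
	int fail_prepare;	/* when non-zero, mock_clk_prepare() fails */
};

/* Imitates clk_prepare(): 0 on success, negative value on failure. */
int mock_clk_prepare(struct mock_clk *clk)
{
	if (clk->fail_prepare)
		return -1;
	clk->prepare_count++;
	return 0;
}

/* Imitates clk_unprepare(): drops one prepare reference. */
void mock_clk_unprepare(struct mock_clk *clk)
{
	clk->prepare_count--;
}

/*
 * Same ordering as the probe hunk above: prepare the bus clock, then the
 * lcd clock, and release the bus clock again if the second step fails.
 */
int mock_probe(struct mock_clk *bus_clk, struct mock_clk *lcd_clk)
{
	int ret;

	ret = mock_clk_prepare(bus_clk);
	if (ret < 0)
		return ret;

	ret = mock_clk_prepare(lcd_clk);
	if (ret < 0) {
		mock_clk_unprepare(bus_clk);
		return ret;
	}

	return 0;
}
```

When the lcd clock fails to prepare, the bus clock ends up with a prepare count of zero again, so a failed probe leaves no clock reference behind.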

12 years, 2 months
