This is the third and final set of patches towards a fully functional
and production-quality switcher solution for big.LITTLE systems,
establishing a landmark to compare against for any scheduler-based
solution meant to eventually surpass the switcher's power efficiency
available in the mainline kernel.

Rationale for this code: http://lwn.net/Articles/481055/

The first and second patch sets have already been merged by RMK. They
implement the core switcher mechanism. This set adds the necessary code
to drive the switcher based on cpufreq governor decisions.

This set (v2) was rebased on top of the latest linux-pm tree as
requested by Rafael.

Hi Rafael,

Thanks for applying the first four sets.

This is the last set that I had for v3.13; please see if you can get it
applied. I don't believe it will cause many issues, as the changes shouldn't
make any difference to the way the code is supposed to work.

I know this might be late for 3.13, but I am just trying my luck :)

Viresh Kumar (16):
  cpufreq: create cpufreq_generic_get() routine
  cpufreq: at32ap: use cpufreq_generic_get() routine
  cpufreq: cpu0: use cpufreq_generic_get() routine
  cpufreq: davinci: use cpufreq_generic_get() routine
  cpufreq: dbx500: use cpufreq_generic_get() routine
  cpufreq: exynos: use cpufreq_generic_get() routine
  cpufreq: imx6q: use cpufreq_generic_get() routine
  cpufreq: loongson2: use cpufreq_generic_get() routine
  cpufreq: omap: use cpufreq_generic_get() routine
  cpufreq: ppc: use cpufreq_generic_get() routine
  cpufreq: s3c: use cpufreq_generic_get() routine
  cpufreq: s5pv210: use cpufreq_generic_get() routine
  cpufreq: spear: use cpufreq_generic_get() routine
  cpufreq: tegra: remove target_cpu_speed[] array
  cpufreq: tegra: use cpufreq_generic_get() routine
  cpufreq: unicore2: use cpufreq_generic_get() routine

 drivers/cpufreq/at32ap-cpufreq.c      | 17 ++++--------
 drivers/cpufreq/cpufreq-cpu0.c        |  8 ++----
 drivers/cpufreq/cpufreq.c             | 26 +++++++++++++-----
 drivers/cpufreq/davinci-cpufreq.c     | 14 +++-------
 drivers/cpufreq/dbx500-cpufreq.c      | 19 ++-----------
 drivers/cpufreq/exynos-cpufreq.c      | 10 +++----
 drivers/cpufreq/exynos5440-cpufreq.c  | 33 ++++++++++-------------
 drivers/cpufreq/imx6q-cpufreq.c       |  8 ++----
 drivers/cpufreq/loongson2_cpufreq.c   | 15 ++++-------
 drivers/cpufreq/omap-cpufreq.c        | 32 +++++++---------------
 drivers/cpufreq/ppc-corenet-cpufreq.c | 17 +++---------
 drivers/cpufreq/s3c24xx-cpufreq.c     | 10 +++----
 drivers/cpufreq/s3c64xx-cpufreq.c     | 33 +++++++++--------------
 drivers/cpufreq/s5pv210-cpufreq.c     | 21 +++++----------
 drivers/cpufreq/spear-cpufreq.c       |  8 ++----
 drivers/cpufreq/tegra-cpufreq.c       | 50 ++++++-----------------------------
 drivers/cpufreq/unicore2-cpufreq.c    | 21 ++++++---------
 include/linux/cpufreq.h               |  3 +++
 18 files changed, 113 insertions(+), 232 deletions(-)

--
1.7.12.rc2.18.g61b472e

This is the third and final set of patches towards a fully functional
and production-quality switcher solution for big.LITTLE systems,
establishing a landmark to compare against for any scheduler-based
solution meant to eventually surpass the switcher's power efficiency
available in the mainline kernel.

Rationale for this code: http://lwn.net/Articles/481055/

The first and second patch sets have already been merged by RMK. They
implement the core switcher mechanism. This set adds the necessary code
to drive the switcher based on cpufreq governor decisions.

 drivers/cpufreq/arm_big_little.c | 418 ++++++++++++++++++++++++++++++---
 drivers/cpufreq/arm_big_little.h |   5 -
 2 files changed, 389 insertions(+), 34 deletions(-)
This patch series implements get_user_pages_fast on ARM. Unlike other
architectures, we do not use IPIs/disabled IRQs as a blocking
mechanism to protect the page table walker. Instead an atomic counter
is used to indicate how many fast gup walkers are active on an address
space, and any code that would cause them problems (THP splitting or
code that could free a page table page) spins on positive values of
this counter.
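
To make the counting idea concrete, here is a minimal user-space sketch in
C11. It is not the code from these patches: struct addr_space and all of the
function names are hypothetical, and it only shows the counter itself. It
leaves out how a walker that starts after the wait has finished is kept away
from the entry being changed, which the real implementation has to handle.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-address-space state (not a kernel type). */
struct addr_space {
	atomic_int gup_walkers;	/* fast gup walkers currently in a walk */
};

static void addr_space_init(struct addr_space *as)
{
	atomic_init(&as->gup_walkers, 0);
}

/* A lockless walker announces itself before it touches the page tables. */
static void gup_walker_enter(struct addr_space *as)
{
	atomic_fetch_add(&as->gup_walkers, 1);
}

/* ...and drops its count once it no longer dereferences them. */
static void gup_walker_exit(struct addr_space *as)
{
	int prev = atomic_fetch_sub(&as->gup_walkers, 1);

	assert(prev > 0);	/* exit without a matching enter is a bug */
	(void)prev;
}

static bool gup_walkers_active(struct addr_space *as)
{
	return atomic_load(&as->gup_walkers) > 0;
}

/*
 * Called by code that is about to split a THP or free a page table page:
 * spin for as long as the counter is positive.
 */
static void wait_for_gup_walkers(struct addr_space *as)
{
	while (gup_walkers_active(as))
		;	/* busy-wait; a kernel version would use cpu_relax() */
}
```

A walker would bracket its table walk with gup_walker_enter() and
gup_walker_exit(), while the splitting or freeing side would call
wait_for_gup_walkers() before making its change. No IPI or IRQ masking is
involved on either side.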

This series also addresses an assumption made in kernel/futex.c that
THP page splitting can be blocked by disabling the IRQs on a processor.
It does so by introducing arch_block_thp_split and arch_unblock_thp_split.

As well as fixing a problem where futexes on THP tails cause hangs on
ARM, I expect this series to also be beneficial for direct-IO, and for
KVM (the hva_to_pfn fast path uses __get_user_pages_fast).

Any comments would be greatly appreciated.

Steve Capper (2):
  thp: Introduce arch_(un)block_thp_split
  arm: mm: implement get_user_pages_fast

 arch/arm/include/asm/mmu.h            |   1 +
 arch/arm/include/asm/pgalloc.h        |   9 ++
 arch/arm/include/asm/pgtable-2level.h |   1 +
 arch/arm/include/asm/pgtable-3level.h |  21 +++
 arch/arm/include/asm/pgtable.h        |  18 +++
 arch/arm/include/asm/tlb.h            |   8 ++
 arch/arm/mm/Makefile                  |   2 +-
 arch/arm/mm/gup.c                     | 234 ++++++++++++++++++++++++++++++++++
 include/linux/huge_mm.h               |  16 +++
 kernel/futex.c                        |   6 +-
 10 files changed, 312 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm/mm/gup.c

--
1.8.1.4