This patchset was called: "Create sched_select_cpu() and use it for workqueues"
for the first three versions.
Earlier discussions over v3, v2 and v1 can be found here:
https://lkml.org/lkml/2013/3/18/364
http://lists.linaro.org/pipermail/linaro-dev/2012-November/014344.html
http://www.mail-archive.com/linaro-dev@lists.linaro.org/msg13342.html
V4 is here:
https://lkml.org/lkml/2013/3/31/55
Workqueues can be performance or power oriented. For performance we may want to
keep them running on a single CPU, so that it remains cache hot. For power we
can give the scheduler the liberty to choose the target CPU for running the
work handler.
The latter (a power-oriented WQ) can be achieved if the workqueue is allocated
with the WQ_UNBOUND flag. Enabling CONFIG_WQ_POWER_EFFICIENT sets
'wq_power_efficient' to 'true'. Setting the 'workqueue.power_efficient' boot
param overrides the value of the 'wq_power_efficient' variable. When
'wq_power_efficient' is 'true', the WQ_POWER_EFFICIENT flag is converted to
WQ_UNBOUND on wq allocation, so the scheduler has the liberty to choose where
to run this work.
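
The flag handling described above can be modelled in a few lines of userspace
C. This is only a sketch: the bit values and the helper name
wq_effective_flags() are made up for illustration and are not the kernel's
definitions. Only the names WQ_UNBOUND, WQ_POWER_EFFICIENT and
wq_power_efficient come from the description above.

```c
#include <stdbool.h>

/* Illustrative bit values only; not the kernel's actual definitions. */
#define WQ_UNBOUND          (1u << 1)
#define WQ_POWER_EFFICIENT  (1u << 7)

/*
 * Hypothetical helper modelling the allocation-time conversion:
 * a WQ_POWER_EFFICIENT workqueue is turned into an unbound one only
 * when the power-efficient mode is switched on (by config or bootarg).
 */
unsigned int wq_effective_flags(unsigned int flags, bool wq_power_efficient)
{
	if ((flags & WQ_POWER_EFFICIENT) && wq_power_efficient)
		flags |= WQ_UNBOUND;

	return flags;
}
```

With the mode switched off the flags pass through untouched, which matches the
statement that behavior is unchanged unless the relevant config or bootarg is
set.
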
Here we are migrating a few users of workqueues to WQ_POWER_EFFICIENT. These
drivers were found to be very active on an idle or lightly busy system, and
using WQ_POWER_EFFICIENT for them noticeably reduced energy consumption (see
Results).
They run in power-saving mode only if the relevant configs are enabled at
compile time or in bootargs. Otherwise behavior is unchanged.
Setup:
-----
- ARM Vexpress TC2 - big.LITTLE CPU
- Core 0-1: A15, 2-4: A7
- rootfs: linaro-ubuntu-devel
This patchset has been tested on a big.LITTLE (heterogeneous) system but is
useful for homogeneous systems as well. During these tests audio was played in
the background using aplay.
Results:
-------
Cluster A15 Energy    Cluster A7 Energy    Total
------------------    -----------------    ---------

Without this patchset (Energy in Joules):
-----------------------------------------
0.151162              2.183545             2.334707
0.223730              2.687067             2.910797
0.289687              2.732702             3.022389
0.454198              2.745908             3.200106
0.495552              2.746465             3.242017
Average:
0.322866              2.619137             2.942003

With this patchset (Energy in Joules):
--------------------------------------
0.226421              2.283658             2.510079
0.151361              2.236656             2.388017
0.197726              2.249849             2.447575
0.221915              2.229446             2.451361
0.347098              2.257707             2.604805
Average:
0.2289042             2.2514632            2.4803674

The above tests were repeated multiple times; events were tracked using
trace-cmd and analysed using kernelshark. It was easily noticeable that idle
time for many CPUs increased considerably, which in turn saved power.
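
The averages in the Results table, and the relative saving they imply, can be
rechecked with a small C sketch. The per-run totals are copied from the table;
the array and function names are invented here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Total energy per run in Joules, copied from the Results table. */
const double total_without[] = { 2.334707, 2.910797, 3.022389, 3.200106, 3.242017 };
const double total_with[]    = { 2.510079, 2.388017, 2.447575, 2.451361, 2.604805 };

/* Arithmetic mean of n samples; n must be non-zero. */
double energy_mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];

	return sum / (double)n;
}

/* Relative saving in percent when energy drops from 'before' to 'after'. */
double saving_percent(double before, double after)
{
	return (before - after) / before * 100.0;
}
```

Feeding the two means into saving_percent() gives a reduction in average total
energy of roughly 15.7% for these five runs.
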
V4->V5:
-------
- Created new wq flag: WQ_POWER_EFFICIENT, config option:
CONFIG_WQ_POWER_EFFICIENT and kernel param workqueue.power_efficient.
- Created a few system-wide workqueues aligned towards power saving.
V3->V4:
-------
- Dropped changes to kernel/sched directory and hence
sched_select_non_idle_cpu().
- Dropped queue_work_on_any_cpu()
- Created system_freezable_unbound_wq
- Changed all patches accordingly.
V2->V3:
-------
- Dropped changes to the core queue_work() API; created *_on_any_cpu() APIs
instead
- Dropped the running-timers migration patch as it was broken
- Migrated a few users of workqueues to the *_on_any_cpu() APIs.
Viresh Kumar (5):
  workqueues: Introduce new flag WQ_POWER_EFFICIENT for power oriented
    workqueues
  workqueue: Add system wide power_efficient workqueues
  PHYLIB: queue work on system_power_efficient_wq
  block: queue work on power efficient wq
  fbcon: queue work on power efficient wq

 Documentation/kernel-parameters.txt | 17 +++++++++++++++++
 block/blk-core.c                    |  3 ++-
 block/blk-ioc.c                     |  3 ++-
 block/genhd.c                       | 12 ++++++++----
 drivers/net/phy/phy.c               |  9 +++++----
 drivers/video/console/fbcon.c       |  2 +-
 include/linux/workqueue.h           | 10 ++++++++++
 kernel/power/Kconfig                | 19 +++++++++++++++++++
 kernel/workqueue.c                  | 24 +++++++++++++++++++++++-
 9 files changed, 87 insertions(+), 12 deletions(-)
--
1.7.12.rc2.18.g61b472e
This patch series adds the LCD backlight and LCD enable GPIO pins to the
dp-controller DT node of exynos5250-smdk5250, and adds parsing of these GPIO
pins to the exynos-dp driver.
Tested on the exynos5250-smdk5250 board.
Rebased on the kgene-next branch of
https://git.kernel.org/cgit/linux/kernel/git/kgene/linux-samsung.git/
Vikas Sajjan (2):
  video: exynos_dp: Add parsing of gpios pins to exynos-dp driver
  ARM: dts: Add LCD backlight and LCD enable gpios pins to
    dp-controller DT node

 arch/arm/boot/dts/exynos5250-smdk5250.dts |  3 ++
 drivers/video/exynos/exynos_dp_core.c     | 45 +++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)
--
1.7.9.5
From: Sukanto Ghosh <sghosh(a)apm.com>
The format of the lower 32-bits of the 64-bit operand to 'dc cisw' is
unchanged from ARMv7 architecture and the upper bits are RES0. This
implies that the 'way' field of the operand of 'dc cisw' occupies the
bit-positions [31 .. (32-A)]. Due to the use of 64-bit extended operands
to 'clz', the existing implementation of __flush_dcache_all is incorrectly
placing the 'way' field in the bit-positions [63 .. (64-A)].
Signed-off-by: Sukanto Ghosh <sghosh(a)apm.com>
Tested-by: Anup Patel <anup.patel(a)linaro.org>
---
arch/arm64/mm/cache.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index abe69b8..48a3860 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -52,7 +52,7 @@ loop1:
 	add	x2, x2, #4		// add 4 (line length offset)
 	mov	x4, #0x3ff
 	and	x4, x4, x1, lsr #3	// find maximum number on the way size
-	clz	x5, x4			// find bit position of way size increment
+	clz	w5, w4			// find bit position of way size increment
 	mov	x7, #0x7fff
 	and	x7, x7, x1, lsr #13	// extract max number of the index size
 loop2:
--
1.7.9.5
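
The effect of the operand width on the position of the 'way' field, as
described in the 'dc cisw' fix above, can be reproduced in plain C. This is a
sketch under assumptions: the helper names are invented, the leading-zero count
is a portable loop standing in for the CLZ instruction, and a 4-way cache
(maximum way number 3, so A = 2) is picked only as an example.

```c
#include <stdint.h>

/* Count leading zeros within the low 'width' bits of v (width is 32 or 64). */
unsigned int clz_width(uint64_t v, unsigned int width)
{
	unsigned int n = 0;

	while (n < width && !(v & ((uint64_t)1 << (width - 1 - n))))
		n++;

	return n;
}

/*
 * Shift the way number into place, using a leading-zero count taken
 * over an operand of the given width, as the flush loop does.
 */
uint64_t place_way(uint64_t way, uint64_t max_way, unsigned int width)
{
	unsigned int shift = clz_width(max_way, width);

	/* A shift by 64 would be undefined in C; only reachable for max_way == 0. */
	return shift < 64 ? way << shift : 0;
}
```

For max_way = 3, the 32-bit count gives a shift of 30 and places way 3 at bits
[31:30], which is where the operand format expects it. The 64-bit count gives
62 and places the same value at bits [63:62], which is the misplacement the
patch corrects.
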
=== David Long ===
=== Highlights ===
* Tixy voiced doubts about the approach I used for uprobes being
upstreamable due to some redundancy in the kprobe/uprobe code. We
talked about an alternative design which I now have partially implemented.
* Helped relocate the Nashua office to its new space.
=== Plans ===
* Continue with uprobe/kprobe
* Start building systemtap
=== Issues ===
* None
-dl
Hi Nico & all,
After studying the IKS code, we believe it is general and clean, and that it
can meet almost all of our own SoC's requirements; we also have some questions
we would like to confirm with you:
1. When the outbound core wakes up the inbound core, the outbound core's
thread sleeps until the inbound core uses MCPM's early poke to send an IPI;
a) It looks like this method is used because the TC2 board has a long latency
for powering the cluster and core on/off; right? How about using a polling
method instead? On our own SoC the wake-up takes _only_ about 10 ~ 20us;
b) The inbound core sends an IPI to the outbound core for synchronization,
but at this point the inbound core's GIC CPU interface is disabled; so even
with its CPU interface disabled, can the core send an SGI to other cores?
c) The MCPM patchset merged in mainline has no functions for the early poke,
so will the early poke related functions be submitted to mainline later?
2. The switch is currently an async operation, which means that after
bL_switch_request() returns we cannot say the switch has completed; we have
some concerns about this.
For example, when switching from an A15 core to an A7 core, we may want to
decrease the voltage to save power; if the switch is an async operation, the
following issue can arise: after bL_switch_request() returns, s/w decreases
the voltage, but in the meantime the real switch is still ongoing on another
pair of cores.
I browsed the git log and learned that at the beginning the switch was
synchronous, using the kernel's workqueue, and was later changed to use a
dedicated kernel thread with FIFO scheduling; do you think it's better to go
ahead and add a sync method for switching?
3. Enabling the switcher disables hotplug.
Actually the current code could support hotplug with IKS; with IKS, the
logical core maps to the corresponding physical core id and GIC interface id,
so whichever physical core the system has hotplugged out, the kernel can later
hotplug that same physical core back in.
So could you give more hints on why IKS needs to disable hotplug?
--
Thx,
Leo Yan
=== Highlights ===
* Did a first pass at addressing remapping issue for volatile ranges
(still waiting on Minchan's feedback)
* Updated Anton's KDB/FIQ patch queues. Pinged Jason and sent the short
list out to lkml for feedback.
* Cherry-picked the ION code into a dev tree against 3.10 to learn more
about it in prep for future discussions.
* Reached out to Arnd on ION dma questions.
* Updated linaro.android tree to pre-3.10-rc1, ran into some trouble
testing since panda wasn't booting, but Tixy helped with testing and
Kevin followed up with a solution for panda.
* Sent out some requests-for-participation emails for Linux Plumbers
Android miniconf
* Talked w/ Deepak about drivers/clocksource issues and other planning
* Reviewed YongQin's get_user macro fix
* Reviewed blueprints and had bi-weekly hangout with android upstreaming
team
* Reviewed Dmitry's vfat ioctl patch
* Reviewed Zoran's suspend watchdog patchset & discussed plans for
suspend-time logging
=== Plans ===
* Get feedback from Minchan and send volatile ranges to lkml
* Probably more LPC minisummit planning
* Hopefully more ION research/discussion
=== Issues ===
* NA
Hey Kevin,
Sorry to pester you, but you've always been helpful with these sort
of questions.
I'm trying to test a pre-3.10-rc1 kernel on panda, and it's hanging after
"Starting kernel ..."
3.9 boots fine, and I was curious if you had any hints as to what new
config magic I need to get things going.
Attached are my good/bad configs.
thanks
-john