This is a resend of V2 of my sched_select_cpu() work. Resending because V2
didn't get much attention; including more people on cc now. :)
In order to save power, it would be useful to schedule work onto non-IDLE cpus
instead of waking up an IDLE one.
To achieve this, we need the scheduler to guide kernel frameworks (like timers &
workqueues) on which CPU is most preferred for such activity.
This patchset implements this concept.
- The first patch adds a sched_select_cpu() routine, which returns the
  preferred, non-idle CPU.
- The second patch removes idle_cpu() calls from timer & hrtimer.
- The third patch adapts the workqueue framework to this change.
- The fourth patch adds the capability to migrate a running timer.
Earlier discussions over v1 can be found here:
Earlier discussions over this concept were done at last LPC:
Module created for testing this behavior is present here:
Following are the steps followed in the test module:
1. Run a single work on each CPU.
2. This work starts a timer after x (tested with 10) jiffies of delay.
3. The timer routine queues a work (this may happen from an idle or non-idle
   CPU) and starts the same timer again. Step 3 is repeated n times (i.e.
   queuing n works, one after the other).
4. All works call a single routine, which counts the following per CPU:
- Total works processed by a CPU
- Total works processed by a CPU, which are queued from it
- Total works processed by a CPU, which aren't queued from it
Test setup:
- ARM Vexpress TC2 - big.LITTLE CPU
- Core 0-1: A15, 2-4: A7
- rootfs: linaro-ubuntu-nano
Without the workqueue modification, i.e. without PATCH 3/3:
[ 2493.022335] Workqueue Analyser: works processed by CPU0, Total: 1000, Own: 0, migrated: 0
[ 2493.047789] Workqueue Analyser: works processed by CPU1, Total: 1000, Own: 0, migrated: 0
[ 2493.072918] Workqueue Analyser: works processed by CPU2, Total: 1000, Own: 0, migrated: 0
[ 2493.098576] Workqueue Analyser: works processed by CPU3, Total: 1000, Own: 0, migrated: 0
[ 2493.123702] Workqueue Analyser: works processed by CPU4, Total: 1000, Own: 0, migrated: 0
With the workqueue modification, i.e. with PATCH 3/3:
[ 2493.022335] Workqueue Analyser: works processed by CPU0, Total: 1002, Own: 999, migrated: 3
[ 2493.047789] Workqueue Analyser: works processed by CPU1, Total: 998, Own: 997, migrated: 1
[ 2493.072918] Workqueue Analyser: works processed by CPU2, Total: 1013, Own: 996, migrated: 17
[ 2493.098576] Workqueue Analyser: works processed by CPU3, Total: 998, Own: 993, migrated: 5
[ 2493.123702] Workqueue Analyser: works processed by CPU4, Total: 989, Own: 987, migrated: 2
Changes since V1:
- Included the timer migration patch in the same thread.
- The new SD_* macros are removed now and the existing ones are used.
- sched_select_cpu() rewritten; it now includes a check on the current CPU
  first.
- idle_cpu() calls from timer and hrtimer removed now.
- Patch 2/3 from V1 removed, as it doesn't apply to the latest workqueue
  branch.
- CONFIG_MIGRATE_WQ removed, and so is wq_select_cpu().
- sched_select_cpu() is called only from __queue_work().
- Pulled the tejun/for-3.7 branch into my tree before making the workqueue
  changes.
Viresh Kumar (4):
sched: Create sched_select_cpu() to give preferred CPU for power
timer: hrtimer: Don't check idle_cpu() before calling
workqueue: Schedule work on non-idle cpu instead of current one
timer: Migrate running timer
include/linux/sched.h | 16 ++++++++++--
include/linux/timer.h | 2 ++
kernel/hrtimer.c | 2 +-
kernel/sched/core.c | 69 +++++++++++++++++++++++++++++++--------------------
kernel/timer.c | 50 ++++++++++++++++++++++---------------
kernel/workqueue.c | 4 +--
6 files changed, 91 insertions(+), 52 deletions(-)
This series adds AArch64 support and makes some minor tweaks.
The first two patches of this series add AArch64 support to
libhugetlbfs. (Starting from 3.11-rc1, the Linux kernel supports
HugeTLB and THP for ARM64.)
Some general changes are also made:
PROT_NONE is added to the mprotect unit test, and the
linkhuge_rw test is enabled for 64 bit where there aren't any
custom linker scripts.
The final patch clears up the superfluous ARM ld.hugetlbfs
HTLB_LINK logic.
Any comments would be appreciated.
Steve Capper (5):
Aarch64 unit test fixes.
Add PROT_NONE to the mprotect test.
Add linkhuge_rw test to 64 bit && !CUSTOM_LDSCRIPTS
Cleanup ARM ld.hugetlbfs HTLB_LINK logic
Makefile | 7 +++++++
ld.hugetlbfs | 7 +------
sys-aarch64elf.S | 34 ++++++++++++++++++++++++++++++++++
tests/Makefile | 2 +-
tests/icache-hygiene.c | 7 ++++---
tests/mprotect.c | 6 ++++++
tests/mremap-expand-slice-collision.c | 2 +-
7 files changed, 54 insertions(+), 11 deletions(-)
create mode 100644 sys-aarch64elf.S
Debconf13 (last week) considered the matter of bare-metal
cross-toolchains in Debian. Ideally we would have one toolchain source
package from which the existing Linux native compilers and
cross-compilers are built, including bare-metal cross-compilers. There
is already a mechanism for adding patches for particular gcc builds, so
as long as the patch set is manageable and trackable, this would be
nice, and futureproof, as eventually the patch set should just
evaporate as it gets upstreamed.
The alternative is to simply repack the existing linaro
cross-toolchain sources, but then we get to keep doing that for new
releases, and we have gratuitous extra copies of gcc sources and
corresponding differences between A* and M* toolchains/versions.
The linaro embedded toolchains are good, and work, both for M0 and
M3. But building nominally the same thing from upstream gcc gets
something where M3 works but M0 doesn't.
Also they are gcc 4.7 based whilst Debian is moving to a 4.8 default.
We peered at checkouts from linaro and upstream and tried to work out
what the linaro patch-set for this toolchain is, and exactly where it
branched off upstream, but it was confusing with a lot of noise due to
version skew around some actually relevant changes.
So, in order to work out whether we can in fact build our bare-metal
toolchains from the same sources as the existing toolchains, we need to
know what the actual patch set you are maintaining looks like, what has
already been upstreamed in which gcc branch/release, and when the
remaining patches will go upstream. Also what the 4.7 vs 4.8 status
is. Knowing how this stuff is tracked might be even more useful over
the longer term.
Is there such info online somewhere? If not, please elaborate. A
mechanism for keeping the (newly-formed) Debian cross-compiler team
sufficiently in the loop is probably what's needed in the longer term,
unless this is all just about to get upstreamed anyway and the issue
will soon become moot...?
There was also discussion around the concept of making existing
linux-arm cross-compilers, with M0 and M3 support included, and using
spec-file jiggery-pokery to get them to DTRT for M* targets. This
should be possible, but advice on the gotchas from anyone who's ever
actually tried it would be good.
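For illustration only, the sort of spec-file fragment being talked about might
look something like this (entirely hypothetical and untested; the real
multilib paths and option mappings would need checking against the actual
toolchain):

```
# Hypothetical gcc spec fragment: map Cortex-M targets to sensible
# defaults when using a linux-arm cross-compiler for bare metal.
*self_spec:
+ %{mcpu=cortex-m0:-mthumb -march=armv6s-m} \
  %{mcpu=cortex-m3:-mthumb -march=armv7-m}
```

The `%{switch:text}` construct substitutes `text` when the given switch is
present; whether this is enough to DTRT for M* (library paths, startup files,
default specs) is exactly the kind of gotcha being asked about.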
Principal hats: Linaro, Emdebian, Wookware, Balloonboard, ARM
I tried to build a linaro kernel for my pandaboard. I have tried everything
I can think of, but the kernel still can't boot correctly. Any help would be
appreciated. Here is what I did:
1. I flashed the 13.07 linaro-ubuntu-pandaboard image onto the SD card (
http://releases.linaro.org/13.07/ubuntu/panda). This image works fine.
2. I cloned the kernel source code from git://
3. checkout the lsk 13.07 tag.
4. I copied the config file from the original image (i.e.
5. make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- CFLAGS="-O"
The cross compiler on my machine is gcc-4.7-arm-linux-gnueabihf-base
6. From the pandaboard, I loaded the built uImage via scp.
The problems I met:
1. In most cases, the kernel can't boot correctly, and the terminal keeps
printing "hub 1-1:1.0: hub_port_status failed (err = -71)".
2. Sometimes the kernel can finish booting, but the terminal prints the
aforementioned error message several times after booting. Also, no modules
are running; 'lsmod' shows nothing.
Please help! Thank you.
I wanted to know: does the ondemand governor scale the frequency down to
the minimum when the load falls below up_threshold?
I understand that
1) Every now and then, the governor work queue runs and checks for
"load" in percentage.
If load > 95 ; it bumps the frequency to maximum;
ie. if the idle time is less than 5% ; the cpu will run at maximum
2) If load < 90%, it bumps the frequency down a little bit.
   It keeps doing this until the minimum is reached, so the rate of
   decrease of frequency is very slow; the frequency will take a long
   time to reach the minimum (depending on the sampling time).
Is this correct?