Hi All,
This is a V2 resend of my sched_select_cpu() work. Resending because V2 didn't
get much attention; including more people in cc now :)
In order to save power, it would be useful to schedule work onto non-IDLE cpus
instead of waking up an IDLE one.
To achieve this, we need the scheduler to tell kernel frameworks (like timers &
workqueues) which CPU is preferred for such work.
This patchset is about implementing this concept.
- The first patch adds a sched_select_cpu() routine which returns the preferred
  (non-idle) cpu; a usage sketch follows this list.
- The second patch removes the idle_cpu() calls from timer & hrtimer.
- The third patch adapts the workqueue framework to this change.
- The fourth patch adds the capability to migrate a running timer.
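To illustrate the intent, here is a minimal sketch of how a framework such as
the workqueue code might consume the new helper. This is not the actual patch;
the prototype of sched_select_cpu(), the sd_flags argument and the helper name
choose_target_cpu() are assumptions for illustration only (see patches 1/4 and
3/4 for the real code):

/*
 * Illustrative sketch only; not the actual patch.  The prototype of
 * sched_select_cpu() and the WORK_CPU_UNBOUND handling below are
 * assumptions made for this example.
 */
#include <linux/workqueue.h>

/* Assumed prototype: returns a preferred, non-idle CPU. */
extern int sched_select_cpu(unsigned int sd_flags);

int choose_target_cpu(int requested_cpu)
{
	/*
	 * If the caller didn't pin the work to a particular CPU, ask the
	 * scheduler for a non-idle CPU instead of blindly using the local,
	 * possibly idle, one.
	 */
	if (requested_cpu == WORK_CPU_UNBOUND)
		return sched_select_cpu(0);

	/* An explicit CPU request is honoured as before. */
	return requested_cpu;
}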
Earlier discussions over v1 can be found here:
http://www.mail-archive.com/linaro-dev@lists.linaro.org/msg13342.html
Earlier discussions on this concept took place at the last LPC:
http://summit.linuxplumbersconf.org/lpc-2012/meeting/90/lpc2012-sched-timer…
Module created for testing this behavior is present here:
http://git.linaro.org/gitweb?p=people/vireshk/module.git;a=summary
Following are the steps followed in the test module (a rough sketch of this
flow is included after the list):
1. Run a single work on each cpu.
2. This work starts a timer after x (tested with 10) jiffies of delay.
3. The timer routine queues a work (this may happen from an idle or a non-idle
   cpu) and starts the same timer again. Step 3 is repeated n times, i.e. n
   works are queued one after the other.
4. All works call a single routine, which counts the following per cpu:
   - Total works processed by a CPU
   - Total works processed by a CPU which were queued from it
   - Total works processed by a CPU which weren't queued from it
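For reference, here is a rough, illustrative sketch of the flow above. This is
not the actual module (see the git link above for the real code): all names are
made up, the bookkeeping is simplified to a single timer/work pair, and the
timer API of that kernel era is used.

/* Illustrative sketch of the test flow above; not the actual module. */
#include <linux/init.h>
#include <linux/jiffies.h>
#include <linux/module.h>
#include <linux/percpu.h>
#include <linux/smp.h>
#include <linux/timer.h>
#include <linux/workqueue.h>

static struct timer_list test_timer;
static struct work_struct test_work;
static DEFINE_PER_CPU(unsigned long, works_total);
static DEFINE_PER_CPU(unsigned long, works_own);
static DEFINE_PER_CPU(unsigned long, works_migrated);
static int queuing_cpu;		/* cpu the last work was queued from */
static int remaining = 1000;	/* 'n' in the steps above */

/* Step 4: every work lands here and updates the per-cpu counters. */
static void test_work_fn(struct work_struct *work)
{
	this_cpu_inc(works_total);
	if (raw_smp_processor_id() == queuing_cpu)
		this_cpu_inc(works_own);
	else
		this_cpu_inc(works_migrated);
}

/* Step 3: the timer queues a work (possibly from an idle cpu) and rearms. */
static void test_timer_fn(unsigned long data)
{
	queuing_cpu = smp_processor_id();
	queue_work(system_wq, &test_work);

	if (--remaining > 0)
		mod_timer(&test_timer, jiffies + 10);
}

/* Steps 1-2: arm the timer once; it keeps itself running after that. */
static int __init wq_test_init(void)
{
	INIT_WORK(&test_work, test_work_fn);
	setup_timer(&test_timer, test_timer_fn, 0);
	mod_timer(&test_timer, jiffies + 10);
	return 0;
}
module_init(wq_test_init);
MODULE_LICENSE("GPL");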
Setup:
-----
- ARM Vexpress TC2 - big.LITTLE CPU
- Cores 0-1: A15, cores 2-4: A7
- rootfs: linaro-ubuntu-nano
Results:
-------
Without the workqueue modification (PATCH 3/3):
[ 2493.022335] Workqueue Analyser: works processsed by CPU0, Total: 1000, Own: 0, migrated: 0
[ 2493.047789] Workqueue Analyser: works processsed by CPU1, Total: 1000, Own: 0, migrated: 0
[ 2493.072918] Workqueue Analyser: works processsed by CPU2, Total: 1000, Own: 0, migrated: 0
[ 2493.098576] Workqueue Analyser: works processsed by CPU3, Total: 1000, Own: 0, migrated: 0
[ 2493.123702] Workqueue Analyser: works processsed by CPU4, Total: 1000, Own: 0, migrated: 0
With the workqueue modification (PATCH 3/3):
[ 2493.022335] Workqueue Analyser: works processsed by CPU0, Total: 1002, Own: 999, migrated: 3
[ 2493.047789] Workqueue Analyser: works processsed by CPU1, Total: 998, Own: 997, migrated: 1
[ 2493.072918] Workqueue Analyser: works processsed by CPU2, Total: 1013, Own: 996, migrated: 17
[ 2493.098576] Workqueue Analyser: works processsed by CPU3, Total: 998, Own: 993, migrated: 5
[ 2493.123702] Workqueue Analyser: works processsed by CPU4, Total: 989, Own: 987, migrated: 2
V2->V2-Resend
-------------
- Included timer migration patch in the same thread.
V1->V2
-----
- The new SD_* macros are removed; the existing ones are used instead.
- sched_select_cpu() is rewritten and now includes a check of the current cpu's
  idleness.
- The idle_cpu() calls from timer and hrtimer are removed now.
- Patch 2/3 from V1 is dropped, as it doesn't apply to the latest workqueue
  branch from tejun.
- CONFIG_MIGRATE_WQ is removed, and so is wq_select_cpu().
- sched_select_cpu() is called only from __queue_work().
- Pulled tejun's for-3.7 branch into my tree before making the workqueue
  changes.
Viresh Kumar (4):
sched: Create sched_select_cpu() to give preferred CPU for power
saving
timer: hrtimer: Don't check idle_cpu() before calling
get_nohz_timer_target()
workqueue: Schedule work on non-idle cpu instead of current one
timer: Migrate running timer
include/linux/sched.h | 16 ++++++++++--
include/linux/timer.h | 2 ++
kernel/hrtimer.c | 2 +-
kernel/sched/core.c | 69 +++++++++++++++++++++++++++++++--------------------
kernel/timer.c | 50 ++++++++++++++++++++++---------------
kernel/workqueue.c | 4 +--
6 files changed, 91 insertions(+), 52 deletions(-)
--
1.7.12.rc2.18.g61b472e
Hi Guys,
I really don't know whom to direct this mail to, hence the wide spread.

Problem: When we send a mail to the kernel mailing lists with linaro-dev or
linaro-kernel in cc and we get replies to those mails, the replies from people
outside Linaro sometimes don't reach us back on the linaro lists. I assume the
reason is that those people aren't subscribed to these lists.

To me it makes sense to allow anyone to send mails to this list. Can that
request be considered?

I believe the idea behind blocking such use is to protect against spam, but
these replies are really important and we certainly need them delivered to us.

One solution (I don't know if it's possible) would be to moderate mails from
non-subscribers and have a few people from Linaro approve them on a
daily/hourly basis, so that we don't get any spam, but that would be a burden.
--
viresh
While studying why the kernel copy from NOR was so slow on our platform, I
realized U-Boot is needlessly pulling it from 32-bit NOR in 8-bit chunks.
bootm uses memmove(), which by default just moves u8s around.
This optimization prefers the memcpy() implementation (done mostly with 32-bit
reads and writes) when source and dest don't overlap, resulting in a huge
speedup on our platform (480ms copy from 32-bit NOR ---> 140ms).
Signed-off-by: Andy Green <andy.green(a)linaro.org>
---
lib/string.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/lib/string.c b/lib/string.c
index c3ad055..96d66e0 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -542,13 +542,21 @@ void * memmove(void * dest,const void *src,size_t count)
if (src == dest)
return dest;
- if (dest <= src) {
+ if (dest < src) {
+
+ if ((unsigned long)dest + count <= (unsigned long)src)
+ return memcpy(dest, src, count);
+
tmp = (char *) dest;
s = (char *) src;
while (count--)
*tmp++ = *s++;
}
else {
+
+ if ((unsigned long)src + count <= (unsigned long)dest)
+ return memcpy(dest, src, count);
+
tmp = (char *) dest + count;
s = (char *) src + count;
while (count--)
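The quoted hunk appears to be cut off above, so here is a hedged sketch of what
the whole memmove() ends up looking like with the change applied. The tail of
the backwards-copy branch is reconstructed from the pre-existing U-Boot code,
so treat it as illustrative rather than authoritative:

/*
 * Illustrative reconstruction of memmove() with the patch applied, as it
 * would read in lib/string.c (memcpy() and size_t come from the file's
 * existing includes).
 */
void *memmove(void *dest, const void *src, size_t count)
{
	char *tmp;
	const char *s;

	if (src == dest)
		return dest;

	if (dest < src) {
		/* Regions don't overlap: use the word-wise memcpy(). */
		if ((unsigned long)dest + count <= (unsigned long)src)
			return memcpy(dest, src, count);

		/* Overlap: fall back to a forward byte copy. */
		tmp = (char *)dest;
		s = (const char *)src;
		while (count--)
			*tmp++ = *s++;
	} else {
		/* Regions don't overlap: use the word-wise memcpy(). */
		if ((unsigned long)src + count <= (unsigned long)dest)
			return memcpy(dest, src, count);

		/* Overlap: copy backwards, byte by byte. */
		tmp = (char *)dest + count;
		s = (const char *)src + count;
		while (count--)
			*--tmp = *--s;
	}

	return dest;
}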
Hi,
I have some doubts about the backup and restore operation. From what I
understand, we copy the register values & addresses of all the controllers in
the SoC to the internal RAM (SRAM) before we put the CPU to sleep.

I want to know whether we also copy a code segment into SRAM, and what happens
after wakeup. If so, where exactly do we need to copy it, and how does the CPU
jump there after wakeup? Or is some other mechanism used? What executes first,
given that the DDR is in self-refresh?

I use OMAP4 and this is my understanding. I could not understand much beyond
the fact that in OFF mode all the controller registers get copied to SRAM.
Does anything else get copied too, or am I missing some basics here?

Can anyone clarify my understanding?
/Ryan
Hi all,
Just a reminder that this is happening tomorrow unless anyone shouts now. I've already off-lined the fast models and all vexpress boards, because they take longer to clear the queue.
Please shout now if this is a problem for you.
Thanks
Dave
On 22 Jul 2013, at 14:02, Dave Pigott <dave.pigott(a)linaro.org> wrote:
> Hi all,
>
> We need to start the switch over from launchpad single sign on, and to do this there will need to be a very small downtime of the LAVA server - it should be less than 5 minutes, maybe less than a minute if I do it right.
>
> I intend to do this on Wednesday at 11:00UTC (12:00BST). I would do it tomorrow, but I have a doctor's appointment in the morning.
>
> Thanks for your patience
>
> Dave