On Sun, Mar 17, 2013 at 6:37 AM, Bruce Dawson <bruced(a)valvesoftware.com> wrote:
> Dave/others, I've come up with a simple (and real) scenario where a CPU bound task running on Ubuntu (and presumably other Linux flavors) fails to be detected as CPU bound by the Linux kernel, meaning that the CPU continues to run at low speed, meaning that this CPU bound task takes (on my machines) about three times longer to run than it should.
>
> I found these e-mail addresses in the MAINTAINERS list under CPU FREQUENCY DRIVERS which I'm hoping is the correct area.
Yes, cpufreq mailing list is the right list for this.
> The basic problem is that on a multi-core system if you run a shell script that spawns lots of sub processes then the workload ends up distributed across all of the CPUs. Therefore, since none of the CPUs are particularly busy, the Linux kernel doesn't realize that a CPU bound task is running, so it leaves the CPU frequency set to low. I have confirmed the behavior in multiple ways. Specifically, I have used "iostat 1" and "mpstat -P ALL 1" to confirm that a full core's worth of CPU work is being done. mpstat also showed that the work was distributed across multiple cores. Using the zoom profiler UI for perf showed the sub processes (and bash) being spread across multiple cores, and perf stat showed that the CPU frequency was staying low even though the task was CPU bound.
There are a few things which would be helpful to understand what's going on.
What governor is used in your case? Probably ondemand (my Ubuntu uses this).
Ideally, the CPU frequency is increased only if the CPU load is very high (above the threshold, 95 on my Ubuntu). So the frequency might not be increased if multiple CPUs are running a specific task and none of them has a high enough load at that time.
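To make the threshold argument concrete, here is a minimal bash sketch. It is not the kernel's ondemand code: `any_cpu_above_threshold` and `decide` are hypothetical helpers, and the per-CPU load figures are made-up illustrative values. It only shows why one core's worth of work spread over twelve logical CPUs never crosses a threshold of 95, while the same work on a single CPU does. The last lines read the active governor from the standard cpufreq sysfs file, if the machine exposes it.

```bash
#!/bin/bash
# Illustration only, not kernel code. Succeeds (exit status 0) if any of
# the given per-CPU loads (integer percent) is above the threshold.
any_cpu_above_threshold() {
    local threshold=$1 load
    shift
    for load in "$@"; do
        if [ "$load" -gt "$threshold" ]; then
            return 0
        fi
    done
    return 1
}

# Prints the decision for a threshold followed by a set of per-CPU loads.
decide() {
    if any_cpu_above_threshold "$@"; then
        echo "raise frequency"
    else
        echo "stay at low frequency"
    fi
}

# One core's worth of work spread over 12 logical CPUs (assumed ~8% each).
decide 95 8 8 8 8 8 8 8 8 8 8 8 8

# The same work concentrated on a single CPU.
decide 95 100 0 0 0 0 0 0 0 0 0 0 0

# Show the governor in use, if this machine exposes cpufreq in sysfs.
gov_file=/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
if [ -r "$gov_file" ]; then
    echo "governor: $(cat "$gov_file")"
fi
```

The first call reports that the frequency stays low and the second that it is raised, which matches the behaviour described in the report.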
The other thing I suspect here is a bug which was fixed recently by the patch below. If policy->cpu (that might be cpu 0 for you) is sleeping, then load is never evaluated, even if all other CPUs are very busy. If you can try the patch below, it might be helpful. BTW, you might not be able to apply it easily as it has a lot of dependencies, so you might need to pick all drivers/cpufreq patches from v3.9-rc1.
commit 2abfa876f1117b0ab45f191fb1f82c41b1cbc8fe
Author: Rickard Andersson <rickard.andersson(a)stericsson.com>
Date: Thu Dec 27 14:55:38 2012 +0000
cpufreq: handle SW coordinated CPUs
This patch fixes a bug that occurred when we had load on a secondary CPU
and the primary CPU was sleeping. Only one sampling timer was spawned
and it was spawned as a deferred timer on the primary CPU, so when a
secondary CPU had a change in load this was not detected by the cpufreq
governor (both ondemand and conservative).
This patch make sure that deferred timers are run on all CPUs in the
case of software controlled CPUs that run on the same frequency.
Signed-off-by: Rickard Andersson <rickard.andersson(a)stericsson.com>
Signed-off-by: Fabio Baltieri <fabio.baltieri(a)linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
---
drivers/cpufreq/cpufreq_conservative.c | 3 ++-
drivers/cpufreq/cpufreq_governor.c | 44 ++++++++++++++++++++++++++++++++++++++------
drivers/cpufreq/cpufreq_governor.h | 1 +
drivers/cpufreq/cpufreq_ondemand.c | 3 ++-
4 files changed, 43 insertions(+), 8 deletions(-)
> I have only reproed this behavior on six-core/twelve-thread systems. I would assume that at least a two-core system would be needed to repro this bug, and perhaps more. The bug will not repro if the system is not relatively idle, since a background CPU hog will force the frequency up.
>
> The repro is exquisitely simple -- ExprCount() is a simplified version of the repro (portable looping in a shell script) and BashCount() is an optimized and less portable version that runs far faster and also avoids this power management problem -- the CPU frequency is raised appropriately. Running a busy loop in another process is another way to get the frequency up and this makes ExprCount() run ~3x faster. Here is the script:
>
> --------------------------------------
> #!/bin/bash
> function ExprCount() {
> i=$1
> while [ $i -gt 0 ]; do
> i=$(expr $i - 1)
I may be wrong, but one CPU is used to run this script and another one would be used to run the expr program. So 2 CPUs should be enough to reproduce this.
BTW, I have tried your scripts and was able to reproduce the problem here on a 2-CPU, 4-thread system.
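For readers who want to try this, a sketch of the two loops follows. `ExprCount` mirrors the fragment quoted above; `BashCount` and `WithBusyLoop` are assumed reconstructions based only on Bruce's description (built-in arithmetic, and a busy loop running in another process), not his actual script. The final `echo` in each function and the `|| true` guard are added here purely so the result can be checked and the script is safe under `set -e`.

```bash
#!/bin/bash
# Portable loop: starts an external `expr` process on every iteration,
# so the work alternates between the shell and many short-lived children.
ExprCount() {
    local i=$1
    while [ "$i" -gt 0 ]; do
        # expr exits with status 1 when the result is 0, hence `|| true`.
        i=$(expr "$i" - 1) || true
    done
    echo "$i"
}

# Bash-only loop (assumed form): built-in arithmetic, no child processes.
BashCount() {
    local i=$1
    while [ "$i" -gt 0 ]; do
        i=$((i - 1))
    done
    echo "$i"
}

# Hypothetical helper: runs a command while a busy loop spins in a
# background process, then stops the busy loop and returns the
# command's exit status.
WithBusyLoop() {
    local pid status
    while :; do :; done &
    pid=$!
    "$@"
    status=$?
    kill "$pid"
    wait "$pid" 2>/dev/null || true
    return "$status"
}

ExprCount 500
BashCount 500
WithBusyLoop ExprCount 500
```

Prefixing each call with bash's `time` keyword (for example `time ExprCount 10000`) gives the kind of comparison described in the report; the actual difference depends on the machine and the governor in use.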
--
viresh
=== David Long ===
=== Highlights ===
* Attended Connect. The keynotes were particularly good this time.
Kudos to all who arranged this and those who arranged the weather (not
that I got outside much).
* Although the uprobe code is installing and taking the breakpoint
properly, it is getting lost somewhere when it goes to execute the
probed instruction out-of-line. I've tried to isolate this a couple of
different ways, but no success yet.
=== Plans ===
* Continue isolating the xol problem and restructuring the emulation code.
=== Issues ===
-dl
Hi Guys,
Below are the Hangout recordings of Scheduler Internals by Vincent Guittot,
recorded at LCA13.
We have another version of this recording made with other cameras, but
its size was 30 GB, so it was hard to upload. In case you need it,
please contact me.
Day 1: http://www.youtube.com/watch?v=2yzelou80JE
Day 2: http://www.youtube.com/watch?v=fN11Lltx1nQ
Thanks to Naresh for arranging for hangouts.
--
Viresh
This is to let you know that the migration of lists.linaro.org has been
successfully completed.
As per the email I sent on Wednesday, it may take some time for the new
address of the server to be seen by your computer. You can check this by
trying to connect to the web site:
http://lists.linaro.org/
If you are able to connect and you do not get an error, this means you are
connecting to the new server and you can send email to the lists.
If you experience any problems after the weekend and you find that you
still cannot connect to the server, please reply to this email to let us
know.
Regards
Philip
IT Services Manager
Linaro
== Ulf Hansson ==
=== Highlights ===
General:
* Last week spent at Linaro Connect Hong Kong. A great week!
Storage:
* Reviewing patches on mmc-list.
* Discussing sent patchset to enable runtime pm support for mmc/sd block device.
* Rework parts of the HS200 and SDR104 support in the mmc protocol
layer. First part for tuning sequence done, patch will be pushed to
mmc-list shortly.
Clk:
* Preparing patchset for upstreaming patches that will add support for
abx500 clocks.
* Preparing patchset to update different driver's clk support used by ux500.
* Resent patches for clk_set_parent fixup and for disable unprepared
clocks at late init.
* Diving into discussion around doing DVFS through the clock API.
Reviewing related patches.
=== Plans ===
Storage:
* Speed up work around the mmc power management blueprint so we can
finalize this work as soon as possible.
* Push patches for mmci host driver, to support UHS cards/HS200, CMD23.
* Push patches for mmci host driver to extend the power management
support. Context save/restore are for example missing.
* Push patches for mmci host driver to add support for new STE 8540 variant.
Clk:
* Optimizations and bug fixes for ux500 clk implementations.
* Send RFC/PATCH for DVFS clock type used by ux500.
* Follow up on previously sent patchset.
=== Issues ===
* None.
On Fri, Mar 15, 2013 at 11:54 AM, Chao Xie <xiechao.mail(a)gmail.com> wrote:
> hi
> It may be an old topic.
> Now the cpufreq governors sample the system workload. The
> scheduler knows about the current workload of each core. So why not
> make use of it? The sampling takes some time, so by the time cpufreq
> increases the frequency, the system has already been busy for a period of
> time. Making use of the scheduler information can reduce the time spent
> on sampling.
Hi Chao,
I work for the Linaro Power Management Working Group and we are aware of
this problem and the proposed solution. We have a dedicated blueprint
towards this goal:
https://blueprints.launchpad.net/linaro-big-little-system/+spec/sched-coope…
--
viresh
Replaces the "platform_get_resource() for IORESOURCE_IRQ" with
platform_get_resource_byname().
Both in exynos4 and exynos5, FIMD IP has 3 interrupts in the order: "fifo",
"vsync", and "lcd_sys".
But the FIMD driver expects the "vsync" interrupt to be listed as the
1st parameter in the FIMD DT node. To meet this expectation of the
driver, the FIMD DT node was forced to keep "vsync" as the
1st parameter.
For example, in exynos4 the FIMD DT node has interrupt numbers
listed as <11, 1> <11, 0> <11, 2>, keeping "vsync" as the 1st parameter.
This patch fixes the above mentioned "hack" of re-ordering of the
FIMD interrupt numbers by getting interrupt resource of FIMD by using
platform_get_resource_byname().
Signed-off-by: Vikas Sajjan <vikas.sajjan(a)linaro.org>
---
drivers/gpu/drm/exynos/exynos_drm_fimd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
index 1ea173a..cd79d38 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
@@ -945,7 +945,7 @@ static int fimd_probe(struct platform_device *pdev)
return -ENXIO;
}
- res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
+ res = platform_get_resource_byname(pdev, IORESOURCE_IRQ, "vsync");
if (!res) {
dev_err(dev, "irq request failed.\n");
return -ENXIO;
--
1.7.9.5
This patch series adds support for DRM FIMD DT for Exynos4 DT Machines,
specifically for Exynos4412 SoC.
changes since v6:
- addressed comments and added interrupt-names = "fifo", "vsync", "lcd_sys"
in exynos4.dtsi and re-ordered the interrupt numbering to match the order in
interrupt combiner IP as suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>.
changes since v5:
- renamed the fimd binding documentation file name as "samsung-fimd.txt",
since it not only talks about exynos display controller but also about
previous samsung display controllers.
- rephrased an ambiguous statement about the interrupt combiner in the
fimd binding documentation as pointed out by
Sachin Kamat <sachin.kamat(a)linaro.org>
changes since v4:
- moved the fimd binding documentation to Documentation/devicetree/bindings/video/
as suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>
- added more fimd compatibility strings in fimd documentation as
discussed at https://patchwork.kernel.org/patch/2144861/ with
Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com> and
Tomasz Figa <tomasz.figa(a)gmail.com>
- modified compatible string for exynos4 fimd as "exynos4210-fimd"
exynos5 fimd as "exynos5250-fimd" to stick to the rule that compatible
value should be named after first specific SoC model in which this
particular IP version was included as discussed at
https://patchwork.kernel.org/patch/2144861/
- documented more about the interrupt combiner and their order as
suggested by Sylwester Nawrocki <sylvester.nawrocki(a)gmail.com>
changes since v3:
- rebased on
http://git.kernel.org/?p=linux/kernel/git/kgene/linux-samsung.git;a=shortlo…
changes since v2:
- added alias to 'fimd@11c00000' node
(reported by: Rahul Sharma <r.sh.open(a)gmail.com>)
- removed 'lcd0_data' node as there was already a similar node lcd_data24
(reported by: Jingoo Han <jg1.han(a)samsung.com>)
- replaced spaces with tabs in display-timing node
changes since v1:
- added new patch to add FIMD DT binding Documentation
- removed patch enabling SAMSUNG_DEV_BACKLIGHT and SAMSUNG_DEV_PMW
for mach-exynos4 DT
- added 'status' property to fimd DT node
Is based on branch "for-next-next"
http://git.kernel.org/?p=linux/kernel/git/kgene/linux-samsung.git;a=shortlo…
Sachin Kamat (1):
ARM: dts: Add lcd pinctrl node entries for EXYNOS4412 SoC
Vikas Sajjan (4):
ARM: dts: Add FIMD node to exynos4
ARM: dts: Add FIMD node and display timing node to
exynos4412-origen.dts
ARM: dts: Add FIMD AUXDATA node entry for exynos4 DT
ARM: dts: Add FIMD DT binding Documentation
.../devicetree/bindings/video/samsung-fimd.txt | 58 ++++++++++++++++++++
arch/arm/boot/dts/exynos4.dtsi | 8 +++
arch/arm/boot/dts/exynos4412-origen.dts | 22 ++++++++
arch/arm/boot/dts/exynos4x12-pinctrl.dtsi | 14 +++++
arch/arm/mach-exynos/mach-exynos4-dt.c | 2 +
5 files changed, 104 insertions(+)
create mode 100644 Documentation/devicetree/bindings/video/samsung-fimd.txt
--
1.7.9.5
Hello
You are receiving this email because you are subscribed to one or more
mailing lists provided by the lists.linaro.org server.
IT Services are announcing planned maintenance for this server scheduled
for *Friday 15th March 2013, starting at 2pm GMT*. The purpose of the work
is to move the service to another server. There will be some disruption
during this maintenance.
In order to ensure that you do not accidentally try to use the service
while it is being moved, the current server will be shut down at 2pm.
A further email will be sent on Friday afternoon to confirm that the
migration of the service is completed. However, due to the way servers are
found, it may take a while before your computer is able to connect to the
relocated service.
After the old server has been shut down, email sent to any of the lists
will be queued, but it is possible that the sending server will still
be trying to deliver the email to the old server rather than the new one
when it is started.
It is therefore *strongly* recommended that you do not send any email to an
@lists.linaro.org email address until you can connect to the new service,
which you will be able to test by trying to use a web browser to connect to
http://lists.linaro.org after you receive the email confirming that the
migration has been completed. Since the old service will be shut down, if
you are able to connect, you can be sure you have connected to the new
service.
If by Monday you are still unable to connect to the service or you are not
able to send email to an @lists.linaro.org email address, please send an
email to its(a)linaro.org.
Thank you.
Regards
Philip
IT Services Manager
Linaro
The only difference between schedule_delayed_work[_on]() and
queue_delayed_work[_on]() is the workqueue on which the work is scheduled. We
may need to modify the delay for works queued with schedule_delayed_work[_on]()
calls, hence these helpers.
The first users of these new helpers are the cpufreq governors, which need to
modify the delay for their works.
Cc: Tejun Heo <tj(a)kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
include/linux/workqueue.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 2b58905..864c2b3 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -412,6 +412,7 @@ extern bool schedule_delayed_work_on(int cpu, struct delayed_work *work,
extern bool schedule_delayed_work(struct delayed_work *work,
unsigned long delay);
extern int schedule_on_each_cpu(work_func_t func);
+
extern int keventd_up(void);
int execute_in_process_context(work_func_t fn, struct execute_work *);
@@ -465,6 +466,11 @@ static inline long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg);
#endif /* CONFIG_SMP */
+#define mod_scheduled_delayed_work_on(cpu, dwork, delay) \
+ mod_delayed_work_on(cpu, system_wq, dwork, delay)
+#define mod_scheduled_delayed_work(dwork, delay) \
+ mod_delayed_work(system_wq, dwork, delay)
+
#ifdef CONFIG_FREEZER
extern void freeze_workqueues_begin(void);
extern bool freeze_workqueues_busy(void);
--
1.7.12.rc2.18.g61b472e