Hi all,
This patch set introduces a buffer synchronization framework based
on DMA BUF[1], using ww-mutexes[2] as its locking mechanism. This
may be the final RFC.
The purpose of this framework is to provide not only buffer access
control between CPU and CPU, CPU and DMA, and DMA and DMA, but also
easy-to-use interfaces for device drivers and user applications.
In addition, this patch set suggests a way to enhance performance.
For a generic user mode interface, we have used the fcntl and select
system calls[3]. As you know, a user application sees a buffer object as
a dma-buf file descriptor. So an fcntl() call on that file descriptor
locks a buffer region managed by the dma-buf object, and a select() call
waits for the completion of CPU or DMA access to the dma-buf without
locking. For more detail, refer to dma-buf-sync.txt in Documentation/.
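As a rough user-space sketch of the interface described above: the
helpers below wrap the standard POSIX fcntl() record-lock call and
select(). The helper names (buf_region_lock, buf_wait_readable) are made
up for illustration, and the assumption that a dma-buf file descriptor
honours these calls as described depends entirely on this RFC being
applied. Which fd set (read or write) maps to which kind of access is
defined in dma-buf-sync.txt; using the read set here is only an
assumption.

```c
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Lock or unlock a byte range of fd and block until the request is granted.
 * type is F_RDLCK, F_WRLCK or F_UNLCK. With this RFC applied, fd would be
 * a dma-buf file descriptor (assumption); on a regular file this is a
 * plain POSIX advisory record lock.
 */
static int buf_region_lock(int fd, short type, off_t start, off_t len)
{
	struct flock fl = {
		.l_type = type,
		.l_whence = SEEK_SET,
		.l_start = start,
		.l_len = len,
	};

	return fcntl(fd, F_SETLKW, &fl);
}

/*
 * Wait, without taking any lock, until fd is reported readable.
 * Returns the number of ready descriptors (1 here) or -1 on error.
 */
static int buf_wait_readable(int fd)
{
	fd_set rfds;

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);

	return select(fd + 1, &rfds, NULL, NULL, NULL);
}
```

A caller would take the lock with buf_region_lock(fd, F_WRLCK, 0, size)
before touching the buffer with the CPU and release it with F_UNLCK
afterwards, or call buf_wait_readable(fd) when it only needs to wait.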
There are some cases where we should use this buffer synchronization
framework. One of them is to enhance GPU rendering performance on the
Tizen platform, both for a 3d app in compositing mode, where the app
draws into an off-screen buffer, and for a Web app.
In case of a 3d app in compositing mode, which is not a full screen mode,
the app calls glFlush to submit 3d commands to the GPU driver instead of
glFinish, for better performance. The reason we call glFlush is that
glFinish blocks the caller's task until the execution of the 3d commands
is completed, which leaves both GPU and CPU idle more often. As a result,
3d rendering performance with glFinish is considerably lower than with
glFlush. However, the use of glFlush has one issue: a buffer shared with
the GPU could be corrupted when the CPU accesses the buffer immediately
after glFlush, because the CPU cannot be aware of the completion of GPU
access to the buffer. Of course, the app can be aware of that time using
eglWaitGL, but this function is valid only within the same process.
In case of Tizen, there are some applications where one process draws
into its own off-screen buffer (pixmap buffer) using the CPU, and another
process gets an off-screen buffer (window buffer) from Xorg using
DRI2GetBuffers, then composites the pixmap buffer with the window buffer
using the GPU, and finally does a page flip.
A Web app based on HTML5 also has the same issue. The web browser and its
web app are different processes. The web app draws into its own pixmap
buffer, then the web browser gets a window buffer from Xorg and composites
the pixmap buffer with the window buffer, and finally does a page flip.
Thus, in such cases, a shared buffer could be corrupted when one process
draws into the pixmap buffer using the CPU while another process
composites the pixmap buffer with the window buffer using the GPU,
without any locking mechanism. That is why we need a user land locking
interface, the fcntl system call.
The last one is a deferred page flip issue: a rendered window buffer can
take about 32ms to be displayed on screen in the worst case, assuming
that the gpu rendering is completed within 16ms.
That can happen when compositing a pixmap buffer with a window buffer
using the GPU when a vsync has just started. At this time, Xorg waits for
a vblank event to get a window buffer, so 3d rendering will be delayed
by up to about 16ms. As a result, the window buffer would be displayed
after about two vsyncs (about 32ms), which in turn would show slow
responsiveness.
For this, we could enhance the responsiveness with a locking
mechanism by skipping one vblank wait. I guess that, for a similar
reason, Android, Chrome OS, and other platforms are using their own
locking mechanisms: Android sync driver, KDS, and DMA fence.
The following shows the deferred page flip issue in the worst case:
|------------ <- vsync signal
|<------ DRI2GetBuffers
|
|
|
|------------ <- vsync signal
|<------ Request gpu rendering
time |
|
|<------ Request page flip (deferred)
|------------ <- vsync signal
|<------ Displayed on screen
|
|
|
|------------ <- vsync signal
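The arithmetic behind the timeline above can be checked with a small
model. This is only a sketch: the 16000us vsync period and the function
names are assumptions for illustration, and real vblank handling in Xorg
and the drivers is more involved.

```c
/* Time of the first vsync strictly after t_us (t_us must be >= 0). */
static long next_vsync_us(long t_us, long period_us)
{
	return (t_us / period_us + 1) * period_us;
}

/*
 * Time at which a frame requested at request_us reaches the screen.
 * If wait_for_vblank is set, rendering only starts at the next vsync
 * (the deferred case); otherwise it starts immediately, which models
 * skipping one vblank wait. The finished frame is shown at the first
 * vsync after rendering completes.
 */
static long display_time_us(long request_us, long render_us,
			    long period_us, int wait_for_vblank)
{
	long start = wait_for_vblank ?
		     next_vsync_us(request_us, period_us) : request_us;

	return next_vsync_us(start + render_us, period_us);
}
```

With a 16000us period, a request at 1000us and 14000us of rendering,
display_time_us() gives 32000us when the vblank wait is taken and
16000us when it is skipped, i.e. two vsyncs versus one.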
Thanks,
Inki Dae
References:
[1] http://lwn.net/Articles/470339/
[2] https://patchwork.kernel.org/patch/2625361/
[3] http://linux.die.net/man/2/fcntl
Inki Dae (2):
[RFC PATCH v6] dmabuf-sync: Add a buffer synchronization framework
[RFC PATCH v2] dma-buf: Add user interfaces for dmabuf sync support.
Documentation/dma-buf-sync.txt | 285 +++++++++++++++++
drivers/base/Kconfig | 7 +
drivers/base/Makefile | 1 +
drivers/base/dma-buf.c | 85 +++++
drivers/base/dmabuf-sync.c | 678 ++++++++++++++++++++++++++++++++++++++++
include/linux/dma-buf.h | 16 +
include/linux/dmabuf-sync.h | 191 +++++++++++
7 files changed, 1263 insertions(+), 0 deletions(-)
create mode 100644 Documentation/dma-buf-sync.txt
create mode 100644 drivers/base/dmabuf-sync.c
create mode 100644 include/linux/dmabuf-sync.h
--
1.7.5.4
The comment for __acpi_map_table() is obviously wrong. In fact, after
commit 1c14fa49 (x86: use early_ioremap in __acpi_map_table), the
comment became stale, so remove it and add a new one.
Signed-off-by: Hanjun Guo <hanjun.guo(a)linaro.org>
---
arch/x86/kernel/acpi/boot.c | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 2627a81..665f857 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -141,16 +141,8 @@ static u32 irq_to_gsi(int irq)
}
/*
- * Temporarily use the virtual area starting from FIX_IO_APIC_BASE_END,
- * to map the target physical address. The problem is that set_fixmap()
- * provides a single page, and it is possible that the page is not
- * sufficient.
- * By using this area, we can map up to MAX_IO_APICS pages temporarily,
- * i.e. until the next __va_range() call.
- *
- * Important Safety Note: The fixed I/O APIC page numbers are *subtracted*
- * from the fixed base. That's why we start at FIX_IO_APIC_BASE_END and
- * count idx down while incrementing the phys address.
+ * __acpi_map_table() will be called before the memory allocation
+ * is ready, so call early_ioremap() to map tables.
*/
char *__init __acpi_map_table(unsigned long phys, unsigned long size)
{
--
1.7.9.5
Hi Rafael,
You recently did this:
commit 878f6e074e9a7784a6e351512eace4ccb3542eef
Author: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Date: Sun Aug 18 15:35:59 2013 +0200
Revert "cpufreq: Use cpufreq_policy_list for iterating over policies"
Revert commit eb60852 (cpufreq: Use cpufreq_policy_list for iterating
over policies), because it breaks system suspend/resume on multiple
machines.
It either causes resume to block indefinitely or causes the BUG_ON()
in lock_policy_rwsem_##mode() to trigger on sysfs accesses to cpufreq
attributes.
------x------------x---------------
This patchset brings the reverted patch back, along with a few supporting
patches. The cause of the initial problem you observed was this:
- At suspend all CPUs are removed, leaving only the boot cpu. At this time
policies aren't freed and also aren't removed from cpufreq_policy_list,
and the per-cpu variable cpufreq_cpu_data is set to NULL.
- At resume, CPUs other than the boot cpu call __cpufreq_add_dev(). The
tricky change introduced by my patch was that we iterate over the list of
policies instead of over CPUs. Previously we got the policy structure
associated with a CPU through the per-cpu variable, which was NULL for the
first CPU of a policy that came up. For that first cpu we don't want to
call cpufreq_add_policy_cpu() but want __cpufreq_add_dev() to continue.
When we called cpufreq_add_policy_cpu() it tried to stop the governor
(which was already stopped), and the resulting errors led to an unstable
state.
This patchset fixes these issues and is tested with suspend-resume on my
thinkpad running ubuntu. Apart from minor cleanups, it removes the policy
from cpufreq_policy_list in case of suspend/resume as well, and hence we
will never call cpufreq_add_policy_cpu() for the first cpu of a policy.
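
The difference between the two lookups can be illustrated with a small
user-space model. This is not kernel code: every name below (the model_*
types, find_by_percpu, find_by_list) is hypothetical, and the model only
assumes four CPUs in two policies with CPU 0 as the boot cpu.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS		4
#define MODEL_NR_POLICIES	2

struct model_policy {
	unsigned int related_cpus;	/* bitmask of CPUs sharing the policy */
	bool on_list;			/* still linked on the policy list? */
};

static struct model_policy model_policies[MODEL_NR_POLICIES];
static struct model_policy *model_cpu_data[MODEL_NR_CPUS];

/* Policy 0 covers CPUs 0-1, policy 1 covers CPUs 2-3. */
static void model_init(void)
{
	model_policies[0] = (struct model_policy){ .related_cpus = 0x3, .on_list = true };
	model_policies[1] = (struct model_policy){ .related_cpus = 0xc, .on_list = true };
	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_cpu_data[cpu] = &model_policies[cpu / 2];
}

/*
 * Suspend: every CPU except boot CPU 0 goes away and its per-cpu pointer
 * is cleared. If unlink is set, policies that do not contain the boot CPU
 * are also taken off the list, which models the fix in this series.
 */
static void model_suspend(bool unlink)
{
	for (unsigned int cpu = 1; cpu < MODEL_NR_CPUS; cpu++)
		model_cpu_data[cpu] = NULL;
	for (unsigned int i = 0; unlink && i < MODEL_NR_POLICIES; i++)
		if (!(model_policies[i].related_cpus & 1u))
			model_policies[i].on_list = false;
}

/* Old lookup: go through the per-cpu pointers of the other CPUs. */
static struct model_policy *find_by_percpu(unsigned int cpu)
{
	for (unsigned int sib = 0; sib < MODEL_NR_CPUS; sib++) {
		struct model_policy *p = model_cpu_data[sib];

		if (p && (p->related_cpus & (1u << cpu)))
			return p;
	}
	return NULL;
}

/* New lookup: walk the policy list. */
static struct model_policy *find_by_list(unsigned int cpu)
{
	for (unsigned int i = 0; i < MODEL_NR_POLICIES; i++) {
		struct model_policy *p = &model_policies[i];

		if (p->on_list && (p->related_cpus & (1u << cpu)))
			return p;
	}
	return NULL;
}
```

After model_suspend(false), find_by_percpu(2) returns NULL while
find_by_list(2) still finds the stale policy, which corresponds to the
wrong cpufreq_add_policy_cpu() call. After model_suspend(true) both
lookups return NULL for CPU 2, so the full add path would continue.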
--
viresh
Viresh Kumar (5):
cpufreq: align closing brace '}' of an if block
cpufreq: remove policy from cpufreq_policy_list in system suspend
cpufreq: remove unnecessary check in __cpufreq_governor()
cpufreq: remove cpufreq_policy_cpu per-cpu variable
cpufreq: Use cpufreq_policy_list for iterating over policies
drivers/cpufreq/cpufreq.c | 77 +++++++++++++++--------------------------------
1 file changed, 24 insertions(+), 53 deletions(-)
--
1.7.12.rc2.18.g61b472e
Hi Mark
I have some big.LITTLE MP updates for LSK; these are based on your topic
v3.10/topic/big.LITTLE. Can you please pull these for this month's LSK
release?
Thanks
Tixy
The following changes since commit a0e1bccdf9f1c4d68ea024e4254dbbd1eff96a4a:
Merge branches 'master-arm-multi_pmu_v2', 'master-config-fragments', 'master-hw-bkpt-fix', 'master-misc-patches' and 'master-task-placement-v2-updates' into big-LITTLE-MP-master-v19 (2013-07-18 11:49:27 +0100)
are available in the git repository at:
git://git.linaro.org/arm/big.LITTLE/mp.git master
for you to fetch changes up to 18e3c3d2cc1346cb7cc2e3fd777b2c6f4fbb6135:
HMP: Update migration timer when we fork-migrate (2013-08-19 15:41:37 +0100)
----------------------------------------------------------------
Chris Redpath (8):
HMP: select 'best' task for migration rather than 'current'
sched: track per-rq 'last migration time'
HMP: Modify the runqueue stats to add a new child stat
HMP: Explicitly implement all-load-is-max-load policy for HMP targets
sched: HMP change nr_running offload metric
HMP: Implement idle pull for HMP
HMP: Access runqueue task clocks directly.
HMP: Update migration timer when we fork-migrate
Morten Rasmussen (1):
sched: HMP fix traversing the rb-tree from the curr pointer
kernel/sched/fair.c | 373 +++++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 330 insertions(+), 43 deletions(-)
Hi Rafael,
This is V2 of the patch series which was posted here earlier:
http://www.gossamer-threads.com/lists/linux/kernel/1759514
This patchset tries to fix & clean up many existing cpufreq core issues.
The first four patches try to clean up basic problems in the cpufreq core.
The first patch was earlier sent separately but is now part of this series.
The fifth patch was also sent earlier as a reply to your patches and was
reviewed by Srivatsa. The sixth patch was picked from Lukasz's patchset
introducing a software "boost" feature in the core. It will be used by
this patchset.
Patches 7-10 are the most significant part of this set. They try to make
many things simple and robust.
The last patch in the series is new and wasn't part of V1.
This is rebased on your bleeding-edge branch + two patches from you:
18a6b03 cpufreq: Avoid double kobject_put() for the same kobject in error code path
d0cde63 cpufreq: Do not hold driver module references for additional policy CPUs
abe513f Merge branch 'acpi-sleep-next' into linux-next
They are also pushed to my cpufreq-next branch.
They are tested fairly well on an ARM Vexpress TC2 board (big LITTLE).
Lukasz Majewski (1):
cpufreq: Store cpufreq policies in a list
Viresh Kumar (10):
cpufreq: Cleanup header files included in core
cpufreq: Re-arrange declarations in cpufreq.h
cpufreq: Give consistent names for struct cpufreq_policy *
cpufreq: Use sizeof(*ptr) form for finding size of a struct
cpufreq: Pass policy to cpufreq_add_policy_cpu()
cpufreq: Use cpufreq_policy_list for iterating over policies
cpufreq: Fix broken usage of governor->owner's refcount
cpufreq: Don't use cpufreq_driver->owner's refcount to protect
critical sections
cpufreq: Remove struct cpufreq_driver's owner field
cpufreq: improve error checking on return values of
__cpufreq_governor()
Documentation/cpu-freq/cpu-drivers.txt | 2 -
drivers/cpufreq/acpi-cpufreq.c | 5 +-
drivers/cpufreq/at32ap-cpufreq.c | 1 -
drivers/cpufreq/blackfin-cpufreq.c | 1 -
drivers/cpufreq/cpufreq-nforce2.c | 1 -
drivers/cpufreq/cpufreq.c | 426 ++++++++++++++++-----------------
drivers/cpufreq/cpufreq_conservative.c | 14 +-
drivers/cpufreq/cpufreq_governor.c | 6 -
drivers/cpufreq/cpufreq_governor.h | 7 +-
drivers/cpufreq/cpufreq_ondemand.c | 24 +-
drivers/cpufreq/cpufreq_performance.c | 3 +-
drivers/cpufreq/cpufreq_powersave.c | 3 +-
drivers/cpufreq/cpufreq_stats.c | 23 +-
drivers/cpufreq/cris-artpec3-cpufreq.c | 1 -
drivers/cpufreq/cris-etraxfs-cpufreq.c | 1 -
drivers/cpufreq/e_powersaver.c | 5 +-
drivers/cpufreq/elanfreq.c | 1 -
drivers/cpufreq/exynos-cpufreq.c | 2 +-
drivers/cpufreq/freq_table.c | 4 +-
drivers/cpufreq/gx-suspmod.c | 3 +-
drivers/cpufreq/ia64-acpi-cpufreq.c | 5 +-
drivers/cpufreq/intel_pstate.c | 1 -
drivers/cpufreq/kirkwood-cpufreq.c | 1 -
drivers/cpufreq/longhaul.c | 1 -
drivers/cpufreq/longrun.c | 1 -
drivers/cpufreq/loongson2_cpufreq.c | 1 -
drivers/cpufreq/maple-cpufreq.c | 1 -
drivers/cpufreq/p4-clockmod.c | 1 -
drivers/cpufreq/pasemi-cpufreq.c | 1 -
drivers/cpufreq/pcc-cpufreq.c | 1 -
drivers/cpufreq/pmac32-cpufreq.c | 1 -
drivers/cpufreq/pmac64-cpufreq.c | 6 +-
drivers/cpufreq/powernow-k6.c | 1 -
drivers/cpufreq/powernow-k7.c | 14 +-
drivers/cpufreq/powernow-k8.c | 7 +-
drivers/cpufreq/ppc-corenet-cpufreq.c | 1 -
drivers/cpufreq/ppc_cbe_cpufreq.c | 1 -
drivers/cpufreq/s3c2416-cpufreq.c | 1 -
drivers/cpufreq/s3c24xx-cpufreq.c | 6 +-
drivers/cpufreq/s3c64xx-cpufreq.c | 1 -
drivers/cpufreq/sc520_freq.c | 1 -
drivers/cpufreq/sh-cpufreq.c | 1 -
drivers/cpufreq/sparc-us2e-cpufreq.c | 6 +-
drivers/cpufreq/sparc-us3-cpufreq.c | 6 +-
drivers/cpufreq/speedstep-centrino.c | 1 -
drivers/cpufreq/speedstep-ich.c | 1 -
drivers/cpufreq/speedstep-smi.c | 1 -
include/linux/cpufreq.h | 386 ++++++++++++++---------------
48 files changed, 442 insertions(+), 547 deletions(-)
--
1.7.12.rc2.18.g61b472e
This is the second version of the patch that fixes rt_sig* ltp failures
on a big endian V7 kernel. It makes the sigreturn_codes snippets
endian neutral. In this version of the patch the problem is fixed by
using a separate .S file with the snippets written with regular asm
mnemonics. With this change the compiler/linker takes care of all needed
byteswaps in BE8 mode.
This approach was suggested on the following thread:
http://lists.infradead.org/pipermail/linux-arm-kernel/2013-August/191543.ht…
Changes were tested on V7 in both BE and LE modes
Changes from v1:
Use separate .S file rather than <asm/opcodes.h> instruction
byteswapping macros
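To illustrate why the snippets need byteswaps in BE8 mode when they are
stored as data words: in BE8, data accesses are big endian while
instruction fetches remain little endian. The sketch below models that
in plain C; the helper names are hypothetical and the example opcode is
the ARM encoding of "mov r0, r0" (0xe1a00000), chosen only for
illustration and not taken from the patch.

```c
#include <stdint.h>

/* Store a 32-bit word the way a big-endian data store lays it out. */
static void store_be32(uint8_t out[4], uint32_t v)
{
	out[0] = (uint8_t)(v >> 24);
	out[1] = (uint8_t)(v >> 16);
	out[2] = (uint8_t)(v >> 8);
	out[3] = (uint8_t)v;
}

/* Read four bytes the way a little-endian instruction fetch sees them. */
static uint32_t fetch_le32(const uint8_t in[4])
{
	return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
	       ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* Reverse the byte order of a 32-bit word. */
static uint32_t swab32_model(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}
```

Storing 0xe1a00000 with store_be32() and reading it back with
fetch_le32() yields 0x0000a0e1, which is not the intended instruction;
storing swab32_model(0xe1a00000) instead yields 0xe1a00000. Writing the
snippets as assembler mnemonics in a .S file leaves this to the
toolchain.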
Victor Kamensky (1):
ARM: signal: sigreturn_codes should be endian neutral to work in BE8
arch/arm/kernel/Makefile | 3 +-
arch/arm/kernel/signal.c | 29 +++---------------
arch/arm/kernel/sigreturn_codes.S | 64 +++++++++++++++++++++++++++++++++++++++
3 files changed, 70 insertions(+), 26 deletions(-)
create mode 100644 arch/arm/kernel/sigreturn_codes.S
--
1.8.1.4