Hi Linus,
I would like to ask you to pull one more patch for the ARM dma-mapping
subsystem into the Linux v3.6 kernel tree. This patch fixes a very subtle bug
(a typical off-by-one error) which might appear in very rare
circumstances.
The following changes since commit 55d512e245bc7699a8800e23df1a24195dd08217:
Linux 3.6-rc5 (2012-09-08 16:43:45 -0700)
are available in the git repository at:
git://git.linaro.org/people/mszyprowski/linux-dma-mapping.git fixes-for-3.6
for you to fetch changes up to f3d87524975f01b885fc3d009c6ab6afd0d00746:
arm: mm: fix DMA pool affiliation check (2012-09-10 16:15:48 +0200)
Thanks!
Best regards
Marek Szyprowski
Samsung Poland R&D Center
Patch summary:
----------------------------------------------------------------
Thomas Petazzoni (1):
arm: mm: fix DMA pool affiliation check
arch/arm/mm/dma-mapping.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
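The pull request does not quote the diff, so the following is only a userspace sketch of the kind of off-by-one that a pool affiliation check can contain. The struct and both function names are hypothetical and are not taken from arch/arm/mm/dma-mapping.c. The pool is assumed to cover the half-open range [base, base + size).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pool descriptor: covers [base, base + size). */
struct pool {
    uintptr_t base;
    size_t size;
};

/*
 * Off-by-one variant: the upper bound is treated as inclusive and the
 * length is ignored, so an address one byte past the pool is accepted.
 */
static bool in_pool_buggy(const struct pool *p, uintptr_t start, size_t len)
{
    (void)len;
    return !(start < p->base || start > p->base + p->size);
}

/*
 * Half-open check: the whole range [start, start + len) must lie inside
 * the pool. Written with offsets so that start + len cannot overflow.
 */
static bool in_pool_fixed(const struct pool *p, uintptr_t start, size_t len)
{
    if (start < p->base)
        return false;
    uintptr_t off = start - p->base;
    return len > 0 && off < p->size && len <= p->size - off;
}
```

With a pool at 0x1000 of size 0x1000, the address 0x2000 is the first byte after the pool: the first variant accepts it, the second rejects it. That boundary case is what makes this class of bug rare in practice and easy to miss.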
On Wed, Sep 5, 2012 at 5:08 AM, Tomi Valkeinen <tomi.valkeinen(a)ti.com> wrote:
> Hi,
>
> OMAP has a custom video ram allocator, which I'd like to remove and use
> the standard dma allocation functions.
>
> There are two problems for which I'd like to hear suggestions or
> comments:
>
> First one is that the dma_alloc_* functions map the allocated memory for
> cpu use. In many cases with OMAP DSS (display subsystem) this is not
> needed: the memory may be written only by the SGX or the DSP, and it's
> only read by the DSS, so it's never touched by the CPU.
see dma_alloc_attrs() and DMA_ATTR_NO_KERNEL_MAPPING
> This is even more true when using VRFB on omap3 (and probably TILER on
> omap4) for rotation, as VRFB hides the actual memory and offers rotated
> views. In this case the backend memory is never accessed by anyone else
> than VRFB.
just fwiw, we don't actually need contiguous memory on o4/tiler :-)
(well, at least if you ignore things like secure playback)
> Is there a way to allocate the memory without creating a mapping? While
> it won't break anything as such, the allocated areas can be quite large
> thus causing large areas of the kernel's memory space to be needlessly
> reserved.
>
> The second case is passing a framebuffer address from the bootloader to
> the kernel. Often with mobile devices the bootloader will initialize the
> display hardware, showing a company logo or such. To keep the image on
> the screen when kernel starts we need to reserve the same physical
> memory area early at boot, and use that for the framebuffer.
with a bit of handwaving, this is possible. You can pass a base
address to dma_declare_contiguous() when you setup your device's CMA
pool. Although that doesn't really guarantee your allocation from
that pool is at offset zero, I suppose.
> I'm not sure if there's any actual problem with this one, presuming
> there is a solution for the first case. Somehow the memory is reserved
> at early boot time, and this is passed to the fb driver. But can the
> memory be managed the same way as in normal case (for example freeing
> it), or does it need to be handled as a special case?
special-casing it might be better.. although possibly a dma attr could
be added for this to tell dma_alloc_from_contiguous() that we need a
particular address within the CMA pool. It seems a bit like a hack,
but OTOH I guess pretty much every consumer device would need a hack
like this.
BR,
-R
> Tomi
>
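The idea floated above, asking the allocator for a particular address inside a reserved pool so that a bootloader framebuffer can be kept and later freed like any other buffer, can be modelled in userspace. This is only a sketch under stated assumptions: the pool is a fixed array of per-page "used" flags, and `struct page_pool`, `pool_alloc_at()` and `pool_free_at()` are hypothetical names, not CMA interfaces.

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_PAGES 64

/* Hypothetical model of a contiguous pool: one "used" flag per page. */
struct page_pool {
    unsigned long base_pfn;      /* first page frame covered by the pool */
    bool used[POOL_PAGES];
};

/* Translate an absolute range into a pool index; false if out of bounds. */
static bool pool_range(const struct page_pool *pool, unsigned long pfn,
                       size_t count, size_t *first)
{
    if (count == 0 || pfn < pool->base_pfn)
        return false;
    unsigned long off = pfn - pool->base_pfn;
    if (off >= POOL_PAGES || count > POOL_PAGES - off)
        return false;
    *first = off;
    return true;
}

/* Claim `count` pages starting exactly at `pfn`. Returns 0 or -1. */
static int pool_alloc_at(struct page_pool *pool, unsigned long pfn, size_t count)
{
    size_t first;

    if (!pool_range(pool, pfn, count, &first))
        return -1;
    for (size_t i = 0; i < count; i++)
        if (pool->used[first + i])
            return -1;           /* part of the range is already taken */
    for (size_t i = 0; i < count; i++)
        pool->used[first + i] = true;
    return 0;
}

/* Release a range previously claimed with pool_alloc_at(). */
static void pool_free_at(struct page_pool *pool, unsigned long pfn, size_t count)
{
    size_t first;

    if (!pool_range(pool, pfn, count, &first))
        return;
    for (size_t i = 0; i < count; i++)
        pool->used[first + i] = false;
}
```

In this model the early-boot reservation is just the first `pool_alloc_at()` call, and freeing it needs no special case. Whether the real allocator can offer the same property is exactly the open question in this thread.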
v2->v3
Split out the oom killer patch only.
Based on Nishanth's patch, which changes ion_debug_heap_total to use the id.
1. add heap_found
2. Solve the issue of several ids sharing one type.
Use ion_debug_heap_total(client, heap->id) instead of ion_debug_heap_total(client, heap->type)
since id is unique while type can be shared.
Fortunately Nishanth has updated one patch, so rebase on that patch
v1->v2
Sync to Aug 30 common.git
v0->v1:
1. move ion_shrink out of mutex, suggested by Nishanth
2. check error flag of ERR_PTR(-ENOMEM)
3. add msleep to allow schedule out.
Based on common.git, android-3.4 branch
Add oom killer.
Once the heap is used up,
SIGKILL is sent to all tasks referring to the buffer, in descending oom_score_adj order
Nishanth Peethambaran (1):
gpu: ion: Update debugfs to show for each id
Zhangfei Gao (1):
gpu: ion: oom killer
drivers/gpu/ion/ion.c | 131 +++++++++++++++++++++++++++++++++++++++++++++----
1 files changed, 121 insertions(+), 10 deletions(-)
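The cover letter describes the victim ordering only in words. A minimal userspace sketch of "descending oom_score_adj" ordering is shown below. `struct client` and `order_victims()` are hypothetical and do not come from drivers/gpu/ion/ion.c; the sketch covers only the ordering step, not signal delivery.

```c
#include <stdlib.h>

/* Hypothetical record for a task that holds a reference to a buffer. */
struct client {
    int pid;
    int oom_score_adj;   /* higher value means killed earlier */
};

/* qsort comparator: larger oom_score_adj sorts first. */
static int by_adj_desc(const void *a, const void *b)
{
    const struct client *ca = a;
    const struct client *cb = b;

    return (cb->oom_score_adj > ca->oom_score_adj) -
           (cb->oom_score_adj < ca->oom_score_adj);
}

/* Reorder the array so that the first entry is the first candidate. */
static void order_victims(struct client *clients, size_t n)
{
    qsort(clients, n, sizeof(*clients), by_adj_desc);
}
```

The comparator uses two comparisons instead of a subtraction so that it stays correct for any pair of int values.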
v1->v2
Sync to Aug 30 common.git
v0->v1:
1. Change gen_pool_create(12, -1) to gen_pool_create(PAGE_SHIFT, -1), suggested by Haojian
2. move ion_shrink out of mutex, suggested by Nishanth
3. check error flag of ERR_PTR(-ENOMEM)
4. add msleep to allow schedule out.
Based on common.git, android-3.4 branch
Patch 2:
Add support for page-wise cache flush for carveout_heap
There is only one nents entry for the carveout heap, and likewise only one dirty bit.
As a result, a cache flush only takes effect for the whole carveout heap.
Patch 3:
Add oom killer.
Once the heap is used up,
SIGKILL is sent to all tasks referring to the buffer, in descending oom_score_adj order
Zhangfei Gao (3):
gpu: ion: update carveout_heap_ops
gpu: ion: carveout_heap page wised cache flush
gpu: ion: oom killer
drivers/gpu/ion/ion.c | 118 +++++++++++++++++++++++++++++++++-
drivers/gpu/ion/ion_carveout_heap.c | 25 ++++++--
2 files changed, 133 insertions(+), 10 deletions(-)
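The first argument of gen_pool_create() is the log2 of the minimum allocation size, so a hard-coded 12 silently assumes 4 KiB pages. The userspace sketch below shows what that order means for other page sizes. Both helper names are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/*
 * Return log2(page_size), or -1 if page_size is not a power of two.
 * This is the value PAGE_SHIFT stands for on a given configuration.
 */
static int shift_for_page_size(size_t page_size)
{
    int shift = 0;

    if (page_size == 0 || (page_size & (page_size - 1)) != 0)
        return -1;
    while (((size_t)1 << shift) < page_size)
        shift++;
    return shift;
}

/* Round a request up to the allocation granule implied by `shift`. */
static size_t round_to_granule(size_t len, int shift)
{
    size_t granule = (size_t)1 << shift;

    return (len + granule - 1) & ~(granule - 1);
}
```

For a 5000-byte request, an order of 12 rounds up to 8192 bytes, while an order of 16 (64 KiB pages) rounds up to 65536 bytes. Passing PAGE_SHIFT keeps the pool granule tied to the configured page size.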
So I've been experimenting with support for Dave Airlie's new RandR 1.4 provider
object interface, so that Optimus-based laptops can use our driver to drive the
discrete GPU and display on the integrated GPU. The good news is that I've got
a proof of concept working.
During a review of the current code, we came up with a few concerns:
1. The output source is responsible for allocating the shared memory
Right now, the X server calls CreatePixmap on the output source screen and then
expects the output sink screen to be able to display from whatever memory the
source allocates. Right now, the source has no mechanism for asking the sink
what its requirements are for the surface. I'm using our own internal pitch
alignment requirements and that seems to be good enough for the Intel device to
scan out, but that could be pure luck.
Does it make sense to add a mechanism for drivers to negotiate this with each
other, or is it sufficient to just define a lowest common denominator format and
if your hardware can't deal with that format, you just don't get to share
buffers?
One of my coworkers brought to my attention the fact that Tegra requires a
specific pitch alignment, and cannot accommodate larger pitches. If other SoC
designs have similar restrictions, we might need to add a handshake mechanism.
2. There's no fallback mechanism if sharing can't be negotiated
If RandR fails to share a pixmap with the output sink screen, the whole modeset
fails. This means you'll end up not seeing anything on the screen and you'll
probably think your computer locked up. Should there be some sort of software
copy fallback to ensure that something at least shows up on the display?
3. How should the memory be allocated?
In the prototype I threw together, I'm allocating the shared memory using
shm_open and then exporting that as a dma-buf file descriptor using an ioctl I
added to the kernel, and then importing that memory back into our driver through
dma_buf_attach & dma_buf_map_attachment. Does it make sense for user-space
programs to be able to export shmfs files like that? Should that interface go
in DRM / GEM / PRIME instead? Something else? I'm pretty unfamiliar with this
kernel code so any suggestions would be appreciated.
-- Aaron
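One possible shape for the pitch handshake raised in concern 1 is sketched below. It assumes each side advertises a pitch alignment in bytes and the sink additionally advertises a maximum pitch. Those parameters, and the function names, are assumptions made for illustration and are not part of RandR 1.4 or any driver.

```c
#include <stdint.h>

static uint32_t gcd_u32(uint32_t a, uint32_t b)
{
    while (b != 0) {
        uint32_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/*
 * Pick the smallest pitch (bytes per scanline) that holds `width` pixels
 * and is a multiple of both alignments. Returns 0 if no such pitch fits
 * under the sink's maximum, meaning the buffer cannot be shared.
 */
static uint32_t negotiate_pitch(uint32_t width, uint32_t bytes_per_pixel,
                                uint32_t src_align, uint32_t sink_align,
                                uint32_t sink_max_pitch)
{
    if (src_align == 0 || sink_align == 0)
        return 0;

    /* Least common multiple of the two alignments, in 64 bits. */
    uint64_t align = (uint64_t)src_align / gcd_u32(src_align, sink_align)
                     * sink_align;
    uint64_t min_pitch = (uint64_t)width * bytes_per_pixel;
    uint64_t pitch = (min_pitch + align - 1) / align * align;

    if (pitch == 0 || pitch > sink_max_pitch)
        return 0;
    return (uint32_t)pitch;
}
```

A return value of 0 corresponds to the case in concern 2, where sharing cannot be negotiated and some fallback would be needed. A sink that accepts exactly one pitch can be expressed by setting its maximum equal to that pitch.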
P.S. for those unfamiliar with PRIME:
Dave Airlie added new support to the X Resize and Rotate extension version 1.4
to support offloading display and rendering to different drivers. PRIME is the
DRM implementation in the kernel, layered on top of DMA-BUF, that implements the
actual sharing of buffers between drivers.
http://cgit.freedesktop.org/xorg/proto/randrproto/tree/randrproto.txt?id=ra…
http://airlied.livejournal.com/75555.html - update on hotplug server
http://airlied.livejournal.com/76078.html - randr 1.5 demo videos
Based on common.git, android-3.4 branch
Patch 2:
Add support for page-wise cache flush for carveout_heap
There is only one nents entry for the carveout heap, and likewise only one dirty bit.
As a result, a cache flush only takes effect for the whole carveout heap.
Patch 3:
Add oom killer.
Once the heap is used up,
SIGKILL is sent to all tasks referring to the buffer, in descending oom_score_adj order
Zhangfei Gao (3):
gpu: ion: update carveout_heap_ops
gpu: ion: carveout_heap page wised cache flush
gpu: ion: oom killer
drivers/gpu/ion/ion.c | 112 ++++++++++++++++++++++++++++++++++-
drivers/gpu/ion/ion_carveout_heap.c | 23 ++++++--
2 files changed, 127 insertions(+), 8 deletions(-)
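The difference between one dirty bit for the whole carveout buffer and page-wise tracking can be illustrated with a small userspace model. This is a sketch only: `struct buf_model` and both helpers are hypothetical, and the "flush" merely counts pages instead of performing cache maintenance.

```c
#include <stdbool.h>
#include <stddef.h>

#define BUF_PAGES 8

/* Hypothetical buffer with one dirty flag per page. */
struct buf_model {
    bool dirty[BUF_PAGES];
};

/* Mark pages [first, first + count) dirty, clamped to the buffer. */
static void mark_dirty(struct buf_model *b, size_t first, size_t count)
{
    for (size_t i = first; i < BUF_PAGES && count > 0; i++, count--)
        b->dirty[i] = true;
}

/*
 * "Flush" only the dirty pages and clear their flags.
 * Returns how many pages would have needed cache maintenance.
 */
static size_t flush_dirty(struct buf_model *b)
{
    size_t flushed = 0;

    for (size_t i = 0; i < BUF_PAGES; i++) {
        if (b->dirty[i]) {
            b->dirty[i] = false;
            flushed++;
        }
    }
    return flushed;
}
```

With a single dirty bit, touching one page would force maintenance on all BUF_PAGES pages. With per-page flags the cost follows the number of pages actually written.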
v0->v1:
1. Change gen_pool_create(12, -1) to gen_pool_create(PAGE_SHIFT, -1), suggested by Haojian
2. move ion_shrink out of mutex, suggested by Nishanth
3. check error flag of ERR_PTR(-ENOMEM)
4. add msleep to allow schedule out.
Based on common.git, android-3.4 branch
Patch 2:
Add support for page-wise cache flush for carveout_heap
There is only one nents entry for the carveout heap, and likewise only one dirty bit.
As a result, a cache flush only takes effect for the whole carveout heap.
Patch 3:
Add oom killer.
Once the heap is used up,
SIGKILL is sent to all tasks referring to the buffer, in descending oom_score_adj order
Zhangfei Gao (3):
gpu: ion: update carveout_heap_ops
gpu: ion: carveout_heap page wised cache flush
gpu: ion: oom killer
drivers/gpu/ion/ion.c | 118 +++++++++++++++++++++++++++++++++-
drivers/gpu/ion/ion_carveout_heap.c | 25 ++++++--
2 files changed, 133 insertions(+), 10 deletions(-)
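The changelog item about checking for ERR_PTR(-ENOMEM) refers to the kernel convention of encoding a negative errno in a pointer value. A userspace re-creation of that convention is sketched below. The lowercase helpers mirror the idea of ERR_PTR, PTR_ERR and IS_ERR but are written here for illustration, and `buffer_alloc()` with its `heap_free` parameter is hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value as a pointer in the top 4095 addresses. */
static inline void *err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* True if the pointer falls in the range reserved for error codes. */
static inline int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical allocator: reports an exhausted heap as an error pointer
 * instead of NULL, so the caller can tell which error occurred.
 */
static void *buffer_alloc(size_t len, size_t heap_free)
{
    void *p;

    if (len > heap_free)
        return err_ptr(-ENOMEM);
    p = malloc(len);
    return p ? p : err_ptr(-ENOMEM);
}
```

A caller that wants to react specifically to an exhausted heap would test `is_err(p) && ptr_err(p) == -ENOMEM` before deciding to reclaim memory and retry. Note that NULL is not an error pointer under this scheme.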
Hi Linus,
I would like to ask you to pull another set of fixes for the ARM dma-mapping
subsystem. Commit e9da6e9905e6 replaced the custom consistent buffer
remapping code with generic vmalloc areas. However, it introduced some
regressions caused by limited support for allocations in atomic context.
This series contains fixes for those regressions. For some subplatforms
the default, pre-allocated pool for atomic allocations turned out to be
too small, so a function for setting its size has been added. Another
set of patches adds support for atomic allocations to the IOMMU-aware
DMA-mapping implementation. The last part of this pull request contains
two fixes for the Contiguous Memory Allocator, which relax requirements
that were too strict.
The following changes since commit fea7a08acb13524b47711625eebea40a0ede69a0:
Linux 3.6-rc3 (2012-08-22 13:29:06 -0700)
are available in the git repository at:
fixes-for-3.6
for you to fetch changes up to 479ed93a4b98eef03fd8260f7ddc00019221c450:
ARM: dma-mapping: IOMMU allocates pages from atomic_pool with GFP_ATOMIC (2012-08-28 21:01:07 +0200)
Thanks!
Best regards
Marek Szyprowski
Samsung Poland R&D Center
----------------------------------------------------------------
Patch summary:
Hiroshi Doyu (4):
ARM: dma-mapping: atomic_pool with struct page **pages
ARM: dma-mapping: Refactor out to introduce __in_atomic_pool
ARM: dma-mapping: Introduce __atomic_get_pages() for __iommu_get_pages()
ARM: dma-mapping: IOMMU allocates pages from atomic_pool with GFP_ATOMIC
Marek Szyprowski (5):
mm: cma: fix alignment requirements for contiguous regions
ARM: relax conditions required for enabling Contiguous Memory Allocator
ARM: DMA-Mapping: add function for setting coherent pool size from platform code
ARM: DMA-Mapping: print warning when atomic coherent allocation fails
ARM: Kirkwood: increase atomic coherent pool size
arch/arm/Kconfig | 2 +-
arch/arm/include/asm/dma-mapping.h | 7 ++
arch/arm/mach-kirkwood/common.c | 7 ++
arch/arm/mm/dma-mapping.c | 114 ++++++++++++++++++++++++++++++++---
drivers/base/dma-contiguous.c | 2 +-
5 files changed, 120 insertions(+), 12 deletions(-)