Hi all!
This is another quick update of the patches which add basic support for
dynamic allocation of memory reserved regions defined in device tree.
This time I hope I've really managed to address all the issues reported
by Grant Likely; see the change log for more details. As a bonus, I've
added support for ARM64 and PowerPC, as those architectures have a
well-defined place where early memory reservation can be done.
The initial code for this feature was posted here [1], merged as commit
9d8eab7af79cb4ce2de5de39f82c455b1f796963 ("drivers: of: add
initialization code for dma reserved memory") and later reverted by
commit 1931ee143b0ab72924944bc06e363d837ba05063. For more information,
see [2]. Finally, new bindings have been proposed [3], and a few days
ago Josh Cartwright prepared some code which implements those
bindings [4]. This finally pushed me to find some time to finish this
task and review the code. Josh agreed to hand ownership of this series
over to me so I can continue preparing it for mainline inclusion.
For more information please refer to the changelog below.
[1]: http://lkml.kernel.org/g/1377527959-5080-1-git-send-email-m.szyprowski@sams…
[2]: http://lkml.kernel.org/g/1381476448-14548-1-git-send-email-m.szyprowski@sam…
[3]: http://lkml.kernel.org/g/20131030134702.19B57C402A0@trevor.secretlab.ca
[4]: http://thread.gmane.org/gmane.linux.documentation/19579
Changelog:
v4:
- dynamic allocations are processed after all static reservations have been
done
- moved code for handling static reservations to drivers/of/fdt.c
- removed node matching by string comparison, now phandle values are used
directly
- moved code for DMA and CMA handling directly to
drivers/base/dma-{coherent,contiguous}.c
- added checks for proper #size-cells, #address-cells, ranges properties
in /reserved-memory node
- even more code cleanup
- added init code for ARM64 and PowerPC
v3: http://article.gmane.org/gmane.linux.documentation/20169/
- refactored memory reservation code, created common code to parse reg, size,
align, alloc-ranges properties
- added support for multiple tuples in 'reg' property
- memory is reserved regardless of whether a driver for its compatible is present
- prepared arch specific hooks for memory reservation (defaults use memblock
calls)
- removed node matching by string during device initialization
- CMA init code: added checks for required region alignment
- more code cleanup here and there
v2: http://thread.gmane.org/gmane.linux.documentation/19870/
- removed copying of the node name
- split shared-dma-pool handling into separate files (one for CMA and one
for dma_declare_coherent based implementations) for making the code easier
to understand
- added support for AMBA devices, changed prototypes to use struct device
instead of struct platform_device
- renamed some functions to better match other names used in drivers/of/
- restructured the rest of the code a bit for better readability
- added 'reusable' property to example linux,cma node in documentation
- exclusive DMA (dma_coherent) is now used only for handling 'shared-dma-pool'
regions without the 'reusable' property, and CMA is used only for handling
'shared-dma-pool' regions with the 'reusable' property
v1: http://thread.gmane.org/gmane.linux.documentation/19579
- initial version prepared by Josh Cartwright
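As a sketch of what these bindings describe (the node names, addresses
and sizes below are invented for illustration, not taken from this
series), a /reserved-memory node combining a static reservation with a
dynamically allocated 'shared-dma-pool' region could look like:

```dts
reserved-memory {
	#address-cells = <1>;
	#size-cells = <1>;
	ranges;

	/* static reservation: 8 MiB at a fixed base address */
	framebuffer@78000000 {
		reg = <0x78000000 0x800000>;
	};

	/* dynamic allocation: size/alloc-ranges instead of 'reg' */
	linux,cma {
		compatible = "shared-dma-pool";
		reusable;
		size = <0x4000000>;			/* 64 MiB */
		alloc-ranges = <0x40000000 0x10000000>;
	};
};
```

The static region is reserved at boot from the 'reg' tuple, while the
dynamic one is allocated by the kernel anywhere inside 'alloc-ranges'.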
Summary:
Grant Likely (1):
of: document bindings for reserved-memory nodes
Marek Szyprowski (6):
drivers: of: add initialization code for reserved memory
drivers: dma-coherent: add initialization from device tree
drivers: dma-contiguous: add initialization from device tree
arm: add support for reserved memory defined by device tree
arm64: add support for reserved memory defined by device tree
powerpc: add support for reserved memory defined by device tree
.../bindings/reserved-memory/reserved-memory.txt | 138 +++++++++
arch/arm/Kconfig | 1 +
arch/arm/mm/init.c | 3 +
arch/arm64/Kconfig | 1 +
arch/arm64/mm/init.c | 2 +
arch/powerpc/Kconfig | 1 +
arch/powerpc/kernel/prom.c | 3 +
drivers/base/dma-coherent.c | 41 +++
drivers/base/dma-contiguous.c | 130 +++++++--
drivers/of/Kconfig | 6 +
drivers/of/Makefile | 1 +
drivers/of/fdt.c | 145 ++++++++++
drivers/of/of_reserved_mem.c | 296 ++++++++++++++++++++
drivers/of/platform.c | 7 +
include/asm-generic/vmlinux.lds.h | 11 +
include/linux/of_reserved_mem.h | 65 +++++
16 files changed, 829 insertions(+), 22 deletions(-)
create mode 100644 Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
create mode 100644 drivers/of/of_reserved_mem.c
create mode 100644 include/linux/of_reserved_mem.h
--
1.7.9.5
Hi all!
This is yet another update of the second attempt to add basic support
for dynamic allocation of memory reserved regions defined in device
tree.
This time I've tried to address all the issues reported by Grant Likely.
The side effect is a complete rewrite of the memory reservation code,
which adds support for multiple tuples in the 'reg' property and
complete support for the 'size', 'align' and 'alloc-ranges' properties.
The initial code for this feature was posted here [1], merged as commit
9d8eab7af79cb4ce2de5de39f82c455b1f796963 ("drivers: of: add
initialization code for dma reserved memory") and later reverted by
commit 1931ee143b0ab72924944bc06e363d837ba05063. For more information,
see [2]. Finally, new bindings have been proposed [3], and a few days
ago Josh Cartwright prepared some code which implements those
bindings [4]. This finally pushed me to find some time to finish this
task and review the code. Josh agreed to hand ownership of this series
over to me so I can continue preparing it for mainline inclusion.
For more information please refer to the changelog below.
[1]: http://lkml.kernel.org/g/1377527959-5080-1-git-send-email-m.szyprowski@sams…
[2]: http://lkml.kernel.org/g/1381476448-14548-1-git-send-email-m.szyprowski@sam…
[3]: http://lkml.kernel.org/g/20131030134702.19B57C402A0@trevor.secretlab.ca
[4]: http://thread.gmane.org/gmane.linux.documentation/19579
Changelog:
v3:
- refactored memory reservation code, created common code to parse reg, size,
align, alloc-ranges properties
- added support for multiple tuples in 'reg' property
- memory is reserved regardless of whether a driver for its compatible is present
- prepared arch specific hooks for memory reservation (defaults use memblock
calls)
- removed node matching by string during device initialization
- CMA init code: added checks for required region alignment
- more code cleanup here and there
v2: http://thread.gmane.org/gmane.linux.documentation/19870/
- removed copying of the node name
- split shared-dma-pool handling into separate files (one for CMA and one
for dma_declare_coherent based implementations) for making the code easier
to understand
- added support for AMBA devices, changed prototypes to use struct device
instead of struct platform_device
- renamed some functions to better match other names used in drivers/of/
- restructured the rest of the code a bit for better readability
- added 'reusable' property to example linux,cma node in documentation
- exclusive DMA (dma_coherent) is now used only for handling 'shared-dma-pool'
regions without the 'reusable' property, and CMA is used only for handling
'shared-dma-pool' regions with the 'reusable' property
v1: http://thread.gmane.org/gmane.linux.documentation/19579
- initial version prepared by Josh Cartwright
Summary:
Grant Likely (1):
of: document bindings for reserved-memory nodes
Josh Cartwright (2):
drivers: of: implement reserved-memory handling for dma
drivers: of: implement reserved-memory handling for cma
Marek Szyprowski (3):
base: dma-contiguous: add dma_contiguous_init_reserved_mem() function
drivers: of: add initialization code for reserved memory
ARM: init: add support for reserved memory defined by device tree
.../bindings/reserved-memory/reserved-memory.txt | 138 +++++++
arch/arm/Kconfig | 1 +
arch/arm/mm/init.c | 3 +
drivers/base/dma-contiguous.c | 70 ++--
drivers/of/Kconfig | 19 +
drivers/of/Makefile | 3 +
drivers/of/fdt.c | 2 +
drivers/of/of_reserved_mem.c | 390 ++++++++++++++++++++
drivers/of/of_reserved_mem_cma.c | 68 ++++
drivers/of/of_reserved_mem_dma.c | 65 ++++
drivers/of/platform.c | 7 +
include/asm-generic/vmlinux.lds.h | 11 +
include/linux/dma-contiguous.h | 7 +
include/linux/of_reserved_mem.h | 65 ++++
14 files changed, 827 insertions(+), 22 deletions(-)
create mode 100644 Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
create mode 100644 drivers/of/of_reserved_mem.c
create mode 100644 drivers/of/of_reserved_mem_cma.c
create mode 100644 drivers/of/of_reserved_mem_dma.c
create mode 100644 include/linux/of_reserved_mem.h
--
1.7.9.5
On Mon, Feb 17, 2014 at 12:36 PM, Christian König
<deathsimple(a)vodafone.de> wrote:
> Am 17.02.2014 18:27, schrieb Rob Clark:
>
>> On Mon, Feb 17, 2014 at 11:56 AM, Christian König
>> <deathsimple(a)vodafone.de> wrote:
>>>
>>> Am 17.02.2014 16:56, schrieb Maarten Lankhorst:
>>>
>>>> This type of fence can be used with hardware synchronization for simple
>>>> hardware that can block execution until the condition
>>>> (dma_buf[offset] - value) >= 0 has been met.
>>>
>>>
>>> Can't we make that just "dma_buf[offset] != 0" instead? As far as I know
>>> this way it would match the definition M$ uses in their WDDM
>>> specification
>>> and so make it much more likely that hardware supports it.
>>
>> well 'buf[offset] >= value' at least means the same slot can be used
>> for multiple operations (with increasing values of 'value').. not sure
>> if that is something people care about.
>>
>>> >=value seems to be possible with adreno and radeon. I'm not really sure
>>> about others (although I presume it as least supported for nv desktop
>>> stuff). For hw that cannot do >=value, we can either have a different fence
>>> implementation which uses the !=0 approach. Or change seqno-fence
>>> implementation later if needed. But if someone has hw that can do !=0 but
>>> not >=value, speak up now ;-)
>
>
> Here! Radeon can only do >=value on the DMA and 3D engine, but not with UVD
> or VCE. And for the 3D engine it means draining the pipe, which isn't really
> a good idea.
hmm, ok.. forgot you have a few extra rings compared to me. Is UVD
re-ordering from decode-order to display-order for you in hw? If not,
I guess you need sw intervention anyways when a frame is done for
frame re-ordering, so maybe hw->hw sync doesn't really matter as much
as compared to gpu/3d->display. For dma<->3d interactions, seems like
you would care more about hw<->hw sync, but I guess you aren't likely
to use GPU A to do a resolve blit for GPU B..
For 3D ring, I assume you probably want a CP_WAIT_FOR_IDLE before a
CP_MEM_WRITE to update fence value in memory (for the one signalling
the fence). But why would you need that before a CP_WAIT_REG_MEM (for
the one waiting for the fence)? I don't exactly have documentation
for adreno version of CP_WAIT_REG_{MEM,EQ,GTE}.. but PFP and ME
appear to be same instruction set as r600, so I'm pretty sure they
should have similar capabilities.. CP_WAIT_REG_MEM appears to be same
but with 32bit gpu addresses vs 64b.
BR,
-R
> Christian.
>
>
>>
>>> Apart from that I still don't like the idea of leaking a drivers IRQ
>>> context
>>> outside of the driver, but without a proper GPU scheduler there probably
>>> isn't much alternative.
>>
>> I guess it will be not uncommon scenario for gpu device to just need
>> to kick display device to write a few registers for a page flip..
>> probably best not to schedule a worker just for this (unless the
>> signalled device otherwise needs to). I think it is better in this
>> case to give the signalee some rope to hang themselves, and make it
>> the responsibility of the callback to kick things off to a worker if
>> needed.
>>
>> BR,
>> -R
>>
>>> Christian.
>>>
>>>> A software fallback still has to be provided in case the fence is used
>>>> with a device that doesn't support this mechanism. It is useful to
>>>> expose
>>>> this for graphics cards that have an op to support this.
>>>>
>>>> Some cards like i915 can export those, but don't have an option to wait,
>>>> so they need the software fallback.
>>>>
>>>> I extended the original patch by Rob Clark.
>>>>
>>>> v1: Original
>>>> v2: Renamed from bikeshed to seqno, moved into dma-fence.c since
>>>> not much was left of the file. Lots of documentation added.
>>>> v3: Use fence_ops instead of custom callbacks. Moved to own file
>>>> to avoid circular dependency between dma-buf.h and fence.h
>>>> v4: Add spinlock pointer to seqno_fence_init
>>>>
>>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst(a)canonical.com>
>>>> ---
>>>> Documentation/DocBook/device-drivers.tmpl | 1
>>>> drivers/base/fence.c | 50 +++++++++++++
>>>> include/linux/seqno-fence.h | 109
>>>> +++++++++++++++++++++++++++++
>>>> 3 files changed, 160 insertions(+)
>>>> create mode 100644 include/linux/seqno-fence.h
>>>>
>>>> diff --git a/Documentation/DocBook/device-drivers.tmpl
>>>> b/Documentation/DocBook/device-drivers.tmpl
>>>> index 7a0c9ddb4818..8c85c20942c2 100644
>>>> --- a/Documentation/DocBook/device-drivers.tmpl
>>>> +++ b/Documentation/DocBook/device-drivers.tmpl
>>>> @@ -131,6 +131,7 @@ X!Edrivers/base/interface.c
>>>> !Edrivers/base/dma-buf.c
>>>> !Edrivers/base/fence.c
>>>> !Iinclude/linux/fence.h
>>>> +!Iinclude/linux/seqno-fence.h
>>>> !Edrivers/base/reservation.c
>>>> !Iinclude/linux/reservation.h
>>>> !Edrivers/base/dma-coherent.c
>>>> diff --git a/drivers/base/fence.c b/drivers/base/fence.c
>>>> index 12df2bf62034..cd0937127a89 100644
>>>> --- a/drivers/base/fence.c
>>>> +++ b/drivers/base/fence.c
>>>> @@ -25,6 +25,7 @@
>>>> #include <linux/export.h>
>>>> #include <linux/atomic.h>
>>>> #include <linux/fence.h>
>>>> +#include <linux/seqno-fence.h>
>>>> #define CREATE_TRACE_POINTS
>>>> #include <trace/events/fence.h>
>>>> @@ -413,3 +414,52 @@ __fence_init(struct fence *fence, const struct
>>>> fence_ops *ops,
>>>> trace_fence_init(fence);
>>>> }
>>>> EXPORT_SYMBOL(__fence_init);
>>>> +
>>>> +static const char *seqno_fence_get_driver_name(struct fence *fence) {
>>>> + struct seqno_fence *seqno_fence = to_seqno_fence(fence);
>>>> + return seqno_fence->ops->get_driver_name(fence);
>>>> +}
>>>> +
>>>> +static const char *seqno_fence_get_timeline_name(struct fence *fence) {
>>>> + struct seqno_fence *seqno_fence = to_seqno_fence(fence);
>>>> + return seqno_fence->ops->get_timeline_name(fence);
>>>> +}
>>>> +
>>>> +static bool seqno_enable_signaling(struct fence *fence)
>>>> +{
>>>> + struct seqno_fence *seqno_fence = to_seqno_fence(fence);
>>>> + return seqno_fence->ops->enable_signaling(fence);
>>>> +}
>>>> +
>>>> +static bool seqno_signaled(struct fence *fence)
>>>> +{
>>>> + struct seqno_fence *seqno_fence = to_seqno_fence(fence);
>>>> + return seqno_fence->ops->signaled &&
>>>> seqno_fence->ops->signaled(fence);
>>>> +}
>>>> +
>>>> +static void seqno_release(struct fence *fence)
>>>> +{
>>>> + struct seqno_fence *f = to_seqno_fence(fence);
>>>> +
>>>> + dma_buf_put(f->sync_buf);
>>>> + if (f->ops->release)
>>>> + f->ops->release(fence);
>>>> + else
>>>> + kfree(f);
>>>> +}
>>>> +
>>>> +static long seqno_wait(struct fence *fence, bool intr, signed long
>>>> timeout)
>>>> +{
>>>> + struct seqno_fence *f = to_seqno_fence(fence);
>>>> + return f->ops->wait(fence, intr, timeout);
>>>> +}
>>>> +
>>>> +const struct fence_ops seqno_fence_ops = {
>>>> + .get_driver_name = seqno_fence_get_driver_name,
>>>> + .get_timeline_name = seqno_fence_get_timeline_name,
>>>> + .enable_signaling = seqno_enable_signaling,
>>>> + .signaled = seqno_signaled,
>>>> + .wait = seqno_wait,
>>>> + .release = seqno_release,
>>>> +};
>>>> +EXPORT_SYMBOL(seqno_fence_ops);
>>>> diff --git a/include/linux/seqno-fence.h b/include/linux/seqno-fence.h
>>>> new file mode 100644
>>>> index 000000000000..952f7909128c
>>>> --- /dev/null
>>>> +++ b/include/linux/seqno-fence.h
>>>> @@ -0,0 +1,109 @@
>>>> +/*
>>>> + * seqno-fence, using a dma-buf to synchronize fencing
>>>> + *
>>>> + * Copyright (C) 2012 Texas Instruments
>>>> + * Copyright (C) 2012 Canonical Ltd
>>>> + * Authors:
>>>> + * Rob Clark <robdclark(a)gmail.com>
>>>> + * Maarten Lankhorst <maarten.lankhorst(a)canonical.com>
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or modify
>>>> it
>>>> + * under the terms of the GNU General Public License version 2 as
>>>> published by
>>>> + * the Free Software Foundation.
>>>> + *
>>>> + * This program is distributed in the hope that it will be useful, but
>>>> WITHOUT
>>>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
>>>> or
>>>> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
>>>> License
>>>> for
>>>> + * more details.
>>>> + *
>>>> + * You should have received a copy of the GNU General Public License
>>>> along with
>>>> + * this program. If not, see <http://www.gnu.org/licenses/>.
>>>> + */
>>>> +
>>>> +#ifndef __LINUX_SEQNO_FENCE_H
>>>> +#define __LINUX_SEQNO_FENCE_H
>>>> +
>>>> +#include <linux/fence.h>
>>>> +#include <linux/dma-buf.h>
>>>> +
>>>> +struct seqno_fence {
>>>> + struct fence base;
>>>> +
>>>> + const struct fence_ops *ops;
>>>> + struct dma_buf *sync_buf;
>>>> + uint32_t seqno_ofs;
>>>> +};
>>>> +
>>>> +extern const struct fence_ops seqno_fence_ops;
>>>> +
>>>> +/**
>>>> + * to_seqno_fence - cast a fence to a seqno_fence
>>>> + * @fence: fence to cast to a seqno_fence
>>>> + *
>>>> + * Returns NULL if the fence is not a seqno_fence,
>>>> + * or the seqno_fence otherwise.
>>>> + */
>>>> +static inline struct seqno_fence *
>>>> +to_seqno_fence(struct fence *fence)
>>>> +{
>>>> + if (fence->ops != &seqno_fence_ops)
>>>> + return NULL;
>>>> + return container_of(fence, struct seqno_fence, base);
>>>> +}
>>>> +
>>>> +/**
>>>> + * seqno_fence_init - initialize a seqno fence
>>>> + * @fence: seqno_fence to initialize
>>>> + * @lock: pointer to spinlock to use for fence
>>>> + * @sync_buf: buffer containing the memory location to signal on
>>>> + * @context: the execution context this fence is a part of
>>>> + * @seqno_ofs: the offset within @sync_buf
>>>> + * @seqno: the sequence # to signal on
>>>> + * @ops: the fence_ops for operations on this seqno fence
>>>> + *
>>>> + * This function initializes a struct seqno_fence with passed
>>>> parameters,
>>>> + * and takes a reference on sync_buf which is released on fence
>>>> destruction.
>>>> + *
>>>> + * A seqno_fence is a dma_fence which can complete in software when
>>>> + * enable_signaling is called, but it also completes when
>>>> + * (s32)((sync_buf)[seqno_ofs] - seqno) >= 0 is true
>>>> + *
>>>> + * The seqno_fence will take a refcount on the sync_buf until it's
>>>> + * destroyed, but actual lifetime of sync_buf may be longer if one of
>>>> the
>>>> + * callers take a reference to it.
>>>> + *
>>>> + * Certain hardware have instructions to insert this type of wait
>>>> condition
>>>> + * in the command stream, so no intervention from software would be
>>>> needed.
>>>> + * This type of fence can be destroyed before completed, however a
>>>> reference
>>>> + * on the sync_buf dma-buf can be taken. It is encouraged to re-use the
>>>> same
>>>> + * dma-buf for sync_buf, since mapping or unmapping the sync_buf to the
>>>> + * device's vm can be expensive.
>>>> + *
>>>> + * It is recommended for creators of seqno_fence to call fence_signal
>>>> + * before destruction. This will prevent possible issues from
>>>> wraparound
>>>> at
>>>> + * time of issue vs time of check, since users can check
>>>> fence_is_signaled
>>>> + * before submitting instructions for the hardware to wait on the
>>>> fence.
>>>> + * However, when ops.enable_signaling is not called, it doesn't have to
>>>> be
>>>> + * done as soon as possible, just before there's any real danger of
>>>> seqno
>>>> + * wraparound.
>>>> + */
>>>> +static inline void
>>>> +seqno_fence_init(struct seqno_fence *fence, spinlock_t *lock,
>>>> + struct dma_buf *sync_buf, uint32_t context, uint32_t
>>>> seqno_ofs,
>>>> + uint32_t seqno, const struct fence_ops *ops)
>>>> +{
>>>> + BUG_ON(!fence || !sync_buf || !ops);
>>>> + BUG_ON(!ops->wait || !ops->enable_signaling ||
>>>> !ops->get_driver_name || !ops->get_timeline_name);
>>>> +
>>>> + /*
>>>> + * ops is used in __fence_init for get_driver_name, so needs to
>>>> be
>>>> + * initialized first
>>>> + */
>>>> + fence->ops = ops;
>>>> + __fence_init(&fence->base, &seqno_fence_ops, lock, context,
>>>> seqno);
>>>> + get_dma_buf(sync_buf);
>>>> + fence->sync_buf = sync_buf;
>>>> + fence->seqno_ofs = seqno_ofs;
>>>> +}
>>>> +
>>>> +#endif /* __LINUX_SEQNO_FENCE_H */
>>>>
>>>> _______________________________________________
>>>> dri-devel mailing list
>>>> dri-devel(a)lists.freedesktop.org
>>>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>>>
>>>
>
>
Hi all!
This is an updated version of the second attempt to add basic support
for dynamic allocation of memory reserved regions defined in device
tree.
The initial code for this feature was posted here [1], merged as commit
9d8eab7af79cb4ce2de5de39f82c455b1f796963 ("drivers: of: add
initialization code for dma reserved memory") and later reverted by
commit 1931ee143b0ab72924944bc06e363d837ba05063. For more information,
see [2]. Finally, new bindings have been proposed [3], and a few days
ago Josh Cartwright prepared some code which implements those
bindings [4]. This finally pushed me to find some time to finish this
task and review the code. Josh agreed to hand ownership of this series
over to me so I can continue preparing it for mainline inclusion.
For more information please refer to the changelog below.
[1]: http://lkml.kernel.org/g/1377527959-5080-1-git-send-email-m.szyprowski@sams…
[2]: http://lkml.kernel.org/g/1381476448-14548-1-git-send-email-m.szyprowski@sam…
[3]: http://lkml.kernel.org/g/20131030134702.19B57C402A0@trevor.secretlab.ca
[4]: http://thread.gmane.org/gmane.linux.documentation/19579
Changelog:
v2:
- removed copying of the node name
- split shared-dma-pool handling into separate files (one for CMA and one
for dma_declare_coherent based implementations) for making the code easier
to understand
- added support for AMBA devices, changed prototypes to use struct device
instead of struct platform_device
- renamed some functions to better match other names used in drivers/of/
- restructured the rest of the code a bit for better readability
- added 'reusable' property to example linux,cma node in documentation
- exclusive DMA (dma_coherent) is now used only for handling 'shared-dma-pool'
regions without the 'reusable' property, and CMA is used only for handling
'shared-dma-pool' regions with the 'reusable' property
v1: http://thread.gmane.org/gmane.linux.documentation/19579
- initial version prepared by Josh Cartwright
Summary:
Grant Likely (1):
of: document bindings for reserved-memory nodes
Josh Cartwright (2):
drivers: of: implement reserved-memory handling for dma
drivers: of: implement reserved-memory handling for cma
Marek Szyprowski (2):
drivers: of: add initialization code for reserved memory
ARM: init: add support for reserved memory defined by device tree
.../bindings/reserved-memory/reserved-memory.txt | 138 ++++++++++++
arch/arm/mm/init.c | 3 +
drivers/of/Kconfig | 20 ++
drivers/of/Makefile | 3 +
drivers/of/of_reserved_mem.c | 219 ++++++++++++++++++++
drivers/of/of_reserved_mem_cma.c | 75 +++++++
drivers/of/of_reserved_mem_dma.c | 78 +++++++
drivers/of/platform.c | 7 +
include/asm-generic/vmlinux.lds.h | 11 +
include/linux/of_reserved_mem.h | 62 ++++++
10 files changed, 616 insertions(+)
create mode 100644 Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
create mode 100644 drivers/of/of_reserved_mem.c
create mode 100644 drivers/of/of_reserved_mem_cma.c
create mode 100644 drivers/of/of_reserved_mem_dma.c
create mode 100644 include/linux/of_reserved_mem.h
--
1.7.9.5
Hi Linus,
Here's another tiny pull request for dma-buf framework updates; just
some debugfs output updates. (There's another patch related to
dma-buf, but it'll get upstreamed via Greg-kh's pull request).
Could you please pull?
The following changes since commit 45f7fdc2ffb9d5af4dab593843e89da70d1259e3:
Merge branch 'merge' of
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc (2014-02-11
22:28:47 -0800)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf.git
tags/dma-buf-for-3.14
for you to fetch changes up to c0b00a525c127d0055c1df6283300e17f601a1a1:
dma-buf: update debugfs output (2014-02-13 10:08:52 +0530)
----------------------------------------------------------------
Small dma-buf pull request for 3.14
----------------------------------------------------------------
Sumit Semwal (1):
dma-buf: update debugfs output
drivers/base/dma-buf.c | 25 ++++++++++++-------------
include/linux/dma-buf.h | 2 +-
2 files changed, 13 insertions(+), 14 deletions(-)
Hello,
In pursuit of saving memory on Android, I started experimenting with
Kernel Samepage Merging (KSM).
The number of pages shared because of KSM is reported by
/sys/kernel/mm/ksm/pages_sharing.
Documentation/vm/ksm.txt explains this as:
"pages_sharing - how many more sites are sharing them i.e. how much saved"
After enabling KSM on an Android device, this number was reported as
19666 pages.
The obvious optimization is to find the source of the sharing and see
if we can avoid the duplicate pages in the first place.
In order to collect the needed data, I added a few trace_printk()
statements to mm/ksm.c. Data should be collected from the second cycle
onwards, because that's when KSM starts merging pages. The first KSM
cycle is only used to calculate the checksum; pages are added to the
unstable tree and eventually moved to the stable tree after this.
After analyzing data from the second KSM cycle, a few things stood out:
1. In the same cycle, KSM can scan the same page multiple times.
Scanning a page involves comparing the page with pages in the stable
tree; if no match is found, a checksum is calculated. This looks like
a CPU-intensive operation, and it impacts the dcache as well.
2. A page which is already shared by multiple processes can be replaced
by a KSM page. Say a particular page is mapped 24 times and is replaced
by a KSM page; eventually all 24 entries will point to the KSM page,
and pages_sharing will account for all 24 mappings. So pages_sharing
does not actually report the amount of memory saved: in this example
the actual saving is one page.
Both cases happen very often on Android because of its architecture:
Zygote spawns (forks) multiple applications. To calculate actual
savings, we should account for the same page (pfn) being replaced by
the same KSM page only once. In the case 2 example, pages_sharing
should account for only one page. After recalculating, the memory
saving comes out to 8602 pages (~34MB).
I am trying to find the right way to fix pages_sharing, and eventually
to optimize KSM to scan a page only once even if it is mapped multiple
times.
Comments? Has anyone tried this before?
Thanks,
Pradeep
dma_buf_map_attachment and dma_buf_vmap can return NULL or
ERR_PTR on a error. This encourages a common buggy pattern in
callers:
sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
if (IS_ERR_OR_NULL(sgt))
return PTR_ERR(sgt);
This causes the caller to return 0 on an error. IS_ERR_OR_NULL
is almost always a sign of poorly-defined error handling.
This patch converts dma_buf_map_attachment to always return
ERR_PTR, and fixes the callers that incorrectly handled NULL.
There are a few more callers that were not checking for NULL
at all, which would have dereferenced a NULL pointer later.
There are also a few more callers that correctly handled NULL
and ERR_PTR differently, I left those alone but they could also
be modified to delete the NULL check.
This patch also converts dma_buf_vmap to always return NULL on error.
All the callers to dma_buf_vmap only check for NULL, and would
have dereferenced an ERR_PTR and panic'd if one was ever
returned. This is not consistent with the rest of the dma buf
APIs, but matches the expectations of all of the callers.
Signed-off-by: Colin Cross <ccross(a)android.com>
---
drivers/base/dma-buf.c | 18 +++++++++++-------
drivers/gpu/drm/drm_prime.c | 2 +-
drivers/gpu/drm/exynos/exynos_drm_dmabuf.c | 2 +-
drivers/media/v4l2-core/videobuf2-dma-contig.c | 2 +-
4 files changed, 14 insertions(+), 10 deletions(-)
diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 1e16cbd61da2..cfe1d8bc7bb8 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -251,9 +251,8 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
* @dmabuf: [in] buffer to attach device to.
* @dev: [in] device to be attached.
*
- * Returns struct dma_buf_attachment * for this attachment; may return negative
- * error codes.
- *
+ * Returns struct dma_buf_attachment * for this attachment; returns ERR_PTR on
+ * error.
*/
struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
struct device *dev)
@@ -319,9 +318,8 @@ EXPORT_SYMBOL_GPL(dma_buf_detach);
* @attach: [in] attachment whose scatterlist is to be returned
* @direction: [in] direction of DMA transfer
*
- * Returns sg_table containing the scatterlist to be returned; may return NULL
- * or ERR_PTR.
- *
+ * Returns sg_table containing the scatterlist to be returned; returns ERR_PTR
+ * on error.
*/
struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
enum dma_data_direction direction)
@@ -334,6 +332,8 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
return ERR_PTR(-EINVAL);
sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
+ if (!sg_table)
+ sg_table = ERR_PTR(-ENOMEM);
return sg_table;
}
@@ -544,6 +544,8 @@ EXPORT_SYMBOL_GPL(dma_buf_mmap);
* These calls are optional in drivers. The intended use for them
* is for mapping objects linear in kernel space for high use objects.
* Please attempt to use kmap/kunmap before thinking about these interfaces.
+ *
+ * Returns NULL on error.
*/
void *dma_buf_vmap(struct dma_buf *dmabuf)
{
@@ -566,7 +568,9 @@ void *dma_buf_vmap(struct dma_buf *dmabuf)
BUG_ON(dmabuf->vmap_ptr);
ptr = dmabuf->ops->vmap(dmabuf);
- if (IS_ERR_OR_NULL(ptr))
+ if (WARN_ON_ONCE(IS_ERR(ptr)))
+ ptr = NULL;
+ if (!ptr)
goto out_unlock;
dmabuf->vmap_ptr = ptr;
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 56805c39c906..bb516fdd195d 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -471,7 +471,7 @@ struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev,
get_dma_buf(dma_buf);
sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
- if (IS_ERR_OR_NULL(sgt)) {
+ if (IS_ERR(sgt)) {
ret = PTR_ERR(sgt);
goto fail_detach;
}
diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
index 59827cc5e770..c786cd4f457b 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
@@ -224,7 +224,7 @@ struct drm_gem_object *exynos_dmabuf_prime_import(struct drm_device *drm_dev,
get_dma_buf(dma_buf);
sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
- if (IS_ERR_OR_NULL(sgt)) {
+ if (IS_ERR(sgt)) {
ret = PTR_ERR(sgt);
goto err_buf_detach;
}
diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index 33d3871d1e13..880be0782dd9 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -719,7 +719,7 @@ static int vb2_dc_map_dmabuf(void *mem_priv)
/* get the associated scatterlist for this buffer */
sgt = dma_buf_map_attachment(buf->db_attach, buf->dma_dir);
- if (IS_ERR_OR_NULL(sgt)) {
+ if (IS_ERR(sgt)) {
pr_err("Error getting dmabuf scatterlist\n");
return -EINVAL;
}
--
1.8.5.1
Hi everyone,
I tried looking for IOMMU support on ARM64, but from what I can see
only swiotlb is currently supported.
Based on my understanding, for IOMMU support we need the DMA mapping
API to have an IOMMU ops field, similar to what is present on arm32.
I can see the iommu field added to dev_archdata in the patch mentioned
below (arch/arm64/include/asm/device.h), but there is no corresponding
ops field in arch/arm64/mm/dma-mapping.c?
I also saw a mail discussion between you on what the best place for
adding IOMMU support on ARM64 would be, but couldn't find any
follow-up patches.
Please tell us the current status of this work. Your feedback will be
greatly appreciated.
commit 73150c983ac1f9b7653cfd3823b1ad4a44aad3bf
Author: Will Deacon <will.deacon(a)arm.com>
Date: Mon Jun 10 19:34:42 2013 +0100
arm64: device: add iommu pointer to device archdata
When using an IOMMU for device mappings, it is necessary to keep a
pointer between the device and the IOMMU to which it is attached in
order to obtain the correct IOMMU when attaching the device to a domain.
This patch adds an iommu pointer to the dev_archdata structure, in a
similar manner to other architectures (ARM, PowerPC, x86, ...).
Signed-off-by: Will Deacon <will.deacon(a)arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas(a)arm.com>
Thanks
Ritesh