A dma-fence can be attached to a buffer which is being filled or consumed by hw, to allow userspace to pass the buffer to another device without waiting. For example, userspace can call the page_flip ioctl to display the next frame of graphics after kicking the GPU, while the GPU is still rendering. The display device sharing the buffer with the GPU would attach a callback to get notified when the GPU's rendering-complete IRQ fires, to update the scan-out address of the display, without having to wake up userspace.
A dma-fence is a transient, one-shot deal. It is allocated and attached to one or more dma-bufs. When the one that attached it is done with the pending operation, it can signal the fence:
+ dma_fence_signal()
The dma-buf-mgr handles tracking, and waiting on, the fences associated with a dma-buf.
TODO: maybe we need some helper functions for simple devices, like a display-only drm/kms device which simply wants to wait for the exclusive fence to be signaled, and then attach a non-exclusive fence while scanout is in progress.
The one pending on the fence can add an async callback:
+ dma_fence_add_callback()
The callback can optionally be cancelled with remove_wait_queue().
Or wait synchronously (optionally with a timeout, or interruptibly):
+ dma_fence_wait()
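A rough sketch of how a consumer might use both paths (hedged: `my_cb`, `my_priv` and the surrounding driver functions are hypothetical; the dma-fence signatures follow this patch):

```c
/* Async consumer: get notified when the producer signals the fence.
 * Note the callback may run from atomic context. */
static int my_cb(struct dma_fence_cb *cb, void *priv)
{
	/* ... e.g. write the new scan-out address ... */
	return 0;
}

static int my_attach_cb(struct dma_fence *fence, struct dma_fence_cb *cb,
			void *my_priv)
{
	int ret = dma_fence_add_callback(fence, cb, my_cb, my_priv);

	if (ret == -ENOENT) {
		/* fence already signaled; cb will not be called */
		my_cb(cb, my_priv);
		ret = 0;
	}
	return ret;
}

/* Sync consumer: interruptible wait with an absolute jiffies timeout. */
static int my_wait(struct dma_fence *fence)
{
	return dma_fence_wait(fence, true, jiffies + HZ);
}
```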
A default software-only implementation is provided, which can be used by drivers attaching a fence to a buffer when they have no other means for hw sync. But a memory-backed fence is also envisioned, because it is common that GPUs can write to, or poll on, some memory location for synchronization. For example:
    fence = dma_buf_get_fence(dmabuf);
    if (fence->ops == &bikeshed_fence_ops) {
            dma_buf *fence_buf;
            dma_bikeshed_fence_get_buf(fence, &fence_buf, &offset);
            ... tell the hw the memory location to wait on ...
    } else {
            /* fall-back to sw sync */
            dma_fence_add_callback(fence, my_cb);
    }
On SoC platforms, if some other hw mechanism is provided for synchronizing between IP blocks, it could be supported as an alternate implementation with its own fence ops, in a similar way.
To facilitate other non-sw implementations, the enable_signaling callback can be used to keep track of whether a device not supporting hw sync is waiting on the fence; in that case the implementation should arrange to call dma_fence_signal() at some point after the condition has changed, to notify other devices waiting on the fence. If there are no sw waiters, this can be skipped to avoid waking the CPU unnecessarily. The handler of the enable_signaling op should take a refcount until the fence is signaled, then release its ref.
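For a hw-backed fence, that contract might be implemented along these lines (a sketch only: `my_fence`, `my_seqno_passed`, `my_enable_completion_irq` and the irq handler are hypothetical driver code):

```c
static int my_enable_signaling(struct dma_fence *fence)
{
	struct my_fence *f = container_of(fence, struct my_fence, base);

	if (my_seqno_passed(f->dev, f->seqno))
		return -ENOENT;		/* condition already met */

	/* keep a ref until the fence is signaled, as required above */
	dma_fence_get(fence);
	my_enable_completion_irq(f->dev, f->seqno);
	return 0;
}

/* later, from the completion irq handler: */
static void my_completion_irq(struct my_fence *f)
{
	dma_fence_signal(&f->base);
	dma_fence_put(&f->base);
}
```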
The intention is to provide a userspace interface (presumably via eventfd) later, to be used in conjunction with dma-buf's mmap support for sw access to buffers (or for userspace apps that would prefer to do their own synchronization).
v1: Original
v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided that dma-fence didn't need to care about the sw->hw signaling path (it can be handled the same as the sw->sw case), and therefore the fence->ops can be simplified and more handled in the core. So remove the signal, add_callback, cancel_callback and wait ops, and replace with a simple enable_signaling() op which can be used to inform a fence supporting hw->hw signaling that one or more devices which do not support hw signaling are waiting (and therefore it should enable an irq or do whatever is necessary so that the CPU is notified when the fence is passed).
v3: Fix locking fail in attach_fence() and get_fence()
v4: Remove tie-in w/ dma-buf. After discussion w/ danvet and mlankhorst we decided that we need to be able to attach one fence to N dma-bufs, so using the list_head in the dma-fence struct would be problematic.
v5: [Maarten Lankhorst] Updated for dma-bikeshed-fence and dma-buf-manager.
v6: [Maarten Lankhorst] Removed dma_fence_cancel_callback and some comments about checking whether the fence fired or not; that was broken by design. waitqueue_active during destruction is now fatal, since the signaller should be holding a reference in enable_signaling until it has signalled the fence. Pass the original dma_fence_cb along, and call __remove_wait in the dma_fence_callback handler, so that no cleanup needs to be performed.
v7: [Maarten Lankhorst] Set cb->func and only enable sw signaling if the fence wasn't signaled yet, for example for hardware fences that may choose to signal blindly.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
 drivers/base/Makefile     |    2
 drivers/base/dma-fence.c  |  287 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/dma-fence.h |   96 +++++++++++++++
 3 files changed, 384 insertions(+), 1 deletion(-)
 create mode 100644 drivers/base/dma-fence.c
 create mode 100644 include/linux/dma-fence.h
diff --git a/drivers/base/Makefile b/drivers/base/Makefile index 5aa2d70..6e9f217 100644 --- a/drivers/base/Makefile +++ b/drivers/base/Makefile @@ -10,7 +10,7 @@ obj-$(CONFIG_CMA) += dma-contiguous.o obj-y += power/ obj-$(CONFIG_HAS_DMA) += dma-mapping.o obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o -obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o +obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o dma-fence.o obj-$(CONFIG_ISA) += isa.o obj-$(CONFIG_FW_LOADER) += firmware_class.o obj-$(CONFIG_NUMA) += node.o diff --git a/drivers/base/dma-fence.c b/drivers/base/dma-fence.c new file mode 100644 index 0000000..c280ee7 --- /dev/null +++ b/drivers/base/dma-fence.c @@ -0,0 +1,287 @@ +/* + * Fence mechanism for dma-buf to allow for asynchronous dma access + * + * Copyright (C) 2012 Texas Instruments + * Author: Rob Clark rob.clark@linaro.org + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see http://www.gnu.org/licenses/. + */ + +#include <linux/slab.h> +#include <linux/sched.h> +#include <linux/export.h> +#include <linux/dma-fence.h> + +/** + * dma_fence_signal - Signal a fence. + * + * @fence: The fence to signal + * + * All registered callbacks will be called directly (synchronously) and + * all blocked waters will be awoken. + * + * TODO: any value in adding a dma_fence_cancel(), for example to recov + * from hung gpu? 
It would behave like dma_fence_signal() but return + * an error to waiters and cb's to let them know that the condition they + * are waiting for will never happen. + */ +int dma_fence_signal(struct dma_fence *fence) +{ + unsigned long flags; + int ret = -EINVAL; + + if (WARN_ON(!fence)) + return -EINVAL; + + spin_lock_irqsave(&fence->event_queue.lock, flags); + if (!fence->signaled) { + fence->signaled = true; + __wake_up_locked_key(&fence->event_queue, TASK_NORMAL, + &fence->event_queue); + ret = 0; + } else WARN(1, "Already signaled"); + spin_unlock_irqrestore(&fence->event_queue.lock, flags); + + return ret; +} +EXPORT_SYMBOL_GPL(dma_fence_signal); + +static void release_fence(struct kref *kref) +{ + struct dma_fence *fence = + container_of(kref, struct dma_fence, refcount); + + BUG_ON(waitqueue_active(&fence->event_queue)); + + if (fence->ops->release) + fence->ops->release(fence); + + kfree(fence); +} + +/** + * dma_fence_put - Release a reference to the fence. + */ +void dma_fence_put(struct dma_fence *fence) +{ + WARN_ON(!fence); + kref_put(&fence->refcount, release_fence); +} +EXPORT_SYMBOL_GPL(dma_fence_put); + +/** + * dma_fence_get - Take a reference to the fence. + * + * In most cases this is used only internally by dma-fence. + */ +void dma_fence_get(struct dma_fence *fence) +{ + WARN_ON(!fence); + kref_get(&fence->refcount); +} +EXPORT_SYMBOL_GPL(dma_fence_get); + +static int check_signaling(struct dma_fence *fence) +{ + bool enable_signaling = false, signaled; + unsigned long flags; + + spin_lock_irqsave(&fence->event_queue.lock, flags); + signaled = fence->signaled; + if (!signaled && !fence->needs_sw_signal) + enable_signaling = fence->needs_sw_signal = true; + spin_unlock_irqrestore(&fence->event_queue.lock, flags); + + if (enable_signaling) { + int ret; + + /* At this point, if enable_signaling returns any error + * a wakeup has to be performanced regardless. + * -ENOENT signals fence was already signaled. 
Any other error + * inidicates a catastrophic hardware error. + * + * If any hardware error occurs, nothing can be done against + * it, so it's treated like the fence was already signaled. + * No synchronization can be performed, so we have to assume + * the fence was already signaled. + */ + ret = fence->ops->enable_signaling(fence); + if (ret) { + signaled = true; + dma_fence_signal(fence); + } + } + + if (!signaled) + return 0; + else + return -ENOENT; +} + +int __dma_fence_wake_func(wait_queue_t *wait, unsigned mode, + int flags, void *key) +{ + struct dma_fence_cb *cb = + container_of(wait, struct dma_fence_cb, base); + + __remove_wait_queue(key, wait); + return cb->func(cb, wait->private); +} + +/** + * dma_fence_add_callback - Add a callback to be called when the fence + * is signaled. + * + * @fence: The fence to wait on + * @cb: The callback to register + * + * Any number of callbacks can be registered to a fence, but a callback + * can only be registered to once fence at a time. + * + * Note that the callback can be called from an atomic context. If + * fence is already signaled, this function will return -ENOENT (and + * *not* call the callback) + */ +int dma_fence_add_callback(struct dma_fence *fence, struct dma_fence_cb *cb, + dma_fence_func_t func, void *priv) +{ + unsigned long flags; + int ret; + + if (WARN_ON(!fence || !func)) + return -EINVAL; + + ret = check_signaling(fence); + + spin_lock_irqsave(&fence->event_queue.lock, flags); + if (!ret && fence->signaled) + ret = -ENOENT; + + if (!ret) { + cb->base.flags = 0; + cb->base.func = __dma_fence_wake_func; + cb->base.private = priv; + cb->fence = fence; + cb->func = func; + __add_wait_queue(&fence->event_queue, &cb->base); + } + spin_unlock_irqrestore(&fence->event_queue.lock, flags); + + return ret; +} +EXPORT_SYMBOL_GPL(dma_fence_add_callback); + +/** + * dma_fence_wait - Wait for a fence to be signaled. 
+ * + * @fence: The fence to wait on + * @interruptible: if true, do an interruptible wait + * @timeout: absolute time for timeout, in jiffies. + * + * Returns 0 on success, -EBUSY if a timeout occured, + * -ERESTARTSYS if the wait was interrupted by a signal. + */ +int dma_fence_wait(struct dma_fence *fence, bool interruptible, unsigned long timeout) +{ + unsigned long cur; + int ret; + + if (WARN_ON(!fence)) + return -EINVAL; + + cur = jiffies; + if (time_after_eq(cur, timeout)) + return -EBUSY; + + timeout -= cur; + + ret = check_signaling(fence); + if (ret == -ENOENT) + return 0; + else if (ret) + return ret; + + if (interruptible) + ret = wait_event_interruptible_timeout(fence->event_queue, + fence->signaled, + timeout); + else + ret = wait_event_timeout(fence->event_queue, + fence->signaled, timeout); + + if (ret > 0) + return 0; + else if (!ret) + return -EBUSY; + else + return ret; +} +EXPORT_SYMBOL_GPL(dma_fence_wait); + +/* + * Helpers intended to be used by the ops of the dma_fence implementation: + * + * NOTE: helpers and fxns intended to be used by other dma-fence + * implementations are not exported.. I'm not really sure if it makes + * sense to have a dma-fence implementation that is itself a module. + */ + +void __dma_fence_init(struct dma_fence *fence, struct dma_fence_ops *ops, void *priv) +{ + WARN_ON(!ops || !ops->enable_signaling); + + kref_init(&fence->refcount); + fence->ops = ops; + fence->priv = priv; + init_waitqueue_head(&fence->event_queue); +} +EXPORT_SYMBOL_GPL(__dma_fence_init); + +/* + * Pure sw implementation for dma-fence. The CPU always gets involved. + */ + +static int sw_enable_signaling(struct dma_fence *fence) +{ + /* + * pure sw, no irq's to enable, because the fence creator will + * always call dma_fence_signal() + */ + return 0; +} + +static struct dma_fence_ops sw_fence_ops = { + .enable_signaling = sw_enable_signaling, +}; + +/** + * dma_fence_create - Create a simple sw-only fence. 
+ * + * This fence only supports signaling from/to CPU. Other implementations + * of dma-fence can be used to support hardware to hardware signaling, if + * supported by the hardware, and use the dma_fence_helper_* functions for + * compatibility with other devices that only support sw signaling. + */ +struct dma_fence *dma_fence_create(void) +{ + struct dma_fence *fence; + + fence = kzalloc(sizeof(struct dma_fence), GFP_KERNEL); + if (!fence) + return ERR_PTR(-ENOMEM); + + __dma_fence_init(fence, &sw_fence_ops, 0); + + return fence; +} +EXPORT_SYMBOL_GPL(dma_fence_create); diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h new file mode 100644 index 0000000..70d12c0 --- /dev/null +++ b/include/linux/dma-fence.h @@ -0,0 +1,96 @@ +/* + * Fence mechanism for dma-buf to allow for asynchronous dma access + * + * Copyright (C) 2012 Texas Instruments + * Author: Rob Clark rob.clark@linaro.org + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see http://www.gnu.org/licenses/. + */ + +#ifndef __DMA_FENCE_H__ +#define __DMA_FENCE_H__ + +#include <linux/err.h> +#include <linux/list.h> +#include <linux/wait.h> +#include <linux/list.h> +#include <linux/dma-buf.h> + +struct dma_fence; +struct dma_fence_ops; +struct dma_fence_cb; + +struct dma_fence { + struct kref refcount; + struct dma_fence_ops *ops; + wait_queue_head_t event_queue; + void *priv; + + /* has this fence been signaled yet? */ + bool signaled : 1; + + /* do we have one or more waiters or callbacks? 
*/ + bool needs_sw_signal : 1; +}; + +typedef int (*dma_fence_func_t)(struct dma_fence_cb *cb, void *priv); + +struct dma_fence_cb { + wait_queue_t base; + dma_fence_func_t func; + struct dma_fence *fence; +}; + +struct dma_fence_ops { + /** + * For fence implementations that have the capability for hw->hw + * signaling, they can implement this op to enable the necessary + * irqs, or insert commands into cmdstream, etc. This is called + * in the first wait() or add_callback() path to let the fence + * implementation know that there is another driver waiting on + * the signal (ie. hw->sw case). + * + * A return value of -ENOENT will indicate that the fence has + * already passed. Any other errors will be treated as -ENOENT, + * and can happen because of hardware failure. + */ + int (*enable_signaling)(struct dma_fence *fence); + void (*release)(struct dma_fence *fence); +}; + +/* + * TODO does it make sense to be able to enable dma-fence without dma-buf, + * or visa versa? + */ +#ifdef CONFIG_DMA_SHARED_BUFFER + +/* create a basic (pure sw) fence: */ +struct dma_fence *dma_fence_create(void); + +/* intended to be used by other dma_fence implementations: */ +void __dma_fence_init(struct dma_fence *fence, + struct dma_fence_ops *ops, void *priv); + +void dma_fence_get(struct dma_fence *fence); +void dma_fence_put(struct dma_fence *fence); + +int dma_fence_signal(struct dma_fence *fence); +int dma_fence_wait(struct dma_fence *fence, bool interruptible, unsigned long timeout); +int dma_fence_add_callback(struct dma_fence *fence, struct dma_fence_cb *cb, + dma_fence_func_t func, void *priv); + +#else +// TODO +#endif /* CONFIG_DMA_SHARED_BUFFER */ + +#endif /* __DMA_FENCE_H__ */
This type of fence can be used with hardware synchronization for simple hardware that can block execution until the condition dma_buf[offset] >= value has been met, accounting for wraparound.
A software fallback still has to be provided in case the fence is used with a device that doesn't support this mechanism. It is useful to expose this for graphics cards that have an op to support it. Some cards, like i915, can export such fences but don't have an option to wait on them, so they need the software fallback.
I extended the original patch by Rob Clark.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
 drivers/base/Makefile              |    2 -
 drivers/base/dma-bikeshed-fence.c  |   44 +++++++++++++++++
 include/linux/dma-bikeshed-fence.h |   92 ++++++++++++++++++++++++++++++++++++
 3 files changed, 137 insertions(+), 1 deletion(-)
 create mode 100644 drivers/base/dma-bikeshed-fence.c
 create mode 100644 include/linux/dma-bikeshed-fence.h
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 6e9f217..1e7723b 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -10,7 +10,7 @@ obj-$(CONFIG_CMA) += dma-contiguous.o
 obj-y += power/
 obj-$(CONFIG_HAS_DMA) += dma-mapping.o
 obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o
-obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o dma-fence.o
+obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o dma-fence.o dma-bikeshed-fence.o
 obj-$(CONFIG_ISA) += isa.o
 obj-$(CONFIG_FW_LOADER) += firmware_class.o
 obj-$(CONFIG_NUMA) += node.o
diff --git a/drivers/base/dma-bikeshed-fence.c b/drivers/base/dma-bikeshed-fence.c
new file mode 100644
index 0000000..fa063e8
--- /dev/null
+++ b/drivers/base/dma-bikeshed-fence.c
@@ -0,0 +1,44 @@
+/*
+ * dma-fence implementation that supports hw synchronization via hw
+ * read/write of memory semaphore
+ *
+ * Copyright (C) 2012 Texas Instruments
+ * Author: Rob Clark rob.clark@linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see http://www.gnu.org/licenses/.
+ */
+
+#include <linux/export.h>
+#include <linux/slab.h>
+#include <linux/dma-bikeshed-fence.h>
+
+static int enable_signaling(struct dma_fence *fence)
+{
+	struct dma_bikeshed_fence *bikeshed_fence = to_bikeshed_fence(fence);
+	return bikeshed_fence->enable_signaling(bikeshed_fence);
+}
+
+static void bikeshed_release(struct dma_fence *fence)
+{
+	struct dma_bikeshed_fence *f = to_bikeshed_fence(fence);
+
+	if (f->release)
+		f->release(f);
+	dma_buf_put(f->sync_buf);
+}
+
+struct dma_fence_ops dma_bikeshed_fence_ops = {
+	.enable_signaling = enable_signaling,
+	.release = bikeshed_release
+};
+EXPORT_SYMBOL_GPL(dma_bikeshed_fence_ops);
diff --git a/include/linux/dma-bikeshed-fence.h b/include/linux/dma-bikeshed-fence.h
new file mode 100644
index 0000000..4f19801
--- /dev/null
+++ b/include/linux/dma-bikeshed-fence.h
@@ -0,0 +1,92 @@
+/*
+ * dma-fence implementation that supports hw synchronization via hw
+ * read/write of memory semaphore
+ *
+ * Copyright (C) 2012 Texas Instruments
+ * Author: Rob Clark rob.clark@linaro.org
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see http://www.gnu.org/licenses/.
+ */
+
+#ifndef __DMA_BIKESHED_FENCE_H__
+#define __DMA_BIKESHED_FENCE_H__
+
+#include <linux/types.h>
+#include <linux/dma-fence.h>
+#include <linux/dma-buf.h>
+
+struct dma_bikeshed_fence {
+	struct dma_fence base;
+
+	struct dma_buf *sync_buf;
+	uint32_t seqno_ofs;
+	uint32_t seqno;
+
+	int (*enable_signaling)(struct dma_bikeshed_fence *fence);
+	void (*release)(struct dma_bikeshed_fence *fence);
+};
+
+/*
+ * TODO does it make sense to be able to enable dma-fence without dma-buf,
+ * or visa versa?
+ */
+#ifdef CONFIG_DMA_SHARED_BUFFER
+
+extern struct dma_fence_ops dma_bikeshed_fence_ops;
+
+static inline bool is_bikeshed_fence(struct dma_fence *fence)
+{
+	return fence->ops == &dma_bikeshed_fence_ops;
+}
+
+static inline struct dma_bikeshed_fence *to_bikeshed_fence(struct dma_fence *fence)
+{
+	if (WARN_ON(!is_bikeshed_fence(fence)))
+		return NULL;
+	return container_of(fence, struct dma_bikeshed_fence, base);
+}
+
+/**
+ * dma_bikeshed_fence_init - Initialize a fence
+ *
+ * @fence: dma_bikeshed_fence to initialize
+ * @sync_buf: buffer containing the memory location to signal on
+ * @seqno_ofs: the offset within @sync_buf
+ * @seqno: the sequence # to signal on
+ * @priv: value of priv member
+ * @enable_signaling: callback which is called when some other device is
+ *    waiting for sw notification of fence
+ * @release: callback called during destruction before object is freed.
+ */
+static inline void dma_bikeshed_fence_init(struct dma_bikeshed_fence *fence,
+		struct dma_buf *sync_buf,
+		uint32_t seqno_ofs, uint32_t seqno, void *priv,
+		int (*enable_signaling)(struct dma_bikeshed_fence *fence),
+		void (*release)(struct dma_bikeshed_fence *fence))
+{
+	BUG_ON(!fence || !sync_buf || !enable_signaling);
+
+	__dma_fence_init(&fence->base, &dma_bikeshed_fence_ops, priv);
+
+	get_dma_buf(sync_buf);
+	fence->sync_buf = sync_buf;
+	fence->seqno_ofs = seqno_ofs;
+	fence->seqno = seqno;
+	fence->enable_signaling = enable_signaling;
+}
+
+#else
+// TODO
+#endif /* CONFIG_DMA_SHARED_BUFFER */
+
+#endif /* __DMA_BIKESHED_FENCE_H__ */
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
dma-buf-mgr handles the case of reserving single or multiple dma-bufs while trying to prevent deadlocks from buffers being reserved simultaneously. To make this possible, extra functions have been introduced:
+ dma_buf_reserve()
+ dma_buf_unreserve()
+ dma_buf_wait_unreserved()
Reserve a single buffer, optionally with a sequence number to indicate that this is part of a multi-dmabuf reservation. This function will return -EDEADLK immediately if reserving would cause a deadlock. In case a single buffer is being reserved, no sequence is needed; otherwise, please use the dmabufmgr calls.
If you want to attach an exclusive dma-fence, you have to wait until all shared fences have signalled completion. If there are none, or if a shared fence has to be attached, wait until the last exclusive fence has signalled completion.
The new fence has to be attached before unreserving the buffer, and in exclusive mode all previous fences will have to be removed from the buffer, and unreffed when done with them.
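Put as (pseudo-)code, the fence-selection rule above, which is also what dmabufmgr_reserve_buffers() applies when it fills in validate->fences[], looks roughly like this (`wait_for()` is a stand-in, not a real function):

```c
/* Which fences must signal before a new fence may be attached to 'bo': */
if (attach_exclusive && bo->fence_shared_count) {
	/* attaching an exclusive fence: wait for all shared fences */
	for (i = 0; i < bo->fence_shared_count; i++)
		wait_for(bo->fence_shared[i]);
} else if (bo->fence_excl) {
	/* no shared fences, or attaching a shared fence:
	 * wait for the last exclusive fence */
	wait_for(bo->fence_excl);
}
```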
dmabufmgr methods:
+ dmabufmgr_validate_init()
This function inits a dmabufmgr_validate structure and appends it to the tail of the list, with the refcount set to 1.
+ dmabufmgr_validate_put()
Convenience function to unref and free a dmabufmgr_validate structure. However, if it's used for custom callback signalling, a custom function should be implemented.
+ dmabufmgr_reserve_buffers()
This function takes a linked list of dmabufmgr_validate's. Each one requires the following members to be set by the caller:
- validate->head, the list head
- validate->bo, must be set to the dma-buf to reserve
- validate->shared, set to true if opened in shared mode
- validate->priv, can be used by the caller to identify this buffer
On successful completion, this function will set the following members:
- validate->num_fences, the number of valid fences to wait on before this buffer can be accessed; this can be 0
- validate->fences[0...num_fences-1], the fences to wait on
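Filling in the input list might look like this (a sketch; the allocation, error handling and `my_cookie` are illustrative, and the field names follow the description above):

```c
static int my_reserve_one(struct dma_buf *dmabuf, void *my_cookie,
			  struct list_head *list)
{
	struct dmabufmgr_validate *val;

	val = kzalloc(sizeof(*val), GFP_KERNEL);
	if (!val)
		return -ENOMEM;

	val->bo = dmabuf;	/* the dma-buf to reserve */
	val->shared = false;	/* we want exclusive access */
	val->priv = my_cookie;	/* caller's handle for this buffer */
	list_add_tail(&val->head, list);

	/* On success, val->num_fences / val->fences[] describe what
	 * must signal before the buffer may be accessed. */
	return dmabufmgr_reserve_buffers(list);
}
```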
+ dmabufmgr_backoff_reservation()
This can be used when the caller encounters an error between reservation and usage. No new fence will be attached and all reservations will be undone without side effects.
+ dmabufmgr_fence_buffer_objects()
Upon successful completion of the work, a new fence will have to be attached. This function releases the old fences and attaches the new one.
+ dmabufmgr_wait_completed_cpu()
A simple cpu-waiter convenience function. Waits until all fences have signalled completion before returning.
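A driver's full submission path, combining the manager calls above, might then be sketched as follows (`my_hw_submit()` and `my_dev` are hypothetical; everything else is from this series):

```c
static int my_submit(struct my_dev *dev, struct list_head *list)
{
	struct dma_fence *fence;
	int ret;

	ret = dmabufmgr_reserve_buffers(list);
	if (ret)
		return ret;

	/* simplest option: let the CPU wait out all prior fences */
	ret = dmabufmgr_wait_completed_cpu(list, true, false);
	if (ret)
		goto backoff;

	fence = dma_fence_create();
	if (IS_ERR(fence)) {
		ret = PTR_ERR(fence);
		goto backoff;
	}

	ret = my_hw_submit(dev, fence);	/* hypothetical hw submission */
	if (ret) {
		dma_fence_put(fence);
		goto backoff;
	}

	/* attach the new fence to every buffer and unreserve them */
	dmabufmgr_fence_buffer_objects(fence, list);
	dma_fence_put(fence);
	return 0;

backoff:
	dmabufmgr_backoff_reservation(list);
	return ret;
}
```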
The rationale for refcounting dmabufmgr_validate lies in its dma_fence_cb wait member. Before calling dma_fence_add_callback() you should increase the refcount on dmabufmgr_validate with dmabufmgr_validate_get(), and on signal completion you should call kref_put(&val->refcount, custom_free_signal). After all callbacks have been added you drop the refcount by 1 as well; when the refcount drops to 0, all callbacks have been signalled and the dmabufmgr_validate has been waited on, so it can be freed. Since this requires atomic spinlocks to unlink the list and signal completion, a deadlock could occur if you tried to call add_callback otherwise, so the refcount is used as a means of preventing this from occurring, by having your custom free function take a device-specific lock, remove the entry from the list, and free the data. The nice/evil part about this is that it also guarantees no memory leaks can occur behind your back. This also allows delaying completion, by making the dmabufmgr_validate list part of the committed reservation.
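In code, that refcount discipline looks roughly like this (a sketch; `custom_free_signal` and the callback are driver-specific, and `val->wait` is the dma_fence_cb member mentioned above):

```c
static int my_signal_cb(struct dma_fence_cb *cb, void *priv)
{
	struct dmabufmgr_validate *val = priv;

	/* one ref per registered callback is dropped on signal */
	kref_put(&val->refcount, custom_free_signal);
	return 0;
}

static void my_arm(struct dmabufmgr_validate *val, struct dma_fence *fence)
{
	dmabufmgr_validate_get(val);	/* ref held for the callback */
	if (dma_fence_add_callback(fence, &val->wait, my_signal_cb, val))
		/* already signaled: callback won't run, drop its ref */
		kref_put(&val->refcount, custom_free_signal);
}

/* After arming all callbacks, drop the caller's own reference as well;
 * whoever drops the refcount to zero frees the entry, under a
 * device-specific lock inside custom_free_signal(). */
```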
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
 drivers/base/Makefile       |    3 -
 drivers/base/dma-buf-mgr.c  |  240 +++++++++++++++++++++++++++++++++++++++
 drivers/base/dma-buf.c      |  113 ++++++++++++++++++++
 drivers/base/dma-fence.c    |    1
 include/linux/dma-buf-mgr.h |  123 ++++++++++++++++++++
 include/linux/dma-buf.h     |   31 ++++++
 include/linux/dma-fence.h   |    2
 7 files changed, 511 insertions(+), 2 deletions(-)
 create mode 100644 drivers/base/dma-buf-mgr.c
 create mode 100644 include/linux/dma-buf-mgr.h
diff --git a/drivers/base/Makefile b/drivers/base/Makefile index 1e7723b..819281a 100644 --- a/drivers/base/Makefile +++ b/drivers/base/Makefile @@ -10,7 +10,8 @@ obj-$(CONFIG_CMA) += dma-contiguous.o obj-y += power/ obj-$(CONFIG_HAS_DMA) += dma-mapping.o obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o -obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o dma-fence.o dma-bikeshed-fence.o +obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o dma-fence.o dma-buf-mgr.o \ + dma-bikeshed-fence.o obj-$(CONFIG_ISA) += isa.o obj-$(CONFIG_FW_LOADER) += firmware_class.o obj-$(CONFIG_NUMA) += node.o diff --git a/drivers/base/dma-buf-mgr.c b/drivers/base/dma-buf-mgr.c new file mode 100644 index 0000000..fbcd631 --- /dev/null +++ b/drivers/base/dma-buf-mgr.c @@ -0,0 +1,240 @@ +/* + * Copyright (C) 2012 Canonical Ltd + * + * Based on ttm_bo.c which bears the following copyright notice, + * but is dual licensed: + * + * Copyright (c) 2006-2009 VMware, Inc., Palo Alto, CA., USA + * All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the + * "Software"), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sub license, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial portions + * of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. 
IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **************************************************************************/
+/*
+ * Authors: Thomas Hellstrom <thellstrom-at-vmware-dot-com>
+ */
+
+
+#include <linux/dma-buf-mgr.h>
+#include <linux/export.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+
+static void dmabufmgr_backoff_reservation_locked(struct list_head *list)
+{
+	struct dmabufmgr_validate *entry;
+
+	list_for_each_entry(entry, list, head) {
+		struct dma_buf *bo = entry->bo;
+		if (!entry->reserved)
+			continue;
+		entry->reserved = false;
+
+		entry->num_fences = 0;
+
+		atomic_set(&bo->reserved, 0);
+		wake_up_all(&bo->event_queue);
+	}
+}
+
+static int
+dmabufmgr_wait_unreserved_locked(struct list_head *list,
+				 struct dma_buf *bo)
+{
+	int ret;
+
+	spin_unlock(&dma_buf_reserve_lock);
+	ret = dma_buf_wait_unreserved(bo, true);
+	spin_lock(&dma_buf_reserve_lock);
+	if (unlikely(ret != 0))
+		dmabufmgr_backoff_reservation_locked(list);
+	return ret;
+}
+
+void
+dmabufmgr_backoff_reservation(struct list_head *list)
+{
+	if (list_empty(list))
+		return;
+
+	spin_lock(&dma_buf_reserve_lock);
+	dmabufmgr_backoff_reservation_locked(list);
+	spin_unlock(&dma_buf_reserve_lock);
+}
+EXPORT_SYMBOL_GPL(dmabufmgr_backoff_reservation);
+
+int
+dmabufmgr_reserve_buffers(struct list_head *list)
+{
+	struct dmabufmgr_validate *entry;
+	int ret;
+	u32 val_seq;
+
+	if (list_empty(list))
+		return 0;
+
+	list_for_each_entry(entry, list, head) {
+		entry->reserved = false;
+		entry->num_fences = 0;
+	}
+
+retry:
+	spin_lock(&dma_buf_reserve_lock);
+	val_seq = atomic_inc_return(&dma_buf_reserve_counter);
+
+	list_for_each_entry(entry, list, head) {
+		struct dma_buf *bo = entry->bo;
+
+retry_this_bo:
+		ret = dma_buf_reserve_locked(bo, true, true, true, val_seq);
+		switch (ret) {
+		case 0:
+			break;
+		case -EBUSY:
+			ret = dmabufmgr_wait_unreserved_locked(list, bo);
+			if (unlikely(ret != 0)) {
+				spin_unlock(&dma_buf_reserve_lock);
+				return ret;
+			}
+			goto retry_this_bo;
+		case -EAGAIN:
+			dmabufmgr_backoff_reservation_locked(list);
+			spin_unlock(&dma_buf_reserve_lock);
+			ret = dma_buf_wait_unreserved(bo, true);
+			if (unlikely(ret != 0))
+				return ret;
+			goto retry;
+		default:
+			dmabufmgr_backoff_reservation_locked(list);
+			spin_unlock(&dma_buf_reserve_lock);
+			return ret;
+		}
+
+		entry->reserved = true;
+
+		if (entry->shared &&
+		    bo->fence_shared_count == DMA_BUF_MAX_SHARED_FENCE) {
+			WARN_ON_ONCE(1);
+			dmabufmgr_backoff_reservation_locked(list);
+			spin_unlock(&dma_buf_reserve_lock);
+			return -EINVAL;
+		}
+
+		if (!entry->shared && bo->fence_shared_count) {
+			entry->num_fences = bo->fence_shared_count;
+			BUILD_BUG_ON(sizeof(entry->fences) != sizeof(bo->fence_shared));
+			memcpy(entry->fences, bo->fence_shared, sizeof(bo->fence_shared));
+		} else if (bo->fence_excl) {
+			entry->num_fences = 1;
+			entry->fences[0] = bo->fence_excl;
+		} else
+			entry->num_fences = 0;
+	}
+	spin_unlock(&dma_buf_reserve_lock);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(dmabufmgr_reserve_buffers);
+
+static int
+dmabufmgr_wait_single(struct dmabufmgr_validate *val, bool intr, bool lazy,
+		      unsigned long timeout)
+{
+	int i, ret = 0;
+
+	for (i = 0; i < val->num_fences && !ret; i++)
+		ret = dma_fence_wait(val->fences[i], intr, timeout);
+	return ret;
+}
+
+int
+dmabufmgr_wait_completed_cpu(struct list_head *list, bool intr, bool lazy)
+{
+	struct dmabufmgr_validate *entry;
+	unsigned long timeout = jiffies + 4 * HZ;
+	int ret;
+
+	list_for_each_entry(entry, list, head) {
+		ret = dmabufmgr_wait_single(entry, intr, lazy, timeout);
+		if (ret && ret != -ERESTARTSYS)
+			pr_err("waiting returns %i\n", ret);
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+EXPORT_SYMBOL_GPL(dmabufmgr_wait_completed_cpu);
+
+void
+dmabufmgr_fence_buffer_objects(struct dma_fence *fence, struct list_head *list)
+{
+	struct dmabufmgr_validate *entry;
+	struct dma_buf *bo;
+
+	if (list_empty(list) || WARN_ON(!fence))
+		return;
+
+	/* Until deferred fput hits mainline, release old things here */
+	list_for_each_entry(entry, list, head) {
+		bo = entry->bo;
+
+		if (!entry->shared) {
+			int i;
+			for (i = 0; i < bo->fence_shared_count; ++i) {
+				dma_fence_put(bo->fence_shared[i]);
+				bo->fence_shared[i] = NULL;
+			}
+			bo->fence_shared_count = 0;
+			if (bo->fence_excl) {
+				dma_fence_put(bo->fence_excl);
+				bo->fence_excl = NULL;
+			}
+		}
+
+		entry->reserved = false;
+	}
+
+	spin_lock(&dma_buf_reserve_lock);
+
+	list_for_each_entry(entry, list, head) {
+		bo = entry->bo;
+
+		dma_fence_get(fence);
+		if (entry->shared)
+			bo->fence_shared[bo->fence_shared_count++] = fence;
+		else
+			bo->fence_excl = fence;
+
+		dma_buf_unreserve_locked(bo);
+	}
+
+	spin_unlock(&dma_buf_reserve_lock);
+}
+EXPORT_SYMBOL_GPL(dmabufmgr_fence_buffer_objects);
+
+void dmabufmgr_validate_free(struct kref *ref)
+{
+	struct dmabufmgr_validate *val;
+	val = container_of(ref, struct dmabufmgr_validate, refcount);
+	list_del(&val->head);
+	kfree(val);
+}
+EXPORT_SYMBOL_GPL(dmabufmgr_validate_free);
diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 058d616..602dd77 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -27,12 +27,17 @@
 #include <linux/dma-buf.h>
 #include <linux/anon_inodes.h>
 #include <linux/export.h>
+#include <linux/sched.h>
+
+atomic_t dma_buf_reserve_counter = ATOMIC_INIT(1);
+DEFINE_SPINLOCK(dma_buf_reserve_lock);
 
 static inline int is_dma_buf_file(struct file *);
 
 static int dma_buf_release(struct inode *inode, struct file *file)
 {
 	struct dma_buf *dmabuf;
+	int i;
 
 	if (!is_dma_buf_file(file))
 		return -EINVAL;
@@ -40,6 +45,15 @@ static int dma_buf_release(struct inode *inode, struct file *file)
 	dmabuf = file->private_data;
 
 	dmabuf->ops->release(dmabuf);
+
+	BUG_ON(waitqueue_active(&dmabuf->event_queue));
+	BUG_ON(atomic_read(&dmabuf->reserved));
+
+	if (dmabuf->fence_excl)
+		dma_fence_put(dmabuf->fence_excl);
+	for (i = 0; i < dmabuf->fence_shared_count; ++i)
+		dma_fence_put(dmabuf->fence_shared[i]);
+
 	kfree(dmabuf);
 	return 0;
 }
@@ -119,6 +133,7 @@ struct dma_buf *dma_buf_export(void *priv, const struct dma_buf_ops *ops,
 
 	mutex_init(&dmabuf->lock);
 	INIT_LIST_HEAD(&dmabuf->attachments);
+	init_waitqueue_head(&dmabuf->event_queue);
 
 	return dmabuf;
 }
@@ -567,3 +582,101 @@ void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr)
 	dmabuf->ops->vunmap(dmabuf, vaddr);
 }
 EXPORT_SYMBOL_GPL(dma_buf_vunmap);
+
+int
+dma_buf_reserve_locked(struct dma_buf *dmabuf, bool interruptible,
+		       bool no_wait, bool use_sequence, u32 sequence)
+{
+	int ret;
+
+	while (unlikely(atomic_cmpxchg(&dmabuf->reserved, 0, 1) != 0)) {
+		/**
+		 * Deadlock avoidance for multi-dmabuf reserving.
+		 */
+		if (use_sequence && dmabuf->seq_valid) {
+			/**
+			 * We've already reserved this one.
+			 */
+			if (unlikely(sequence == dmabuf->val_seq))
+				return -EDEADLK;
+			/**
+			 * Already reserved by a thread that will not back
+			 * off for us. We need to back off.
+			 */
+			if (unlikely(sequence - dmabuf->val_seq < (1 << 31)))
+				return -EAGAIN;
+		}
+
+		if (no_wait)
+			return -EBUSY;
+
+		spin_unlock(&dma_buf_reserve_lock);
+		ret = dma_buf_wait_unreserved(dmabuf, interruptible);
+		spin_lock(&dma_buf_reserve_lock);
+
+		if (unlikely(ret))
+			return ret;
+	}
+
+	if (use_sequence) {
+		/**
+		 * Wake up waiters that may need to recheck for deadlock,
+		 * if we decreased the sequence number.
+		 */
+		if (unlikely((dmabuf->val_seq - sequence < (1 << 31))
+			     || !dmabuf->seq_valid))
+			wake_up_all(&dmabuf->event_queue);
+
+		dmabuf->val_seq = sequence;
+		dmabuf->seq_valid = true;
+	} else {
+		dmabuf->seq_valid = false;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(dma_buf_reserve_locked);
+
+int
+dma_buf_reserve(struct dma_buf *dmabuf, bool interruptible, bool no_wait,
+		bool use_sequence, u32 sequence)
+{
+	int ret;
+
+	spin_lock(&dma_buf_reserve_lock);
+	ret = dma_buf_reserve_locked(dmabuf, interruptible, no_wait,
+				     use_sequence, sequence);
+	spin_unlock(&dma_buf_reserve_lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(dma_buf_reserve);
+
+int
+dma_buf_wait_unreserved(struct dma_buf *dmabuf, bool interruptible)
+{
+	if (interruptible) {
+		return wait_event_interruptible(dmabuf->event_queue,
+						atomic_read(&dmabuf->reserved) == 0);
+	} else {
+		wait_event(dmabuf->event_queue,
+			   atomic_read(&dmabuf->reserved) == 0);
+		return 0;
+	}
+}
+EXPORT_SYMBOL_GPL(dma_buf_wait_unreserved);
+
+void dma_buf_unreserve_locked(struct dma_buf *dmabuf)
+{
+	atomic_set(&dmabuf->reserved, 0);
+	wake_up_all(&dmabuf->event_queue);
+}
+EXPORT_SYMBOL_GPL(dma_buf_unreserve_locked);
+
+void dma_buf_unreserve(struct dma_buf *dmabuf)
+{
+	spin_lock(&dma_buf_reserve_lock);
+	dma_buf_unreserve_locked(dmabuf);
+	spin_unlock(&dma_buf_reserve_lock);
+}
+EXPORT_SYMBOL_GPL(dma_buf_unreserve);
diff --git a/drivers/base/dma-fence.c b/drivers/base/dma-fence.c
index c280ee7..f104ad5 100644
--- a/drivers/base/dma-fence.c
+++ b/drivers/base/dma-fence.c
@@ -20,6 +20,7 @@
 #include <linux/slab.h>
 #include <linux/sched.h>
 #include <linux/export.h>
+#include <linux/dma-buf.h>
 #include <linux/dma-fence.h>
 
 /**
diff --git a/include/linux/dma-buf-mgr.h b/include/linux/dma-buf-mgr.h
new file mode 100644
index 0000000..386b874
--- /dev/null
+++ b/include/linux/dma-buf-mgr.h
@@ -0,0 +1,123 @@
+/*
+ * Header file for dma buffer sharing framework.
+ *
+ * Copyright(C) 2011 Linaro Limited. All rights reserved.
+ * Author: Sumit Semwal <sumit.semwal@ti.com>
+ *
+ * Many thanks to linaro-mm-sig list, and specially
+ * Arnd Bergmann <arnd@arndb.de>, Rob Clark <rob@ti.com> and
+ * Daniel Vetter <daniel@ffwll.ch> for their support in creation and
+ * refining of this idea.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __DMA_BUF_MGR_H__
+#define __DMA_BUF_MGR_H__
+
+#include <linux/dma-buf.h>
+#include <linux/list.h>
+
+/** based on ttm_execbuf_util
+ */
+struct dmabufmgr_validate {
+	struct list_head head;
+	struct kref refcount;
+
+	/* internal use: signals if reservation is succesful on this buffer */
+	bool reserved;
+
+	bool shared;
+	struct dma_buf *bo;
+	void *priv;
+
+	unsigned num_fences, num_waits;
+	struct dma_fence *fences[DMA_BUF_MAX_SHARED_FENCE];
+	struct dma_fence_cb wait[DMA_BUF_MAX_SHARED_FENCE];
+};
+
+#ifdef CONFIG_DMA_SHARED_BUFFER
+
+static inline void
+dmabufmgr_validate_init(struct dmabufmgr_validate *val,
+			struct list_head *list, struct dma_buf *bo,
+			void *priv, bool shared)
+{
+	kref_init(&val->refcount);
+	list_add_tail(&val->head, list);
+	val->bo = bo;
+	val->priv = priv;
+	val->shared = shared;
+}
+
+void dmabufmgr_validate_free(struct kref *ref);
+
+static inline struct dmabufmgr_validate *
+dmabufmgr_validate_get(struct dmabufmgr_validate *val)
+{
+	kref_get(&val->refcount);
+	return val;
+}
+
+static inline bool
+dmabufmgr_validate_put(struct dmabufmgr_validate *val)
+{
+	return kref_put(&val->refcount, dmabufmgr_validate_free);
+}
+
+/** reserve a linked list of struct dmabufmgr_validate entries */
+extern int
+dmabufmgr_reserve_buffers(struct list_head *list);
+
+/** Undo reservation */
+extern void
+dmabufmgr_backoff_reservation(struct list_head *list);
+
+/** Commit reservation */
+extern void
+dmabufmgr_fence_buffer_objects(struct dma_fence *fence, struct list_head *list);
+
+/** Wait for completion on cpu
+ * intr: interruptible wait
+ * lazy: try once every tick instead of busywait
+ */
+extern int
+dmabufmgr_wait_completed_cpu(struct list_head *list, bool intr, bool lazy);
+
+#else /* CONFIG_DMA_SHARED_BUFFER */
+
+/** reserve a linked list of struct dmabufmgr_validate entries */
+static inline int
+dmabufmgr_reserve_buffers(struct list_head *list)
+{
+	return list_empty(list) ? 0 : -ENODEV;
+}
+
+/** Undo reservation */
+static inline void
+dmabufmgr_backoff_reservation(struct list_head *list)
+{}
+
+/** Commit reservation */
+static inline void
+dmabufmgr_fence_buffer_objects(struct dma_fence *fence, struct list_head *list)
+{}
+
+static inline int
+dmabufmgr_wait_completed_cpu(struct list_head *list, bool intr, bool lazy)
+{
+	return list_empty(list) ? 0 : -ENODEV;
+}
+
+#endif /* CONFIG_DMA_SHARED_BUFFER */
+
+#endif /* __DMA_BUF_MGR_H__ */
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 9533b9b..d3d76e5 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -35,6 +35,13 @@ struct device;
 struct dma_buf;
 struct dma_buf_attachment;
 
+#include <linux/dma-fence.h>
+
+extern atomic_t dma_buf_reserve_counter;
+extern spinlock_t dma_buf_reserve_lock;
+
+#define DMA_BUF_MAX_SHARED_FENCE 8
+
 /**
  * struct dma_buf_ops - operations possible on struct dma_buf
  * @attach: [optional] allows different devices to 'attach' themselves to the
@@ -113,6 +120,8 @@ struct dma_buf_ops {
  * @attachments: list of dma_buf_attachment that denotes all devices attached.
  * @ops: dma_buf_ops associated with this buffer object.
  * @priv: exporter specific private data for this buffer object.
+ * @bufmgr_entry: used by dmabufmgr
+ * @bufdev: used by dmabufmgr
  */
 struct dma_buf {
 	size_t size;
@@ -122,6 +131,18 @@ struct dma_buf {
 	/* mutex to serialize list manipulation and attach/detach */
 	struct mutex lock;
 	void *priv;
+
+	/** event queue for waking up when this dmabuf becomes unreserved */
+	wait_queue_head_t event_queue;
+
+	atomic_t reserved;
+
+	/** These require dma_buf_reserve to be called before modification */
+	bool seq_valid;
+	u32 val_seq;
+	struct dma_fence *fence_excl;
+	struct dma_fence *fence_shared[DMA_BUF_MAX_SHARED_FENCE];
+	u32 fence_shared_count;
 };
 
 /**
@@ -188,6 +209,14 @@ int dma_buf_mmap(struct dma_buf *, struct vm_area_struct *,
 		 unsigned long);
 void *dma_buf_vmap(struct dma_buf *);
 void dma_buf_vunmap(struct dma_buf *, void *vaddr);
+int dma_buf_reserve_locked(struct dma_buf *, bool intr, bool no_wait,
+			   bool use_seq, u32 seq);
+int dma_buf_reserve(struct dma_buf *, bool intr, bool no_wait,
+		    bool use_seq, u32 seq);
+int dma_buf_wait_unreserved(struct dma_buf *, bool interruptible);
+void dma_buf_unreserve_locked(struct dma_buf *);
+void dma_buf_unreserve(struct dma_buf *);
+
 #else
 
 static inline struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
@@ -300,6 +329,8 @@ static inline void *dma_buf_vmap(struct dma_buf *dmabuf)
 static inline void dma_buf_vunmap(struct dma_buf *dmabuf, void *vaddr)
 {
 }
+
+// TODO
 #endif /* CONFIG_DMA_SHARED_BUFFER */
 
 #endif /* __DMA_BUF_H__ */
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 70d12c0..ff50ea2 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -24,7 +24,7 @@
 #include <linux/list.h>
 #include <linux/wait.h>
 #include <linux/list.h>
-#include <linux/dma-buf.h>
+#include <linux/kref.h>
 
 struct dma_fence;
 struct dma_fence_ops;
On 07-08-12 19:53, Maarten Lankhorst wrote:
> A dma-fence can be attached to a buffer which is being filled or consumed by hw, to allow userspace to pass the buffer without waiting to another device. For example, userspace can call page_flip ioctl to display the next frame of graphics after kicking the GPU but while the GPU is still rendering. The display device sharing the buffer with the GPU would attach a callback to get notified when the GPU's rendering-complete IRQ fires, to update the scan-out address of the display, without having to wake up userspace.
I implemented this for intel and debugged it with intel <-> nouveau interaction. Unfortunately the nouveau patches aren't ready at this point, but the git repo I'm using is available at:
http://cgit.freedesktop.org/~mlankhorst/linux/
It has the patch series and a sample implementation for intel, based on drm-intel-next tree.
I tried to keep it as deadlock- and race-condition-free as possible, but the locking gets complicated enough that something may have slipped through regardless.

The locking in i915_gem_reset_requests especially is screwed up. It shows what a real PITA it is to abort callbacks prematurely while keeping everything stable. As such, aborting requests should only be done in exceptional circumstances; in this case the hardware died and things are already locked up anyhow..
~Maarten
Hi Maarten,
On 8 August 2012 00:17, Maarten Lankhorst <maarten.lankhorst@canonical.com> wrote:
> On 07-08-12 19:53, Maarten Lankhorst wrote:
>> A dma-fence can be attached to a buffer which is being filled or consumed by hw, to allow userspace to pass the buffer without waiting to another device. For example, userspace can call page_flip ioctl to display the next frame of graphics after kicking the GPU but while the GPU is still rendering. The display device sharing the buffer with the GPU would attach a callback to get notified when the GPU's rendering-complete IRQ fires, to update the scan-out address of the display, without having to wake up userspace.
Thanks for this patchset; could you please also fill up Documentation/dma-buf-sharing.txt to include the relevant bits?
We've tried to make sure the corresponding Documentation is kept up-to-date as the framework has grown and new features have been added to it, and I think features as important as dma-fence and dmabufmgr warrant a healthy update.
> I implemented this for intel and debugged it with intel <-> nouveau interaction. Unfortunately the nouveau patches aren't ready at this point, but the git repo I'm using is available at:
>
> http://cgit.freedesktop.org/~mlankhorst/linux/
>
> It has the patch series and a sample implementation for intel, based on drm-intel-next tree.
>
> I tried to keep it deadlock and race condition free as much as possible, but locking gets complicated enough that if I'm unlucky something might have slipped through regardless.
>
> Especially the locking in i915_gem_reset_requests, is screwed up. This shows what a real PITA it is to abort callbacks prematurely while keeping everything stable. As such, aborting requests should only be done in exceptional circumstances, in this case hardware died and things are already locked up anyhow..
>
> ~Maarten
Hey Sumit,
On 08-08-12 08:35, Sumit Semwal wrote:
> Hi Maarten,
>
> On 8 August 2012 00:17, Maarten Lankhorst <maarten.lankhorst@canonical.com> wrote:
>> On 07-08-12 19:53, Maarten Lankhorst wrote:
>>> A dma-fence can be attached to a buffer which is being filled or consumed by hw, to allow userspace to pass the buffer without waiting to another device. For example, userspace can call page_flip ioctl to display the next frame of graphics after kicking the GPU but while the GPU is still rendering. The display device sharing the buffer with the GPU would attach a callback to get notified when the GPU's rendering-complete IRQ fires, to update the scan-out address of the display, without having to wake up userspace.
>
> Thanks for this patchset; Could you please also fill up Documentation/dma-buf-sharing.txt, to include the relevant bits?
>
> We've tried to make sure the Documentation corresponding is kept up-to-date as the framework has grown, and new features are added to it - and I think features as important as dma-fence and dmabufmgr do warrant a healthy update.
OK, I'll clean it up and add the documentation. One other question: if code that requires dma-buf needs to select CONFIG_DMA_SHARED_BUFFER, why does dma-buf.h have fallbacks for !CONFIG_DMA_SHARED_BUFFER? This seems weird; would you have any objection if I removed them?
~Maarten
linaro-mm-sig@lists.linaro.org