Hi Jonathan,
This is v2 of my patchset, which introduces a new userspace interface based on DMABUF objects to complement the fileio API, and adds write() support to the existing fileio API.
Changes since v1:
- The patches that were merged in v1 have been (obviously) dropped from this patchset.
- The patch that was setting the write-combine cache setting has been dropped as well, as it was simply not useful.
- [01/12]:
  * Only remove the outgoing queue, and keep the incoming queue, as we want the buffer to start streaming data as soon as it is enabled.
  * Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally the same as IIO_BLOCK_STATE_DONE.
- [02/12]:
  * Fix block->state not being reset in iio_dma_buffer_request_update() for output buffers.
  * Only update block->bytes_used once and add a comment about why we update it.
  * Add a comment about why we're setting a different state for output buffers in iio_dma_buffer_request_update().
  * Remove useless cast to bool (!!) in iio_dma_buffer_io().
- [05/12]: Only allow the new IOCTLs on the buffer FD created with IIO_BUFFER_GET_FD_IOCTL().
- [12/12]:
  * Explicitly state that the new interface is optional and is not implemented by all drivers.
  * The IOCTLs can now only be called on the buffer FD returned by IIO_BUFFER_GET_FD_IOCTL.
  * Move the page up a bit in the index since it is core stuff and not driver-specific.
The patches not listed here have not been modified since v1.
Cheers, -Paul
Alexandru Ardelean (1):
  iio: buffer-dma: split iio_dma_buffer_fileio_free() function

Paul Cercueil (11):
  iio: buffer-dma: Get rid of outgoing queue
  iio: buffer-dma: Enable buffer write support
  iio: buffer-dmaengine: Support specifying buffer direction
  iio: buffer-dmaengine: Enable write support
  iio: core: Add new DMABUF interface infrastructure
  iio: buffer-dma: Use DMABUFs instead of custom solution
  iio: buffer-dma: Implement new DMABUF based userspace API
  iio: buffer-dmaengine: Support new DMABUF based userspace API
  iio: core: Add support for cyclic buffers
  iio: buffer-dmaengine: Add support for cyclic buffers
  Documentation: iio: Document high-speed DMABUF based API
 Documentation/driver-api/dma-buf.rst          |   2 +
 Documentation/iio/dmabuf_api.rst              |  94 +++
 Documentation/iio/index.rst                   |   2 +
 drivers/iio/adc/adi-axi-adc.c                 |   3 +-
 drivers/iio/buffer/industrialio-buffer-dma.c  | 610 ++++++++++++++----
 .../buffer/industrialio-buffer-dmaengine.c    |  42 +-
 drivers/iio/industrialio-buffer.c             |  60 ++
 include/linux/iio/buffer-dma.h                |  38 +-
 include/linux/iio/buffer-dmaengine.h          |   5 +-
 include/linux/iio/buffer_impl.h               |   8 +
 include/uapi/linux/iio/buffer.h               |  30 +
 11 files changed, 749 insertions(+), 145 deletions(-)
 create mode 100644 Documentation/iio/dmabuf_api.rst
The buffer-dma code was using two queues, incoming and outgoing, to manage the state of the blocks in use.
While this totally works, it adds some complexity to the code, especially since the code only manages 2 blocks. It is much easier to just check each block's state manually, and keep a counter for the next block to dequeue.
Since the new DMABUF based API wouldn't use the outgoing queue anyway, getting rid of it now makes the upcoming changes simpler.
With this change, the IIO_BLOCK_STATE_DEQUEUED is now useless, and can be removed.
v2:
- Only remove the outgoing queue, and keep the incoming queue, as we want the buffer to start streaming data as soon as it is enabled.
- Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally the same as IIO_BLOCK_STATE_DONE.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
---
 drivers/iio/buffer/industrialio-buffer-dma.c | 44 ++++++++++----------
 include/linux/iio/buffer-dma.h               |  7 ++--
 2 files changed, 26 insertions(+), 25 deletions(-)
diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
index d348af8b9705..1fc91467d1aa 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -179,7 +179,7 @@ static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
 	}
 
 	block->size = size;
-	block->state = IIO_BLOCK_STATE_DEQUEUED;
+	block->state = IIO_BLOCK_STATE_DONE;
 	block->queue = queue;
 	INIT_LIST_HEAD(&block->head);
 	kref_init(&block->kref);
@@ -191,16 +191,8 @@ static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
 
 static void _iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
 {
-	struct iio_dma_buffer_queue *queue = block->queue;
-
-	/*
-	 * The buffer has already been freed by the application, just drop the
-	 * reference.
-	 */
-	if (block->state != IIO_BLOCK_STATE_DEAD) {
+	if (block->state != IIO_BLOCK_STATE_DEAD)
 		block->state = IIO_BLOCK_STATE_DONE;
-		list_add_tail(&block->head, &queue->outgoing);
-	}
 }
 
 /**
@@ -261,7 +253,6 @@ static bool iio_dma_block_reusable(struct iio_dma_buffer_block *block)
 	 * not support abort and has not given back the block yet.
 	 */
 	switch (block->state) {
-	case IIO_BLOCK_STATE_DEQUEUED:
 	case IIO_BLOCK_STATE_QUEUED:
 	case IIO_BLOCK_STATE_DONE:
 		return true;
@@ -317,7 +308,6 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
 	 * dead. This means we can reset the lists without having to fear
 	 * corrution.
 	 */
-	INIT_LIST_HEAD(&queue->outgoing);
 	spin_unlock_irq(&queue->list_lock);
 
 	INIT_LIST_HEAD(&queue->incoming);
@@ -456,14 +446,20 @@ static struct iio_dma_buffer_block *iio_dma_buffer_dequeue(
 	struct iio_dma_buffer_queue *queue)
 {
 	struct iio_dma_buffer_block *block;
+	unsigned int idx;
 
 	spin_lock_irq(&queue->list_lock);
-	block = list_first_entry_or_null(&queue->outgoing, struct
-		iio_dma_buffer_block, head);
-	if (block != NULL) {
-		list_del(&block->head);
-		block->state = IIO_BLOCK_STATE_DEQUEUED;
+
+	idx = queue->fileio.next_dequeue;
+	block = queue->fileio.blocks[idx];
+
+	if (block->state == IIO_BLOCK_STATE_DONE) {
+		idx = (idx + 1) % ARRAY_SIZE(queue->fileio.blocks);
+		queue->fileio.next_dequeue = idx;
+	} else {
+		block = NULL;
 	}
+
 	spin_unlock_irq(&queue->list_lock);
 
 	return block;
@@ -539,6 +535,7 @@ size_t iio_dma_buffer_data_available(struct iio_buffer *buf)
 	struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buf);
 	struct iio_dma_buffer_block *block;
 	size_t data_available = 0;
+	unsigned int i;
 
 	/*
 	 * For counting the available bytes we'll use the size of the block not
@@ -552,8 +549,15 @@ size_t iio_dma_buffer_data_available(struct iio_buffer *buf)
 		data_available += queue->fileio.active_block->size;
 
 	spin_lock_irq(&queue->list_lock);
-	list_for_each_entry(block, &queue->outgoing, head)
-		data_available += block->size;
+
+	for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
+		block = queue->fileio.blocks[i];
+
+		if (block != queue->fileio.active_block
+		    && block->state == IIO_BLOCK_STATE_DONE)
+			data_available += block->size;
+	}
+
 	spin_unlock_irq(&queue->list_lock);
 	mutex_unlock(&queue->lock);
 
@@ -617,7 +621,6 @@ int iio_dma_buffer_init(struct iio_dma_buffer_queue *queue,
 	queue->ops = ops;
 
 	INIT_LIST_HEAD(&queue->incoming);
-	INIT_LIST_HEAD(&queue->outgoing);
 
 	mutex_init(&queue->lock);
 	spin_lock_init(&queue->list_lock);
@@ -645,7 +648,6 @@ void iio_dma_buffer_exit(struct iio_dma_buffer_queue *queue)
 			continue;
 		queue->fileio.blocks[i]->state = IIO_BLOCK_STATE_DEAD;
 	}
-	INIT_LIST_HEAD(&queue->outgoing);
 	spin_unlock_irq(&queue->list_lock);
 
 	INIT_LIST_HEAD(&queue->incoming);
diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h
index 6564bdcdac66..18d3702fa95d 100644
--- a/include/linux/iio/buffer-dma.h
+++ b/include/linux/iio/buffer-dma.h
@@ -19,14 +19,12 @@ struct device;
 
 /**
  * enum iio_block_state - State of a struct iio_dma_buffer_block
- * @IIO_BLOCK_STATE_DEQUEUED: Block is not queued
  * @IIO_BLOCK_STATE_QUEUED: Block is on the incoming queue
  * @IIO_BLOCK_STATE_ACTIVE: Block is currently being processed by the DMA
  * @IIO_BLOCK_STATE_DONE: Block is on the outgoing queue
  * @IIO_BLOCK_STATE_DEAD: Block has been marked as to be freed
  */
 enum iio_block_state {
-	IIO_BLOCK_STATE_DEQUEUED,
 	IIO_BLOCK_STATE_QUEUED,
 	IIO_BLOCK_STATE_ACTIVE,
 	IIO_BLOCK_STATE_DONE,
@@ -73,12 +71,15 @@ struct iio_dma_buffer_block {
  * @active_block: Block being used in read()
  * @pos: Read offset in the active block
  * @block_size: Size of each block
+ * @next_dequeue: index of next block that will be dequeued
  */
 struct iio_dma_buffer_queue_fileio {
	struct iio_dma_buffer_block *blocks[2];
	struct iio_dma_buffer_block *active_block;
	size_t pos;
	size_t block_size;
+
+	unsigned int next_dequeue;
 };
 
 /**
@@ -93,7 +94,6 @@ struct iio_dma_buffer_queue_fileio {
  * list and typically also a list of active blocks in the part that handles
  * the DMA controller
  * @incoming: List of buffers on the incoming queue
- * @outgoing: List of buffers on the outgoing queue
  * @active: Whether the buffer is currently active
  * @fileio: FileIO state
  */
@@ -105,7 +105,6 @@ struct iio_dma_buffer_queue {
	struct mutex lock;
	spinlock_t list_lock;
	struct list_head incoming;
-	struct list_head outgoing;
 
	bool active;
On Mon, 7 Feb 2022 12:59:22 +0000 Paul Cercueil paul@crapouillou.net wrote:
The buffer-dma code was using two queues, incoming and outgoing, to manage the state of the blocks in use.
While this totally works, it adds some complexity to the code, especially since the code only manages 2 blocks. It is much easier to just check each block's state manually, and keep a counter for the next block to dequeue.
Since the new DMABUF based API wouldn't use the outgoing queue anyway, getting rid of it now makes the upcoming changes simpler.
With this change, the IIO_BLOCK_STATE_DEQUEUED is now useless, and can be removed.
v2:
- Only remove the outgoing queue, and keep the incoming queue, as we want the buffer to start streaming data as soon as it is enabled.
- Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally the same as IIO_BLOCK_STATE_DONE.
Signed-off-by: Paul Cercueil paul@crapouillou.net
Trivial process thing but change log should be here, not above as we don't want it to end up in the main git log.
 drivers/iio/buffer/industrialio-buffer-dma.c | 44 ++++++++++----------
 include/linux/iio/buffer-dma.h               |  7 ++--
 2 files changed, 26 insertions(+), 25 deletions(-)
Hi Jonathan,
On Sun, Feb 13 2022 at 18:57:40 +0000, Jonathan Cameron jic23@kernel.org wrote:
On Mon, 7 Feb 2022 12:59:22 +0000 Paul Cercueil paul@crapouillou.net wrote:
The buffer-dma code was using two queues, incoming and outgoing, to manage the state of the blocks in use.
While this totally works, it adds some complexity to the code, especially since the code only manages 2 blocks. It is much easier to just check each block's state manually, and keep a counter for the next block to dequeue.
Since the new DMABUF based API wouldn't use the outgoing queue anyway, getting rid of it now makes the upcoming changes simpler.
With this change, the IIO_BLOCK_STATE_DEQUEUED is now useless, and can be removed.
v2:
- Only remove the outgoing queue, and keep the incoming queue, as we want the buffer to start streaming data as soon as it is enabled.
- Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally the same as IIO_BLOCK_STATE_DONE.
Signed-off-by: Paul Cercueil paul@crapouillou.net
Trivial process thing but change log should be here, not above as we don't want it to end up in the main git log.
I'm kind of used to doing this now; it's the policy for sending patches to the DRM tree. I like it because "git notes" disappear after rebases, which is a pain. At least this way I don't lose the changelog.
But okay, I'll change it for v3, if there's a v3.
Cheers, -Paul
 drivers/iio/buffer/industrialio-buffer-dma.c | 44 ++++++++++----------
 include/linux/iio/buffer-dma.h               |  7 ++--
 2 files changed, 26 insertions(+), 25 deletions(-)
On Mon, 7 Feb 2022 12:59:22 +0000 Paul Cercueil paul@crapouillou.net wrote:
The buffer-dma code was using two queues, incoming and outgoing, to manage the state of the blocks in use.
While this totally works, it adds some complexity to the code, especially since the code only manages 2 blocks. It is much easier to just check each block's state manually, and keep a counter for the next block to dequeue.
Since the new DMABUF based API wouldn't use the outgoing queue anyway, getting rid of it now makes the upcoming changes simpler.
With this change, the IIO_BLOCK_STATE_DEQUEUED is now useless, and can be removed.
v2:
- Only remove the outgoing queue, and keep the incoming queue, as we want the buffer to start streaming data as soon as it is enabled.
- Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally the same as IIO_BLOCK_STATE_DONE.
Signed-off-by: Paul Cercueil paul@crapouillou.net
Hi Paul,
In the interests of moving things forward and simplifying what people need to look at: this change looks good to me on its own.

Lars had some comments on v1. Lars, could you take a look at this and verify that this version addresses the points you raised? (I think it does, but they were your comments, so you are the better judge.)
Thanks,
Jonathan
 drivers/iio/buffer/industrialio-buffer-dma.c | 44 ++++++++++----------
 include/linux/iio/buffer-dma.h               |  7 ++--
 2 files changed, 26 insertions(+), 25 deletions(-)
Adding write support to the buffer-dma code is easy - the write() function basically needs to do the exact same thing as the read() function: dequeue a block, read or write the data, enqueue the block when entirely processed.
Therefore, the iio_buffer_dma_read() and the new iio_buffer_dma_write() now both call a function iio_buffer_dma_io(), which will perform this task.
The .space_available() callback can return the exact same value as the .data_available() callback for input buffers, since in both cases we count the exact same thing (the number of bytes in each available block).
Note that we preemptively reset block->bytes_used to the buffer's size in iio_dma_buffer_request_update(), as in the future the iio_dma_buffer_enqueue() function won't reset it.
v2:
- Fix block->state not being reset in iio_dma_buffer_request_update() for output buffers.
- Only update block->bytes_used once and add a comment about why we update it.
- Add a comment about why we're setting a different state for output buffers in iio_dma_buffer_request_update().
- Remove useless cast to bool (!!) in iio_dma_buffer_io().
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Alexandru Ardelean <ardeleanalex@gmail.com>
---
 drivers/iio/buffer/industrialio-buffer-dma.c | 88 ++++++++++++++++----
 include/linux/iio/buffer-dma.h               |  7 ++
 2 files changed, 79 insertions(+), 16 deletions(-)
diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
index 1fc91467d1aa..a9f1b673374f 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -195,6 +195,18 @@ static void _iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
 		block->state = IIO_BLOCK_STATE_DONE;
 }
 
+static void iio_dma_buffer_queue_wake(struct iio_dma_buffer_queue *queue)
+{
+	__poll_t flags;
+
+	if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN)
+		flags = EPOLLIN | EPOLLRDNORM;
+	else
+		flags = EPOLLOUT | EPOLLWRNORM;
+
+	wake_up_interruptible_poll(&queue->buffer.pollq, flags);
+}
+
 /**
  * iio_dma_buffer_block_done() - Indicate that a block has been completed
  * @block: The completed block
@@ -212,7 +224,7 @@ void iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
 	spin_unlock_irqrestore(&queue->list_lock, flags);
 
 	iio_buffer_block_put_atomic(block);
-	wake_up_interruptible_poll(&queue->buffer.pollq, EPOLLIN | EPOLLRDNORM);
+	iio_dma_buffer_queue_wake(queue);
 }
 EXPORT_SYMBOL_GPL(iio_dma_buffer_block_done);
 
@@ -241,7 +253,7 @@ void iio_dma_buffer_block_list_abort(struct iio_dma_buffer_queue *queue,
 	}
 	spin_unlock_irqrestore(&queue->list_lock, flags);
 
-	wake_up_interruptible_poll(&queue->buffer.pollq, EPOLLIN | EPOLLRDNORM);
+	iio_dma_buffer_queue_wake(queue);
 }
 EXPORT_SYMBOL_GPL(iio_dma_buffer_block_list_abort);
 
@@ -335,8 +347,24 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
 			queue->fileio.blocks[i] = block;
 		}
 
-		block->state = IIO_BLOCK_STATE_QUEUED;
-		list_add_tail(&block->head, &queue->incoming);
+		/*
+		 * block->bytes_used may have been modified previously, e.g. by
+		 * iio_dma_buffer_block_list_abort(). Reset it here to the
+		 * block's size so that iio_dma_buffer_io() will work.
+		 */
+		block->bytes_used = block->size;
+
+		/*
+		 * If it's an input buffer, mark the block as queued, and
+		 * iio_dma_buffer_enable() will submit it. Otherwise mark it as
+		 * done, which means it's ready to be dequeued.
+		 */
+		if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN) {
+			block->state = IIO_BLOCK_STATE_QUEUED;
+			list_add_tail(&block->head, &queue->incoming);
+		} else {
+			block->state = IIO_BLOCK_STATE_DONE;
+		}
 	}
 
 out_unlock:
@@ -465,20 +493,12 @@ static struct iio_dma_buffer_block *iio_dma_buffer_dequeue(
 	return block;
 }
 
-/**
- * iio_dma_buffer_read() - DMA buffer read callback
- * @buffer: Buffer to read from
- * @n: Number of bytes to read
- * @user_buffer: Userspace buffer to copy the data to
- *
- * Should be used as the read callback for iio_buffer_access_ops
- * struct for DMA buffers.
- */
-int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
-	char __user *user_buffer)
+static int iio_dma_buffer_io(struct iio_buffer *buffer,
+	size_t n, char __user *user_buffer, bool is_write)
 {
 	struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buffer);
 	struct iio_dma_buffer_block *block;
+	void *addr;
 	int ret;
 
 	if (n < buffer->bytes_per_datum)
@@ -501,8 +521,13 @@ int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
 	n = rounddown(n, buffer->bytes_per_datum);
 	if (n > block->bytes_used - queue->fileio.pos)
 		n = block->bytes_used - queue->fileio.pos;
+	addr = block->vaddr + queue->fileio.pos;
 
-	if (copy_to_user(user_buffer, block->vaddr + queue->fileio.pos, n)) {
+	if (is_write)
+		ret = copy_from_user(addr, user_buffer, n);
+	else
+		ret = copy_to_user(user_buffer, addr, n);
+	if (ret) {
 		ret = -EFAULT;
 		goto out_unlock;
 	}
@@ -521,8 +546,39 @@ int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
 
 	return ret;
 }
+
+/**
+ * iio_dma_buffer_read() - DMA buffer read callback
+ * @buffer: Buffer to read from
+ * @n: Number of bytes to read
+ * @user_buffer: Userspace buffer to copy the data to
+ *
+ * Should be used as the read callback for iio_buffer_access_ops
+ * struct for DMA buffers.
+ */
+int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
+	char __user *user_buffer)
+{
+	return iio_dma_buffer_io(buffer, n, user_buffer, false);
+}
 EXPORT_SYMBOL_GPL(iio_dma_buffer_read);
 
+/**
+ * iio_dma_buffer_write() - DMA buffer write callback
+ * @buffer: Buffer to write to
+ * @n: Number of bytes to write
+ * @user_buffer: Userspace buffer to copy the data from
+ *
+ * Should be used as the write callback for iio_buffer_access_ops
+ * struct for DMA buffers.
+ */
+int iio_dma_buffer_write(struct iio_buffer *buffer, size_t n,
+	const char __user *user_buffer)
+{
+	return iio_dma_buffer_io(buffer, n, (__force char *)user_buffer, true);
+}
+EXPORT_SYMBOL_GPL(iio_dma_buffer_write);
+
 /**
  * iio_dma_buffer_data_available() - DMA buffer data_available callback
  * @buf: Buffer to check for data availability
diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h
index 18d3702fa95d..490b93f76fa8 100644
--- a/include/linux/iio/buffer-dma.h
+++ b/include/linux/iio/buffer-dma.h
@@ -132,6 +132,8 @@ int iio_dma_buffer_disable(struct iio_buffer *buffer,
 	struct iio_dev *indio_dev);
 int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
 	char __user *user_buffer);
+int iio_dma_buffer_write(struct iio_buffer *buffer, size_t n,
+	const char __user *user_buffer);
 size_t iio_dma_buffer_data_available(struct iio_buffer *buffer);
 int iio_dma_buffer_set_bytes_per_datum(struct iio_buffer *buffer, size_t bpd);
 int iio_dma_buffer_set_length(struct iio_buffer *buffer, unsigned int length);
@@ -142,4 +144,9 @@ int iio_dma_buffer_init(struct iio_dma_buffer_queue *queue,
 void iio_dma_buffer_exit(struct iio_dma_buffer_queue *queue);
 void iio_dma_buffer_release(struct iio_dma_buffer_queue *queue);
 
+static inline size_t iio_dma_buffer_space_available(struct iio_buffer *buffer)
+{
+	return iio_dma_buffer_data_available(buffer);
+}
+
 #endif
On Mon, 7 Feb 2022 12:59:23 +0000 Paul Cercueil paul@crapouillou.net wrote:
Adding write support to the buffer-dma code is easy - the write() function basically needs to do the exact same thing as the read() function: dequeue a block, read or write the data, enqueue the block when entirely processed.
Therefore, the iio_buffer_dma_read() and the new iio_buffer_dma_write() now both call a function iio_buffer_dma_io(), which will perform this task.
The .space_available() callback can return the exact same value as the .data_available() callback for input buffers, since in both cases we count the exact same thing (the number of bytes in each available block).
Note that we preemptively reset block->bytes_used to the buffer's size in iio_dma_buffer_request_update(), as in the future the iio_dma_buffer_enqueue() function won't reset it.
v2:
- Fix block->state not being reset in iio_dma_buffer_request_update() for output buffers.
- Only update block->bytes_used once and add a comment about why we update it.
- Add a comment about why we're setting a different state for output buffers in iio_dma_buffer_request_update().
- Remove useless cast to bool (!!) in iio_dma_buffer_io().
Signed-off-by: Paul Cercueil paul@crapouillou.net Reviewed-by: Alexandru Ardelean ardeleanalex@gmail.com
One comment inline.
I'd be tempted to queue this up with that fixed, but do we have any users? Even though it's trivial, I'm not that keen on having code upstream well in advance of it being used.
Thanks,
Jonathan
 drivers/iio/buffer/industrialio-buffer-dma.c | 88 ++++++++++++++++----
 include/linux/iio/buffer-dma.h               |  7 ++
 2 files changed, 79 insertions(+), 16 deletions(-)
+/**
+ * iio_dma_buffer_write() - DMA buffer write callback
+ * @buffer: Buffer to write to
+ * @n: Number of bytes to write
+ * @user_buffer: Userspace buffer to copy the data from
+ *
+ * Should be used as the write callback for iio_buffer_access_ops
+ * struct for DMA buffers.
+ */
+int iio_dma_buffer_write(struct iio_buffer *buffer, size_t n,
+	const char __user *user_buffer)
+{
+	return iio_dma_buffer_io(buffer, n, (__force char *)user_buffer, true);
Casting away the const is a little nasty. Perhaps it's worth adding a parameter to iio_dma_buffer_io() so you can have different parameters for the read and write cases, and hence keep the const in place?

	return iio_dma_buffer_io(buffer, n, NULL, user_buffer, true);

and

	return iio_dma_buffer_io(buffer, n, user_buffer, NULL, false);
Hi Jonathan,
On Mon, Mar 28 2022 at 18:24:09 +0100, Jonathan Cameron jic23@kernel.org wrote:
On Mon, 7 Feb 2022 12:59:23 +0000 Paul Cercueil paul@crapouillou.net wrote:
Adding write support to the buffer-dma code is easy - the write() function basically needs to do the exact same thing as the read() function: dequeue a block, read or write the data, enqueue the block when entirely processed.
Therefore, the iio_buffer_dma_read() and the new iio_buffer_dma_write() now both call a function iio_buffer_dma_io(), which will perform this task.
The .space_available() callback can return the exact same value as the .data_available() callback for input buffers, since in both cases we count the exact same thing (the number of bytes in each available block).
Note that we preemptively reset block->bytes_used to the buffer's size in iio_dma_buffer_request_update(), as in the future the iio_dma_buffer_enqueue() function won't reset it.
v2: - Fix block->state not being reset in iio_dma_buffer_request_update() for output buffers. - Only update block->bytes_used once and add a comment about why we update it. - Add a comment about why we're setting a different state for output buffers in iio_dma_buffer_request_update() - Remove useless cast to bool (!!) in iio_dma_buffer_io()
Signed-off-by: Paul Cercueil paul@crapouillou.net Reviewed-by: Alexandru Ardelean ardeleanalex@gmail.com
One comment inline.
I'd be tempted to queue this up with that fixed, but do we have any users? Even though it's trivial I'm not that keen on code upstream well in advance of it being used.
There's a userspace user in libiio. On the kernel side we do have drivers that use it in ADI's downstream kernel, that we plan to upstream in the long term (but it can take some time, as we need to upstream other things first, like JESD204B support).
drivers/iio/buffer/industrialio-buffer-dma.c | 88 ++++++++++++++++---- include/linux/iio/buffer-dma.h | 7 ++ 2 files changed, 79 insertions(+), 16 deletions(-)
diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c index 1fc91467d1aa..a9f1b673374f 100644 --- a/drivers/iio/buffer/industrialio-buffer-dma.c +++ b/drivers/iio/buffer/industrialio-buffer-dma.c @@ -195,6 +195,18 @@ static void _iio_dma_buffer_block_done(struct iio_dma_buffer_block *block) block->state = IIO_BLOCK_STATE_DONE; }
+static void iio_dma_buffer_queue_wake(struct iio_dma_buffer_queue *queue)
+{
+	__poll_t flags;
+
+	if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN)
+		flags = EPOLLIN | EPOLLRDNORM;
+	else
+		flags = EPOLLOUT | EPOLLWRNORM;
+
+	wake_up_interruptible_poll(&queue->buffer.pollq, flags);
+}
+
 /**
  * iio_dma_buffer_block_done() - Indicate that a block has been completed
  * @block: The completed block
@@ -212,7 +224,7 @@ void iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
 	spin_unlock_irqrestore(&queue->list_lock, flags);
 
 	iio_buffer_block_put_atomic(block);
-	wake_up_interruptible_poll(&queue->buffer.pollq, EPOLLIN | EPOLLRDNORM);
+	iio_dma_buffer_queue_wake(queue);
 }
 EXPORT_SYMBOL_GPL(iio_dma_buffer_block_done);
@@ -241,7 +253,7 @@ void iio_dma_buffer_block_list_abort(struct iio_dma_buffer_queue *queue,
 	}
 	spin_unlock_irqrestore(&queue->list_lock, flags);
 
-	wake_up_interruptible_poll(&queue->buffer.pollq, EPOLLIN | EPOLLRDNORM);
+	iio_dma_buffer_queue_wake(queue);
 }
 EXPORT_SYMBOL_GPL(iio_dma_buffer_block_list_abort);
@@ -335,8 +347,24 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
 			queue->fileio.blocks[i] = block;
 		}
 
-		block->state = IIO_BLOCK_STATE_QUEUED;
-		list_add_tail(&block->head, &queue->incoming);
+		/*
+		 * block->bytes_used may have been modified previously, e.g. by
+		 * iio_dma_buffer_block_list_abort(). Reset it here to the
+		 * block's size so that iio_dma_buffer_io() will work.
+		 */
+		block->bytes_used = block->size;
+
+		/*
+		 * If it's an input buffer, mark the block as queued, and
+		 * iio_dma_buffer_enable() will submit it. Otherwise mark it as
+		 * done, which means it's ready to be dequeued.
+		 */
+		if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN) {
+			block->state = IIO_BLOCK_STATE_QUEUED;
+			list_add_tail(&block->head, &queue->incoming);
+		} else {
+			block->state = IIO_BLOCK_STATE_DONE;
+		}
 	}
 
 out_unlock:
@@ -465,20 +493,12 @@ static struct iio_dma_buffer_block *iio_dma_buffer_dequeue(
 	return block;
 }
-/**
- * iio_dma_buffer_read() - DMA buffer read callback
- * @buffer: Buffer to read form
- * @n: Number of bytes to read
- * @user_buffer: Userspace buffer to copy the data to
- *
- * Should be used as the read callback for iio_buffer_access_ops
- * struct for DMA buffers.
- */
-int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
-	char __user *user_buffer)
+static int iio_dma_buffer_io(struct iio_buffer *buffer,
+	size_t n, char __user *user_buffer, bool is_write)
 {
 	struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buffer);
 	struct iio_dma_buffer_block *block;
+	void *addr;
 	int ret;
 
 	if (n < buffer->bytes_per_datum)
@@ -501,8 +521,13 @@ int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
 	n = rounddown(n, buffer->bytes_per_datum);
 	if (n > block->bytes_used - queue->fileio.pos)
 		n = block->bytes_used - queue->fileio.pos;
+	addr = block->vaddr + queue->fileio.pos;
 
-	if (copy_to_user(user_buffer, block->vaddr + queue->fileio.pos,
-			n)) {
+	if (is_write)
+		ret = copy_from_user(addr, user_buffer, n);
+	else
+		ret = copy_to_user(user_buffer, addr, n);
+
+	if (ret) {
 		ret = -EFAULT;
 		goto out_unlock;
 	}
@@ -521,8 +546,39 @@ int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
 	return ret;
 }
+/**
+ * iio_dma_buffer_read() - DMA buffer read callback
+ * @buffer: Buffer to read from
+ * @n: Number of bytes to read
+ * @user_buffer: Userspace buffer to copy the data to
+ *
+ * Should be used as the read callback for iio_buffer_access_ops
+ * struct for DMA buffers.
+ */
+int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
+	char __user *user_buffer)
+{
+	return iio_dma_buffer_io(buffer, n, user_buffer, false);
+}
+EXPORT_SYMBOL_GPL(iio_dma_buffer_read);
+/**
+ * iio_dma_buffer_write() - DMA buffer write callback
+ * @buffer: Buffer to write to
+ * @n: Number of bytes to write
+ * @user_buffer: Userspace buffer to copy the data from
+ *
+ * Should be used as the write callback for iio_buffer_access_ops
+ * struct for DMA buffers.
+ */
+int iio_dma_buffer_write(struct iio_buffer *buffer, size_t n,
+	const char __user *user_buffer)
+{
+	return iio_dma_buffer_io(buffer, n, (__force char *)user_buffer, true);
Casting away the const is a little nasty. Perhaps it's worth adding a parameter to iio_dma_buffer_io() so you can have different parameters for the read and write cases, and hence keep the const in place? i.e. return iio_dma_buffer_io(buffer, n, NULL, user_buffer, true); and return iio_dma_buffer_io(buffer, n, user_buffer, NULL, false);
I can do that.
Cheers, -Paul
+}
+EXPORT_SYMBOL_GPL(iio_dma_buffer_write);
 /**
  * iio_dma_buffer_data_available() - DMA buffer data_available callback
  * @buf: Buffer to check for data availability
diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h
index 18d3702fa95d..490b93f76fa8 100644
--- a/include/linux/iio/buffer-dma.h
+++ b/include/linux/iio/buffer-dma.h
@@ -132,6 +132,8 @@ int iio_dma_buffer_disable(struct iio_buffer *buffer,
 	struct iio_dev *indio_dev);
 int iio_dma_buffer_read(struct iio_buffer *buffer, size_t n,
 	char __user *user_buffer);
+int iio_dma_buffer_write(struct iio_buffer *buffer, size_t n,
+	const char __user *user_buffer);
 size_t iio_dma_buffer_data_available(struct iio_buffer *buffer);
 int iio_dma_buffer_set_bytes_per_datum(struct iio_buffer *buffer, size_t bpd);
 int iio_dma_buffer_set_length(struct iio_buffer *buffer, unsigned int length);
@@ -142,4 +144,9 @@ int iio_dma_buffer_init(struct iio_dma_buffer_queue *queue,
 void iio_dma_buffer_exit(struct iio_dma_buffer_queue *queue);
 void iio_dma_buffer_release(struct iio_dma_buffer_queue *queue);
 
+static inline size_t iio_dma_buffer_space_available(struct iio_buffer *buffer)
+{
+	return iio_dma_buffer_data_available(buffer);
+}
+
 #endif
On Mon, 2022-03-28 at 19:39 +0100, Paul Cercueil wrote:
Hi Jonathan,
On Mon, Mar 28 2022 at 18:24:09 +0100, Jonathan Cameron jic23@kernel.org wrote:
On Mon, 7 Feb 2022 12:59:23 +0000 Paul Cercueil paul@crapouillou.net wrote:
Adding write support to the buffer-dma code is easy - the write() function basically needs to do the exact same thing as the read() function: dequeue a block, read or write the data, enqueue the block when entirely processed.
Therefore, the iio_buffer_dma_read() and the new iio_buffer_dma_write() now both call a function iio_buffer_dma_io(), which will perform this task.
The .space_available() callback can return the exact same value as the .data_available() callback for input buffers, since in both cases we count the exact same thing (the number of bytes in each available block).
Note that we preemptively reset block->bytes_used to the buffer's size in iio_dma_buffer_request_update(), as in the future the iio_dma_buffer_enqueue() function won't reset it.
v2: - Fix block->state not being reset in iio_dma_buffer_request_update() for output buffers. - Only update block->bytes_used once and add a comment about why we update it. - Add a comment about why we're setting a different state for output buffers in iio_dma_buffer_request_update() - Remove useless cast to bool (!!) in iio_dma_buffer_io()
Signed-off-by: Paul Cercueil paul@crapouillou.net Reviewed-by: Alexandru Ardelean ardeleanalex@gmail.com
One comment inline.
I'd be tempted to queue this up with that fixed, but do we have any users? Even though it's trivial I'm not that keen on code upstream well in advance of it being used.
There's a userspace user in libiio. On the kernel side we do have drivers that use it in ADI's downstream kernel, that we plan to upstream in the long term (but it can take some time, as we need to upstream other things first, like JESD204B support).
You mean, users for DMA output buffers? If so, I have on my queue to add the dac counterpart of this one:
https://elixir.bootlin.com/linux/latest/source/drivers/iio/adc/adi-axi-adc.c
Which is a user of DMA buffers. Though this one does not depend on JESD204, I suspect it will also be a tricky process mainly because I think there are major issues on how things are done right now (on the ADC driver).
But yeah, not a topic here and I do plan to first start the discussion on the mailing list before starting developing (hopefully in the coming weeks)...
- Nuno Sá
On Wed, Feb 9, 2022 at 9:10 AM Paul Cercueil paul@crapouillou.net wrote:
Adding write support to the buffer-dma code is easy - the write() function basically needs to do the exact same thing as the read() function: dequeue a block, read or write the data, enqueue the block when entirely processed.
Therefore, the iio_buffer_dma_read() and the new iio_buffer_dma_write() now both call a function iio_buffer_dma_io(), which will perform this task.
The .space_available() callback can return the exact same value as the .data_available() callback for input buffers, since in both cases we count the exact same thing (the number of bytes in each available block).
Note that we preemptively reset block->bytes_used to the buffer's size in iio_dma_buffer_request_update(), as in the future the iio_dma_buffer_enqueue() function won't reset it.
...
v2: - Fix block->state not being reset in iio_dma_buffer_request_update() for output buffers. - Only update block->bytes_used once and add a comment about why we update it. - Add a comment about why we're setting a different state for output buffers in iio_dma_buffer_request_update() - Remove useless cast to bool (!!) in iio_dma_buffer_io()
Usual place for changelog is after the cutter '--- ' line below...
Signed-off-by: Paul Cercueil paul@crapouillou.net Reviewed-by: Alexandru Ardelean ardeleanalex@gmail.com
...somewhere here.
drivers/iio/buffer/industrialio-buffer-dma.c | 88 ++++++++++++++++---- include/linux/iio/buffer-dma.h | 7 ++
...
+static int iio_dma_buffer_io(struct iio_buffer *buffer,
+	size_t n, char __user *user_buffer, bool is_write)
I believe there is room for size_t n on the previous line.
...
if (is_write)
I would name it is_from_user.
ret = copy_from_user(addr, user_buffer, n);
else
ret = copy_to_user(user_buffer, addr, n);
...
+int iio_dma_buffer_write(struct iio_buffer *buffer, size_t n,
const char __user *user_buffer)
+{
return iio_dma_buffer_io(buffer, n, (__force char *)user_buffer, true);
Why do you drop address space markers?
+}
Update the devm_iio_dmaengine_buffer_setup() function to support specifying the buffer direction.
Update the iio_dmaengine_buffer_submit() function to handle input buffers as well as output buffers.
Signed-off-by: Paul Cercueil paul@crapouillou.net Reviewed-by: Alexandru Ardelean ardeleanalex@gmail.com --- drivers/iio/adc/adi-axi-adc.c | 3 ++- .../buffer/industrialio-buffer-dmaengine.c | 24 +++++++++++++++---- include/linux/iio/buffer-dmaengine.h | 5 +++- 3 files changed, 25 insertions(+), 7 deletions(-)
diff --git a/drivers/iio/adc/adi-axi-adc.c b/drivers/iio/adc/adi-axi-adc.c
index a73e3c2d212f..0a6f2c32b1b9 100644
--- a/drivers/iio/adc/adi-axi-adc.c
+++ b/drivers/iio/adc/adi-axi-adc.c
@@ -113,7 +113,8 @@ static int adi_axi_adc_config_dma_buffer(struct device *dev,
 		dma_name = "rx";
 
 	return devm_iio_dmaengine_buffer_setup(indio_dev->dev.parent,
-					       indio_dev, dma_name);
+					       indio_dev, dma_name,
+					       IIO_BUFFER_DIRECTION_IN);
 }
 
 static int adi_axi_adc_read_raw(struct iio_dev *indio_dev,
diff --git a/drivers/iio/buffer/industrialio-buffer-dmaengine.c b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
index f8ce26a24c57..ac26b04aa4a9 100644
--- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c
+++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
@@ -64,14 +64,25 @@ static int iio_dmaengine_buffer_submit_block(struct iio_dma_buffer_queue *queue,
 	struct dmaengine_buffer *dmaengine_buffer =
 		iio_buffer_to_dmaengine_buffer(&queue->buffer);
 	struct dma_async_tx_descriptor *desc;
+	enum dma_transfer_direction dma_dir;
+	size_t max_size;
 	dma_cookie_t cookie;
 
-	block->bytes_used = min(block->size, dmaengine_buffer->max_size);
-	block->bytes_used = round_down(block->bytes_used,
-			dmaengine_buffer->align);
+	max_size = min(block->size, dmaengine_buffer->max_size);
+	max_size = round_down(max_size, dmaengine_buffer->align);
+
+	if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN) {
+		block->bytes_used = max_size;
+		dma_dir = DMA_DEV_TO_MEM;
+	} else {
+		dma_dir = DMA_MEM_TO_DEV;
+	}
+
+	if (!block->bytes_used || block->bytes_used > max_size)
+		return -EINVAL;
 
 	desc = dmaengine_prep_slave_single(dmaengine_buffer->chan,
-		block->phys_addr, block->bytes_used, DMA_DEV_TO_MEM,
+		block->phys_addr, block->bytes_used, dma_dir,
 		DMA_PREP_INTERRUPT);
 	if (!desc)
 		return -ENOMEM;
@@ -275,7 +286,8 @@ static struct iio_buffer *devm_iio_dmaengine_buffer_alloc(struct device *dev,
  */
 int devm_iio_dmaengine_buffer_setup(struct device *dev,
 				    struct iio_dev *indio_dev,
-				    const char *channel)
+				    const char *channel,
+				    enum iio_buffer_direction dir)
 {
 	struct iio_buffer *buffer;
@@ -286,6 +298,8 @@ int devm_iio_dmaengine_buffer_setup(struct device *dev,
 
 	indio_dev->modes |= INDIO_BUFFER_HARDWARE;
 
+	buffer->direction = dir;
+
 	return iio_device_attach_buffer(indio_dev, buffer);
 }
 EXPORT_SYMBOL_GPL(devm_iio_dmaengine_buffer_setup);
diff --git a/include/linux/iio/buffer-dmaengine.h b/include/linux/iio/buffer-dmaengine.h
index 5c355be89814..538d0479cdd6 100644
--- a/include/linux/iio/buffer-dmaengine.h
+++ b/include/linux/iio/buffer-dmaengine.h
@@ -7,11 +7,14 @@
 #ifndef __IIO_DMAENGINE_H__
 #define __IIO_DMAENGINE_H__
 
+#include <linux/iio/buffer.h>
+
 struct iio_dev;
 struct device;
 
 int devm_iio_dmaengine_buffer_setup(struct device *dev,
 				    struct iio_dev *indio_dev,
-				    const char *channel);
+				    const char *channel,
+				    enum iio_buffer_direction dir);
 
 #endif
Use the iio_dma_buffer_write() and iio_dma_buffer_space_available() functions provided by the buffer-dma core, to enable write support in the buffer-dmaengine code.
Signed-off-by: Paul Cercueil paul@crapouillou.net Reviewed-by: Alexandru Ardelean ardeleanalex@gmail.com --- drivers/iio/buffer/industrialio-buffer-dmaengine.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/iio/buffer/industrialio-buffer-dmaengine.c b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
index ac26b04aa4a9..5cde8fd81c7f 100644
--- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c
+++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
@@ -123,12 +123,14 @@ static void iio_dmaengine_buffer_release(struct iio_buffer *buf)
 
 static const struct iio_buffer_access_funcs iio_dmaengine_buffer_ops = {
 	.read = iio_dma_buffer_read,
+	.write = iio_dma_buffer_write,
 	.set_bytes_per_datum = iio_dma_buffer_set_bytes_per_datum,
 	.set_length = iio_dma_buffer_set_length,
 	.request_update = iio_dma_buffer_request_update,
 	.enable = iio_dma_buffer_enable,
 	.disable = iio_dma_buffer_disable,
 	.data_available = iio_dma_buffer_data_available,
+	.space_available = iio_dma_buffer_space_available,
 	.release = iio_dmaengine_buffer_release,
 
 	.modes = INDIO_BUFFER_HARDWARE,
On Mon, 7 Feb 2022 12:59:25 +0000 Paul Cercueil paul@crapouillou.net wrote:
Use the iio_dma_buffer_write() and iio_dma_buffer_space_available() functions provided by the buffer-dma core, to enable write support in the buffer-dmaengine code.
Signed-off-by: Paul Cercueil paul@crapouillou.net Reviewed-by: Alexandru Ardelean ardeleanalex@gmail.com
This (and previous) look fine to me. Just that question of a user for the new functionality...
Jonathan
drivers/iio/buffer/industrialio-buffer-dmaengine.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/iio/buffer/industrialio-buffer-dmaengine.c b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
index ac26b04aa4a9..5cde8fd81c7f 100644
--- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c
+++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c
@@ -123,12 +123,14 @@ static void iio_dmaengine_buffer_release(struct iio_buffer *buf)
 
 static const struct iio_buffer_access_funcs iio_dmaengine_buffer_ops = {
 	.read = iio_dma_buffer_read,
+	.write = iio_dma_buffer_write,
 	.set_bytes_per_datum = iio_dma_buffer_set_bytes_per_datum,
 	.set_length = iio_dma_buffer_set_length,
 	.request_update = iio_dma_buffer_request_update,
 	.enable = iio_dma_buffer_enable,
 	.disable = iio_dma_buffer_disable,
 	.data_available = iio_dma_buffer_data_available,
+	.space_available = iio_dma_buffer_space_available,
 	.release = iio_dmaengine_buffer_release,
 
 	.modes = INDIO_BUFFER_HARDWARE,
Add the necessary infrastructure to the IIO core to support a new optional DMABUF based interface.
The advantage of this new DMABUF based interface vs. the read() interface is that it avoids an extra copy of the data between the kernel and userspace. This is particularly useful for high-speed devices which produce several megabytes or even gigabytes of data per second.
The data in this new DMABUF interface is managed at the granularity of DMABUF objects. Reducing the granularity from byte level to block level is done to reduce the userspace-kernelspace synchronization overhead since performing syscalls for each byte at a few Mbps is just not feasible.
This of course leads to a slightly increased latency. For this reason an application can choose the size of the DMABUFs as well as how many it allocates. E.g. two DMABUFs would be a traditional double buffering scheme. But using a higher number might be necessary to avoid underflow/overflow situations in the presence of scheduling latencies.
As part of the interface, 2 new IOCTLs have been added:
IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req *): Each call will allocate a new DMABUF object. The return value (if not a negative errno value as error) will be the file descriptor of the new DMABUF.
IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *): Place the DMABUF object into the queue pending for hardware process.
These two IOCTLs have to be performed on the IIO buffer's file descriptor, obtained using the IIO_BUFFER_GET_FD_IOCTL() ioctl.
To access the data stored in a block by userspace the block must be mapped to the process's memory. This is done by calling mmap() on the DMABUF's file descriptor.
Before accessing the data through the map, you must use the DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the DMA_BUF_SYNC_START flag, to make sure that the data is available. This call may block until the hardware is done with this block. Once you are done reading or writing the data, you must use this ioctl again with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF to the kernel's queue.
If you need to know when the hardware is done with a DMABUF, you can poll its file descriptor for the EPOLLOUT event.
Finally, to destroy a DMABUF object, simply call close() on its file descriptor.
A typical workflow for the new interface is:
for block in blocks: DMABUF_ALLOC block mmap block
enable buffer
while !done for block in blocks: DMABUF_ENQUEUE block
DMABUF_SYNC_START block process data DMABUF_SYNC_END block
disable buffer
for block in blocks: close block
v2: Only allow the new IOCTLs on the buffer FD created with IIO_BUFFER_GET_FD_IOCTL().
Signed-off-by: Paul Cercueil paul@crapouillou.net --- drivers/iio/industrialio-buffer.c | 55 +++++++++++++++++++++++++++++++ include/linux/iio/buffer_impl.h | 8 +++++ include/uapi/linux/iio/buffer.h | 29 ++++++++++++++++ 3 files changed, 92 insertions(+)
diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c
index 94eb9f6cf128..72f333a519bc 100644
--- a/drivers/iio/industrialio-buffer.c
+++ b/drivers/iio/industrialio-buffer.c
@@ -17,6 +17,7 @@
 #include <linux/fs.h>
 #include <linux/cdev.h>
 #include <linux/slab.h>
+#include <linux/mm.h>
 #include <linux/poll.h>
 #include <linux/sched/signal.h>
 
@@ -1520,11 +1521,65 @@ static int iio_buffer_chrdev_release(struct inode *inode, struct file *filep)
 	return 0;
 }
+static int iio_buffer_enqueue_dmabuf(struct iio_buffer *buffer,
+				     struct iio_dmabuf __user *user_buf)
+{
+	struct iio_dmabuf dmabuf;
+
+	if (!buffer->access->enqueue_dmabuf)
+		return -EPERM;
+
+	if (copy_from_user(&dmabuf, user_buf, sizeof(dmabuf)))
+		return -EFAULT;
+
+	if (dmabuf.flags & ~IIO_BUFFER_DMABUF_SUPPORTED_FLAGS)
+		return -EINVAL;
+
+	return buffer->access->enqueue_dmabuf(buffer, &dmabuf);
+}
+
+static int iio_buffer_alloc_dmabuf(struct iio_buffer *buffer,
+				   struct iio_dmabuf_alloc_req __user *user_req)
+{
+	struct iio_dmabuf_alloc_req req;
+
+	if (!buffer->access->alloc_dmabuf)
+		return -EPERM;
+
+	if (copy_from_user(&req, user_req, sizeof(req)))
+		return -EFAULT;
+
+	if (req.resv)
+		return -EINVAL;
+
+	return buffer->access->alloc_dmabuf(buffer, &req);
+}
+
+static long iio_buffer_chrdev_ioctl(struct file *filp,
+				    unsigned int cmd, unsigned long arg)
+{
+	struct iio_dev_buffer_pair *ib = filp->private_data;
+	struct iio_buffer *buffer = ib->buffer;
+	void __user *_arg = (void __user *)arg;
+
+	switch (cmd) {
+	case IIO_BUFFER_DMABUF_ALLOC_IOCTL:
+		return iio_buffer_alloc_dmabuf(buffer, _arg);
+	case IIO_BUFFER_DMABUF_ENQUEUE_IOCTL:
+		/* TODO: support non-blocking enqueue operation */
+		return iio_buffer_enqueue_dmabuf(buffer, _arg);
+	default:
+		return IIO_IOCTL_UNHANDLED;
+	}
+}
+
 static const struct file_operations iio_buffer_chrdev_fileops = {
 	.owner = THIS_MODULE,
 	.llseek = noop_llseek,
 	.read = iio_buffer_read,
 	.write = iio_buffer_write,
+	.unlocked_ioctl = iio_buffer_chrdev_ioctl,
+	.compat_ioctl = compat_ptr_ioctl,
 	.poll = iio_buffer_poll,
 	.release = iio_buffer_chrdev_release,
 };
diff --git a/include/linux/iio/buffer_impl.h b/include/linux/iio/buffer_impl.h
index e2ca8ea23e19..728541bc2c63 100644
--- a/include/linux/iio/buffer_impl.h
+++ b/include/linux/iio/buffer_impl.h
@@ -39,6 +39,9 @@ struct iio_buffer;
  *	device stops sampling. Calls are balanced with @enable.
  * @release:	called when the last reference to the buffer is dropped,
  *	should free all resources allocated by the buffer.
+ * @alloc_dmabuf: called from userspace via ioctl to allocate one DMABUF.
+ * @enqueue_dmabuf: called from userspace via ioctl to queue this DMABUF
+ *	object to this buffer. Requires a valid DMABUF fd.
  * @modes:	Supported operating modes by this buffer type
  * @flags:	A bitmask combination of INDIO_BUFFER_FLAG_*
  *
@@ -68,6 +71,11 @@ struct iio_buffer_access_funcs {
 	void (*release)(struct iio_buffer *buffer);
 
+	int (*alloc_dmabuf)(struct iio_buffer *buffer,
+			    struct iio_dmabuf_alloc_req *req);
+	int (*enqueue_dmabuf)(struct iio_buffer *buffer,
+			      struct iio_dmabuf *block);
+
 	unsigned int modes;
 	unsigned int flags;
 };
diff --git a/include/uapi/linux/iio/buffer.h b/include/uapi/linux/iio/buffer.h
index 13939032b3f6..e4621b926262 100644
--- a/include/uapi/linux/iio/buffer.h
+++ b/include/uapi/linux/iio/buffer.h
@@ -5,6 +5,35 @@
 #ifndef _UAPI_IIO_BUFFER_H_
 #define _UAPI_IIO_BUFFER_H_
 
+#include <linux/types.h>
+
+#define IIO_BUFFER_DMABUF_SUPPORTED_FLAGS	0x00000000
+
+/**
+ * struct iio_dmabuf_alloc_req - Descriptor for allocating IIO DMABUFs
+ * @size:	the size of a single DMABUF
+ * @resv:	reserved
+ */
+struct iio_dmabuf_alloc_req {
+	__u64 size;
+	__u64 resv;
+};
+
+/**
+ * struct iio_dmabuf - Descriptor for a single IIO DMABUF object
+ * @fd:		file descriptor of the DMABUF object
+ * @flags:	one or more IIO_BUFFER_DMABUF_* flags
+ * @bytes_used:	number of bytes used in this DMABUF for the data transfer.
+ *		If zero, the full buffer is used.
+ */
+struct iio_dmabuf {
+	__u32 fd;
+	__u32 flags;
+	__u64 bytes_used;
+};
+
 #define IIO_BUFFER_GET_FD_IOCTL		_IOWR('i', 0x91, int)
+#define IIO_BUFFER_DMABUF_ALLOC_IOCTL	_IOW('i', 0x92, struct iio_dmabuf_alloc_req)
+#define IIO_BUFFER_DMABUF_ENQUEUE_IOCTL	_IOW('i', 0x93, struct iio_dmabuf)
#endif /* _UAPI_IIO_BUFFER_H_ */
On Mon, 7 Feb 2022 12:59:26 +0000 Paul Cercueil paul@crapouillou.net wrote:
Add the necessary infrastructure to the IIO core to support a new optional DMABUF based interface.
The advantage of this new DMABUF based interface vs. the read() interface, is that it avoids an extra copy of the data between the kernel and userspace. This is particularly userful for high-speed
useful
devices which produce several megabytes or even gigabytes of data per second.
The data in this new DMABUF interface is managed at the granularity of DMABUF objects. Reducing the granularity from byte level to block level is done to reduce the userspace-kernelspace synchronization overhead since performing syscalls for each byte at a few Mbps is just not feasible.
This of course leads to a slightly increased latency. For this reason an application can choose the size of the DMABUFs as well as how many it allocates. E.g. two DMABUFs would be a traditional double buffering scheme. But using a higher number might be necessary to avoid underflow/overflow situations in the presence of scheduling latencies.
As part of the interface, 2 new IOCTLs have been added:
IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req *): Each call will allocate a new DMABUF object. The return value (if not a negative errno value as error) will be the file descriptor of the new DMABUF.
IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *): Place the DMABUF object into the queue pending for hardware process.
These two IOCTLs have to be performed on the IIO buffer's file descriptor, obtained using the IIO_BUFFER_GET_FD_IOCTL() ioctl.
Just to check, do they work on the old deprecated chardev route? Normally we can directly access the first buffer without the ioctl.
To access the data stored in a block by userspace the block must be mapped to the process's memory. This is done by calling mmap() on the DMABUF's file descriptor.
Before accessing the data through the map, you must use the DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the DMA_BUF_SYNC_START flag, to make sure that the data is available. This call may block until the hardware is done with this block. Once you are done reading or writing the data, you must use this ioctl again with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF to the kernel's queue.
If you need to know when the hardware is done with a DMABUF, you can poll its file descriptor for the EPOLLOUT event.
Finally, to destroy a DMABUF object, simply call close() on its file descriptor.
A typical workflow for the new interface is:
for block in blocks: DMABUF_ALLOC block mmap block
enable buffer
while !done for block in blocks: DMABUF_ENQUEUE block
DMABUF_SYNC_START block process data DMABUF_SYNC_END block
disable buffer
for block in blocks: close block
Given my very limited knowledge of dma-buf, I'll leave commenting on the flow to others who know if this looks 'standards' or not ;)
Code looks sane to me..
v2: Only allow the new IOCTLs on the buffer FD created with IIO_BUFFER_GET_FD_IOCTL().
Signed-off-by: Paul Cercueil paul@crapouillou.net
drivers/iio/industrialio-buffer.c | 55 +++++++++++++++++++++++++++++++ include/linux/iio/buffer_impl.h | 8 +++++ include/uapi/linux/iio/buffer.h | 29 ++++++++++++++++ 3 files changed, 92 insertions(+)
diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c
index 94eb9f6cf128..72f333a519bc 100644
--- a/drivers/iio/industrialio-buffer.c
+++ b/drivers/iio/industrialio-buffer.c
@@ -17,6 +17,7 @@
 #include <linux/fs.h>
 #include <linux/cdev.h>
 #include <linux/slab.h>
+#include <linux/mm.h>
 #include <linux/poll.h>
 #include <linux/sched/signal.h>
 
@@ -1520,11 +1521,65 @@ static int iio_buffer_chrdev_release(struct inode *inode, struct file *filep)
 	return 0;
 }
 
+static int iio_buffer_enqueue_dmabuf(struct iio_buffer *buffer,
+				     struct iio_dmabuf __user *user_buf)
+{
+	struct iio_dmabuf dmabuf;
+
+	if (!buffer->access->enqueue_dmabuf)
+		return -EPERM;
+
+	if (copy_from_user(&dmabuf, user_buf, sizeof(dmabuf)))
+		return -EFAULT;
+
+	if (dmabuf.flags & ~IIO_BUFFER_DMABUF_SUPPORTED_FLAGS)
+		return -EINVAL;
+
+	return buffer->access->enqueue_dmabuf(buffer, &dmabuf);
+}
+
+static int iio_buffer_alloc_dmabuf(struct iio_buffer *buffer,
+				   struct iio_dmabuf_alloc_req __user *user_req)
+{
+	struct iio_dmabuf_alloc_req req;
+
+	if (!buffer->access->alloc_dmabuf)
+		return -EPERM;
+
+	if (copy_from_user(&req, user_req, sizeof(req)))
+		return -EFAULT;
+
+	if (req.resv)
+		return -EINVAL;
+
+	return buffer->access->alloc_dmabuf(buffer, &req);
+}
+
+static long iio_buffer_chrdev_ioctl(struct file *filp,
+				    unsigned int cmd, unsigned long arg)
+{
+	struct iio_dev_buffer_pair *ib = filp->private_data;
+	struct iio_buffer *buffer = ib->buffer;
+	void __user *_arg = (void __user *)arg;
+
+	switch (cmd) {
+	case IIO_BUFFER_DMABUF_ALLOC_IOCTL:
+		return iio_buffer_alloc_dmabuf(buffer, _arg);
+	case IIO_BUFFER_DMABUF_ENQUEUE_IOCTL:
+		/* TODO: support non-blocking enqueue operation */
+		return iio_buffer_enqueue_dmabuf(buffer, _arg);
+	default:
+		return IIO_IOCTL_UNHANDLED;
+	}
+}
+
 static const struct file_operations iio_buffer_chrdev_fileops = {
 	.owner = THIS_MODULE,
 	.llseek = noop_llseek,
 	.read = iio_buffer_read,
 	.write = iio_buffer_write,
+	.unlocked_ioctl = iio_buffer_chrdev_ioctl,
+	.compat_ioctl = compat_ptr_ioctl,
 	.poll = iio_buffer_poll,
 	.release = iio_buffer_chrdev_release,
 };
diff --git a/include/linux/iio/buffer_impl.h b/include/linux/iio/buffer_impl.h
index e2ca8ea23e19..728541bc2c63 100644
--- a/include/linux/iio/buffer_impl.h
+++ b/include/linux/iio/buffer_impl.h
@@ -39,6 +39,9 @@ struct iio_buffer;
  *	device stops sampling. Calls are balanced with @enable.
  * @release:	called when the last reference to the buffer is dropped,
  *	should free all resources allocated by the buffer.
+ * @alloc_dmabuf: called from userspace via ioctl to allocate one DMABUF.
+ * @enqueue_dmabuf: called from userspace via ioctl to queue this DMABUF
+ *	object to this buffer. Requires a valid DMABUF fd.
  * @modes:	Supported operating modes by this buffer type
  * @flags:	A bitmask combination of INDIO_BUFFER_FLAG_*
  *
@@ -68,6 +71,11 @@ struct iio_buffer_access_funcs {
 	void (*release)(struct iio_buffer *buffer);
 
+	int (*alloc_dmabuf)(struct iio_buffer *buffer,
+			    struct iio_dmabuf_alloc_req *req);
+	int (*enqueue_dmabuf)(struct iio_buffer *buffer,
+			      struct iio_dmabuf *block);
+
 	unsigned int modes;
 	unsigned int flags;
 };
diff --git a/include/uapi/linux/iio/buffer.h b/include/uapi/linux/iio/buffer.h
index 13939032b3f6..e4621b926262 100644
--- a/include/uapi/linux/iio/buffer.h
+++ b/include/uapi/linux/iio/buffer.h
@@ -5,6 +5,35 @@
 #ifndef _UAPI_IIO_BUFFER_H_
 #define _UAPI_IIO_BUFFER_H_
 
+#include <linux/types.h>
+
+#define IIO_BUFFER_DMABUF_SUPPORTED_FLAGS	0x00000000
+
+/**
+ * struct iio_dmabuf_alloc_req - Descriptor for allocating IIO DMABUFs
+ * @size:	the size of a single DMABUF
+ * @resv:	reserved
+ */
+struct iio_dmabuf_alloc_req {
+	__u64 size;
+	__u64 resv;
+};
+
+/**
+ * struct iio_dmabuf - Descriptor for a single IIO DMABUF object
+ * @fd:		file descriptor of the DMABUF object
+ * @flags:	one or more IIO_BUFFER_DMABUF_* flags
+ * @bytes_used:	number of bytes used in this DMABUF for the data transfer.
+ *		If zero, the full buffer is used.
+ */
+struct iio_dmabuf {
+	__u32 fd;
+	__u32 flags;
+	__u64 bytes_used;
+};
+
 #define IIO_BUFFER_GET_FD_IOCTL		_IOWR('i', 0x91, int)
+#define IIO_BUFFER_DMABUF_ALLOC_IOCTL	_IOW('i', 0x92, struct iio_dmabuf_alloc_req)
+#define IIO_BUFFER_DMABUF_ENQUEUE_IOCTL	_IOW('i', 0x93, struct iio_dmabuf)
 
 #endif /* _UAPI_IIO_BUFFER_H_ */
Hi Jonathan,
Le lun., mars 28 2022 at 18:37:01 +0100, Jonathan Cameron jic23@kernel.org a écrit :
On Mon, 7 Feb 2022 12:59:26 +0000 Paul Cercueil paul@crapouillou.net wrote:
Add the necessary infrastructure to the IIO core to support a new optional DMABUF based interface.
The advantage of this new DMABUF based interface vs. the read() interface, is that it avoids an extra copy of the data between the kernel and userspace. This is particularly userful for high-speed
useful
devices which produce several megabytes or even gigabytes of data per second.
The data in this new DMABUF interface is managed at the granularity of DMABUF objects. Reducing the granularity from byte level to block level is done to reduce the userspace-kernelspace synchronization overhead since performing syscalls for each byte at a few Mbps is just not feasible.
This of course leads to a slightly increased latency. For this reason an application can choose the size of the DMABUFs as well as how many it allocates. E.g. two DMABUFs would be a traditional double buffering scheme. But using a higher number might be necessary to avoid underflow/overflow situations in the presence of scheduling latencies.
As part of the interface, 2 new IOCTLs have been added:
IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req *): Each call will allocate a new DMABUF object. The return value (if not a negative errno value as error) will be the file descriptor of the new DMABUF.
IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *): Place the DMABUF object into the queue pending for hardware process.
These two IOCTLs have to be performed on the IIO buffer's file descriptor, obtained using the IIO_BUFFER_GET_FD_IOCTL() ioctl.
Just to check, do they work on the old deprecated chardev route? Normally we can directly access the first buffer without the ioctl.
They do not. I think it's fine this way, since as you said, the old chardev route is deprecated. But I can add support for it with enough peer pressure.
To access the data stored in a block by userspace the block must be mapped to the process's memory. This is done by calling mmap() on the DMABUF's file descriptor.
Before accessing the data through the map, you must use the DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the DMA_BUF_SYNC_START flag, to make sure that the data is available. This call may block until the hardware is done with this block. Once you are done reading or writing the data, you must use this ioctl again with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF to the kernel's queue.
If you need to know when the hardware is done with a DMABUF, you can poll its file descriptor for the EPOLLOUT event.
Finally, to destroy a DMABUF object, simply call close() on its file descriptor.
A typical workflow for the new interface is:
for block in blocks:
    DMABUF_ALLOC block
    mmap block

enable buffer

while !done:
    for block in blocks:
        DMABUF_ENQUEUE block

        DMABUF_SYNC_START block
        process data
        DMABUF_SYNC_END block

disable buffer

for block in blocks:
    close block
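For illustration, the workflow above could be sketched in userspace C roughly as follows. The struct layouts and ioctl numbers are copied from the proposed uapi header in this patch; everything else (helper name, block count, buffer-enable step) is hypothetical, and error handling is minimal for brevity:

```c
#include <stdint.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/dma-buf.h>	/* DMA_BUF_IOCTL_SYNC, struct dma_buf_sync */

/* Mirrors the proposed <uapi/linux/iio/buffer.h> additions */
struct iio_dmabuf_alloc_req {
	uint64_t size;
	uint64_t resv;
};

struct iio_dmabuf {
	uint32_t fd;
	uint32_t flags;
	uint64_t bytes_used;
};

#define IIO_BUFFER_DMABUF_ALLOC_IOCTL	_IOW('i', 0x92, struct iio_dmabuf_alloc_req)
#define IIO_BUFFER_DMABUF_ENQUEUE_IOCTL	_IOW('i', 0x93, struct iio_dmabuf)

/*
 * Run one pass of the alloc/enqueue/sync/process loop on 'nb_blocks'
 * DMABUFs allocated from 'buf_fd', the per-buffer FD obtained with
 * IIO_BUFFER_GET_FD_IOCTL(). Returns 0 on success, -1 on first failure.
 */
static int stream_one_pass(int buf_fd, unsigned int nb_blocks, size_t size)
{
	struct dma_buf_sync sync;
	void *addr[16];
	int fd[16];
	unsigned int i;

	if (nb_blocks > 16)
		return -1;

	for (i = 0; i < nb_blocks; i++) {
		struct iio_dmabuf_alloc_req req = { .size = size };

		/* DMABUF_ALLOC returns the new DMABUF's file descriptor */
		fd[i] = ioctl(buf_fd, IIO_BUFFER_DMABUF_ALLOC_IOCTL, &req);
		if (fd[i] < 0)
			return -1;

		/* Map the DMABUF to access its data */
		addr[i] = mmap(NULL, size, PROT_READ | PROT_WRITE,
			       MAP_SHARED, fd[i], 0);
		if (addr[i] == MAP_FAILED)
			return -1;
	}

	/* (enable the IIO buffer here, e.g. through sysfs) */

	for (i = 0; i < nb_blocks; i++) {
		struct iio_dmabuf dmabuf = { .fd = fd[i] };

		if (ioctl(buf_fd, IIO_BUFFER_DMABUF_ENQUEUE_IOCTL, &dmabuf))
			return -1;

		/* SYNC_START may block until the hardware is done */
		sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ;
		if (ioctl(fd[i], DMA_BUF_IOCTL_SYNC, &sync))
			return -1;

		/* ... process addr[i] ... */

		sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ;
		if (ioctl(fd[i], DMA_BUF_IOCTL_SYNC, &sync))
			return -1;
	}

	for (i = 0; i < nb_blocks; i++) {
		munmap(addr[i], size);
		close(fd[i]);	/* destroys the DMABUF object */
	}

	return 0;
}
```

The DMA_BUF_SYNC_* handshake is the standard dma-buf userspace contract; only the IIO_BUFFER_DMABUF_* ioctls are introduced by this series.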
Given my very limited knowledge of dma-buf, I'll leave commenting on the flow to others who know if this looks 'standards' or not ;)
Code looks sane to me..
Thanks.
Cheers, -Paul
v2: Only allow the new IOCTLs on the buffer FD created with IIO_BUFFER_GET_FD_IOCTL().
Signed-off-by: Paul Cercueil paul@crapouillou.net
drivers/iio/industrialio-buffer.c | 55 +++++++++++++++++++++++++++++++ include/linux/iio/buffer_impl.h | 8 +++++ include/uapi/linux/iio/buffer.h | 29 ++++++++++++++++ 3 files changed, 92 insertions(+)
On Mon, 28 Mar 2022 19:44:19 +0100 Paul Cercueil paul@crapouillou.net wrote:
Hi Jonathan,
Le lun., mars 28 2022 at 18:37:01 +0100, Jonathan Cameron jic23@kernel.org a écrit :
Just to check, do they work on the old deprecated chardev route? Normally we can directly access the first buffer without the ioctl.
They do not. I think it's fine this way, since as you said, the old chardev route is deprecated. But I can add support for it with enough peer pressure.
Agreed. Definitely fine to not support the 'old way'.
J
On Tue, Feb 8, 2022 at 5:26 PM Paul Cercueil paul@crapouillou.net wrote:
Add the necessary infrastructure to the IIO core to support a new optional DMABUF based interface.
The advantage of this new DMABUF based interface vs. the read() interface, is that it avoids an extra copy of the data between the kernel and userspace. This is particularly userful for high-speed
useful
...
v2: Only allow the new IOCTLs on the buffer FD created with IIO_BUFFER_GET_FD_IOCTL().
Move changelogs after the cutter '--- ' line.
...
static const struct file_operations iio_buffer_chrdev_fileops = { .owner = THIS_MODULE, .llseek = noop_llseek, .read = iio_buffer_read, .write = iio_buffer_write,
.unlocked_ioctl = iio_buffer_chrdev_ioctl,
.compat_ioctl = compat_ptr_ioctl,
Is this member always available, regardless of the kernel configuration?
...
+#define IIO_BUFFER_DMABUF_SUPPORTED_FLAGS 0x00000000
No flags available right now?
...
- @bytes_used: number of bytes used in this DMABUF for the data transfer.
If zero, the full buffer is used.
Wouldn't it be error-prone to have 0 defined like this?
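For reference, the "bytes_used == 0 selects the whole buffer" convention being questioned here would resolve like this on the consuming side (a hypothetical helper; only the struct layout is taken from the patch):

```c
#include <stdint.h>

/* Mirrors the proposed uapi struct */
struct iio_dmabuf {
	uint32_t fd;
	uint32_t flags;
	uint64_t bytes_used;
};

/*
 * Resolve the effective transfer length of an enqueue descriptor.
 * Per the documented convention, bytes_used == 0 means the full
 * DMABUF — so a caller that forgets to set bytes_used silently gets
 * a full-buffer transfer, which is the concern raised above.
 */
static uint64_t iio_dmabuf_xfer_len(const struct iio_dmabuf *desc,
				    uint64_t dmabuf_size)
{
	return desc->bytes_used ? desc->bytes_used : dmabuf_size;
}
```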
From: Alexandru Ardelean alexandru.ardelean@analog.com
A part of the logic in the iio_dma_buffer_exit() is required for the change to add mmap support to IIO buffers. This change splits the logic into a separate function, which will be re-used later.
Signed-off-by: Alexandru Ardelean alexandru.ardelean@analog.com Cc: Alexandru Ardelean ardeleanalex@gmail.com Signed-off-by: Paul Cercueil paul@crapouillou.net --- drivers/iio/buffer/industrialio-buffer-dma.c | 43 +++++++++++--------- 1 file changed, 24 insertions(+), 19 deletions(-)
diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
index a9f1b673374f..15ea7bc3ac08 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -374,6 +374,29 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
 }
 EXPORT_SYMBOL_GPL(iio_dma_buffer_request_update);
 
+static void iio_dma_buffer_fileio_free(struct iio_dma_buffer_queue *queue)
+{
+	unsigned int i;
+
+	spin_lock_irq(&queue->list_lock);
+	for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
+		if (!queue->fileio.blocks[i])
+			continue;
+		queue->fileio.blocks[i]->state = IIO_BLOCK_STATE_DEAD;
+	}
+	spin_unlock_irq(&queue->list_lock);
+
+	INIT_LIST_HEAD(&queue->incoming);
+
+	for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
+		if (!queue->fileio.blocks[i])
+			continue;
+		iio_buffer_block_put(queue->fileio.blocks[i]);
+		queue->fileio.blocks[i] = NULL;
+	}
+	queue->fileio.active_block = NULL;
+}
+
 static void iio_dma_buffer_submit_block(struct iio_dma_buffer_queue *queue,
 	struct iio_dma_buffer_block *block)
 {
@@ -694,27 +717,9 @@ EXPORT_SYMBOL_GPL(iio_dma_buffer_init);
  */
 void iio_dma_buffer_exit(struct iio_dma_buffer_queue *queue)
 {
-	unsigned int i;
-
 	mutex_lock(&queue->lock);
-	spin_lock_irq(&queue->list_lock);
-	for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
-		if (!queue->fileio.blocks[i])
-			continue;
-		queue->fileio.blocks[i]->state = IIO_BLOCK_STATE_DEAD;
-	}
-	spin_unlock_irq(&queue->list_lock);
-
-	INIT_LIST_HEAD(&queue->incoming);
-
-	for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) {
-		if (!queue->fileio.blocks[i])
-			continue;
-		iio_buffer_block_put(queue->fileio.blocks[i]);
-		queue->fileio.blocks[i] = NULL;
-	}
-	queue->fileio.active_block = NULL;
+	iio_dma_buffer_fileio_free(queue);
 	queue->ops = NULL;
 	mutex_unlock(&queue->lock);
Enhance the current fileio code by using DMABUF objects instead of custom buffers.
This adds more code than it removes, but: - a lot of the complexity can be dropped, e.g. custom kref and iio_buffer_block_put_atomic() are not needed anymore; - it will be much easier to introduce an API to export these DMABUF objects to userspace in a following patch.
Signed-off-by: Paul Cercueil paul@crapouillou.net --- drivers/iio/buffer/industrialio-buffer-dma.c | 192 ++++++++++++------- include/linux/iio/buffer-dma.h | 8 +- 2 files changed, 122 insertions(+), 78 deletions(-)
diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
index 15ea7bc3ac08..54e6000cd2ee 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -14,6 +14,7 @@
 #include <linux/poll.h>
 #include <linux/iio/buffer_impl.h>
 #include <linux/iio/buffer-dma.h>
+#include <linux/dma-buf.h>
 #include <linux/dma-mapping.h>
 #include <linux/sizes.h>
 
@@ -90,103 +91,145 @@
  * callback is called from within the custom callback.
  */
 
-static void iio_buffer_block_release(struct kref *kref)
-{
-	struct iio_dma_buffer_block *block = container_of(kref,
-		struct iio_dma_buffer_block, kref);
-
-	WARN_ON(block->state != IIO_BLOCK_STATE_DEAD);
-
-	dma_free_coherent(block->queue->dev, PAGE_ALIGN(block->size),
-			  block->vaddr, block->phys_addr);
-
-	iio_buffer_put(&block->queue->buffer);
-	kfree(block);
-}
-
-static void iio_buffer_block_get(struct iio_dma_buffer_block *block)
-{
-	kref_get(&block->kref);
-}
-
-static void iio_buffer_block_put(struct iio_dma_buffer_block *block)
-{
-	kref_put(&block->kref, iio_buffer_block_release);
-}
-
-/*
- * dma_free_coherent can sleep, hence we need to take some special care to be
- * able to drop a reference from an atomic context.
- */
-static LIST_HEAD(iio_dma_buffer_dead_blocks);
-static DEFINE_SPINLOCK(iio_dma_buffer_dead_blocks_lock);
-
-static void iio_dma_buffer_cleanup_worker(struct work_struct *work)
-{
-	struct iio_dma_buffer_block *block, *_block;
-	LIST_HEAD(block_list);
-
-	spin_lock_irq(&iio_dma_buffer_dead_blocks_lock);
-	list_splice_tail_init(&iio_dma_buffer_dead_blocks, &block_list);
-	spin_unlock_irq(&iio_dma_buffer_dead_blocks_lock);
-
-	list_for_each_entry_safe(block, _block, &block_list, head)
-		iio_buffer_block_release(&block->kref);
-}
-static DECLARE_WORK(iio_dma_buffer_cleanup_work, iio_dma_buffer_cleanup_worker);
-
-static void iio_buffer_block_release_atomic(struct kref *kref)
-{
+struct iio_buffer_dma_buf_attachment {
+	struct scatterlist sgl;
+	struct sg_table sg_table;
 	struct iio_dma_buffer_block *block;
-	unsigned long flags;
-
-	block = container_of(kref, struct iio_dma_buffer_block, kref);
-
-	spin_lock_irqsave(&iio_dma_buffer_dead_blocks_lock, flags);
-	list_add_tail(&block->head, &iio_dma_buffer_dead_blocks);
-	spin_unlock_irqrestore(&iio_dma_buffer_dead_blocks_lock, flags);
-
-	schedule_work(&iio_dma_buffer_cleanup_work);
-}
-
-/*
- * Version of iio_buffer_block_put() that can be called from atomic context
- */
-static void iio_buffer_block_put_atomic(struct iio_dma_buffer_block *block)
-{
-	kref_put(&block->kref, iio_buffer_block_release_atomic);
-}
+};
 
 static struct iio_dma_buffer_queue *iio_buffer_to_queue(struct iio_buffer *buf)
 {
 	return container_of(buf, struct iio_dma_buffer_queue, buffer);
 }
 
+static struct iio_buffer_dma_buf_attachment *
+to_iio_buffer_dma_buf_attachment(struct sg_table *table)
+{
+	return container_of(table, struct iio_buffer_dma_buf_attachment, sg_table);
+}
+
+static void iio_buffer_block_get(struct iio_dma_buffer_block *block)
+{
+	get_dma_buf(block->dmabuf);
+}
+
+static void iio_buffer_block_put(struct iio_dma_buffer_block *block)
+{
+	dma_buf_put(block->dmabuf);
+}
+
+static int iio_buffer_dma_buf_attach(struct dma_buf *dbuf,
+				     struct dma_buf_attachment *at)
+{
+	at->priv = dbuf->priv;
+
+	return 0;
+}
+
+static struct sg_table *iio_buffer_dma_buf_map(struct dma_buf_attachment *at,
+					       enum dma_data_direction dma_dir)
+{
+	struct iio_dma_buffer_block *block = at->priv;
+	struct iio_buffer_dma_buf_attachment *dba;
+	int ret;
+
+	dba = kzalloc(sizeof(*dba), GFP_KERNEL);
+	if (!dba)
+		return ERR_PTR(-ENOMEM);
+
+	sg_init_one(&dba->sgl, block->vaddr, PAGE_ALIGN(block->size));
+	dba->sg_table.sgl = &dba->sgl;
+	dba->sg_table.nents = 1;
+	dba->block = block;
+
+	ret = dma_map_sgtable(at->dev, &dba->sg_table, dma_dir, 0);
+	if (ret) {
+		kfree(dba);
+		return ERR_PTR(ret);
+	}
+
+	return &dba->sg_table;
+}
+
+static void iio_buffer_dma_buf_unmap(struct dma_buf_attachment *at,
+				     struct sg_table *sg_table,
+				     enum dma_data_direction dma_dir)
+{
+	struct iio_buffer_dma_buf_attachment *dba =
+		to_iio_buffer_dma_buf_attachment(sg_table);
+
+	dma_unmap_sgtable(at->dev, &dba->sg_table, dma_dir, 0);
+	kfree(dba);
+}
+
+static void iio_buffer_dma_buf_release(struct dma_buf *dbuf)
+{
+	struct iio_dma_buffer_block *block = dbuf->priv;
+	struct iio_dma_buffer_queue *queue = block->queue;
+
+	WARN_ON(block->state != IIO_BLOCK_STATE_DEAD);
+
+	mutex_lock(&queue->lock);
+
+	dma_free_coherent(queue->dev, PAGE_ALIGN(block->size),
+			  block->vaddr, block->phys_addr);
+	kfree(block);
+
+	mutex_unlock(&queue->lock);
+	iio_buffer_put(&queue->buffer);
+}
+
+static const struct dma_buf_ops iio_dma_buffer_dmabuf_ops = {
+	.attach		= iio_buffer_dma_buf_attach,
+	.map_dma_buf	= iio_buffer_dma_buf_map,
+	.unmap_dma_buf	= iio_buffer_dma_buf_unmap,
+	.release	= iio_buffer_dma_buf_release,
+};
+
 static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block(
 	struct iio_dma_buffer_queue *queue, size_t size)
 {
 	struct iio_dma_buffer_block *block;
+	DEFINE_DMA_BUF_EXPORT_INFO(einfo);
+	struct dma_buf *dmabuf;
+	int err = -ENOMEM;
 
 	block = kzalloc(sizeof(*block), GFP_KERNEL);
 	if (!block)
-		return NULL;
+		return ERR_PTR(err);
 
 	block->vaddr = dma_alloc_coherent(queue->dev, PAGE_ALIGN(size),
 		&block->phys_addr, GFP_KERNEL);
-	if (!block->vaddr) {
-		kfree(block);
-		return NULL;
+	if (!block->vaddr)
+		goto err_free_block;
+
+	einfo.ops = &iio_dma_buffer_dmabuf_ops;
+	einfo.size = PAGE_ALIGN(size);
+	einfo.priv = block;
+	einfo.flags = O_RDWR;
+
+	dmabuf = dma_buf_export(&einfo);
+	if (IS_ERR(dmabuf)) {
+		err = PTR_ERR(dmabuf);
+		goto err_free_dma;
 	}
 
+	block->dmabuf = dmabuf;
 	block->size = size;
 	block->state = IIO_BLOCK_STATE_DONE;
 	block->queue = queue;
 	INIT_LIST_HEAD(&block->head);
-	kref_init(&block->kref);
 
 	iio_buffer_get(&queue->buffer);
 
 	return block;
+
+err_free_dma:
+	dma_free_coherent(queue->dev, PAGE_ALIGN(size),
+			  block->vaddr, block->phys_addr);
+err_free_block:
+	kfree(block);
+	return ERR_PTR(err);
 }
 
 static void _iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
@@ -223,7 +266,7 @@ void iio_dma_buffer_block_done(struct iio_dma_buffer_block *block)
 	_iio_dma_buffer_block_done(block);
 	spin_unlock_irqrestore(&queue->list_lock, flags);
 
-	iio_buffer_block_put_atomic(block);
+	iio_buffer_block_put(block);
 	iio_dma_buffer_queue_wake(queue);
 }
 EXPORT_SYMBOL_GPL(iio_dma_buffer_block_done);
@@ -249,7 +292,8 @@ void iio_dma_buffer_block_list_abort(struct iio_dma_buffer_queue *queue,
 		list_del(&block->head);
 		block->bytes_used = 0;
 		_iio_dma_buffer_block_done(block);
-		iio_buffer_block_put_atomic(block);
+
+		iio_buffer_block_put(block);
 	}
 	spin_unlock_irqrestore(&queue->list_lock, flags);
 
@@ -340,8 +384,8 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
 
 		if (!block) {
 			block = iio_dma_buffer_alloc_block(queue, size);
-			if (!block) {
-				ret = -ENOMEM;
+			if (IS_ERR(block)) {
+				ret = PTR_ERR(block);
 				goto out_unlock;
 			}
 			queue->fileio.blocks[i] = block;
diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h
index 490b93f76fa8..6b3fa7d2124b 100644
--- a/include/linux/iio/buffer-dma.h
+++ b/include/linux/iio/buffer-dma.h
@@ -8,7 +8,6 @@
 #define __INDUSTRIALIO_DMA_BUFFER_H__
 
 #include <linux/list.h>
-#include <linux/kref.h>
 #include <linux/spinlock.h>
 #include <linux/mutex.h>
 #include <linux/iio/buffer_impl.h>
@@ -16,6 +15,7 @@
 struct iio_dma_buffer_queue;
 struct iio_dma_buffer_ops;
 struct device;
+struct dma_buf;
 
 /**
  * enum iio_block_state - State of a struct iio_dma_buffer_block
@@ -39,8 +39,8 @@ enum iio_block_state {
  * @vaddr: Virutal address of the blocks memory
  * @phys_addr: Physical address of the blocks memory
  * @queue: Parent DMA buffer queue
- * @kref: kref used to manage the lifetime of block
  * @state: Current state of the block
+ * @dmabuf: Underlying DMABUF object
  */
 struct iio_dma_buffer_block {
 	/* May only be accessed by the owner of the block */
@@ -56,13 +56,13 @@ struct iio_dma_buffer_block {
 	size_t size;
 	struct iio_dma_buffer_queue *queue;
 
-	/* Must not be accessed outside the core. */
-	struct kref kref;
 	/*
 	 * Must not be accessed outside the core. Access needs to hold
 	 * queue->list_lock if the block is not owned by the core.
	 */
 	enum iio_block_state state;
+
+	struct dma_buf *dmabuf;
 };
 
 /**
On Mon, 7 Feb 2022 12:59:28 +0000 Paul Cercueil paul@crapouillou.net wrote:
Enhance the current fileio code by using DMABUF objects instead of custom buffers.
This adds more code than it removes, but:
- a lot of the complexity can be dropped, e.g. custom kref and iio_buffer_block_put_atomic() are not needed anymore;
- it will be much easier to introduce an API to export these DMABUF objects to userspace in a following patch.
Signed-off-by: Paul Cercueil paul@crapouillou.net
Hi Paul,
I'm a bit rusty on dma mappings, but you seem to have a mixture of streaming and coherent mappings going on in here.
Is it the case that the current code is using the coherent mappings and a potential 'other user' of the dma buffer might need streaming mappings?
Jonathan
On 28.03.22 at 19:54, Jonathan Cameron wrote:
On Mon, 7 Feb 2022 12:59:28 +0000 Paul Cercueil paul@crapouillou.net wrote:
Enhance the current fileio code by using DMABUF objects instead of custom buffers.
This adds more code than it removes, but:
- a lot of the complexity can be dropped, e.g. custom kref and iio_buffer_block_put_atomic() are not needed anymore;
- it will be much easier to introduce an API to export these DMABUF objects to userspace in a following patch.
Signed-off-by: Paul Cercueil paul@crapouillou.net
Hi Paul,
I'm a bit rusty on dma mappings, but you seem to have a mixture of streaming and coherent mappings going on in here.
Is it the case that the current code is using the coherent mappings and a potential 'other user' of the dma buffer might need streaming mappings?
Streaming mappings are generally not supported by DMA-buf.
You always have only coherent mappings.
Regards, Christian.
Hi Jonathan,
On Mon, Mar 28 2022 at 18:54:25 +0100, Jonathan Cameron jic23@kernel.org wrote:
On Mon, 7 Feb 2022 12:59:28 +0000 Paul Cercueil paul@crapouillou.net wrote:
Enhance the current fileio code by using DMABUF objects instead of custom buffers.
This adds more code than it removes, but:
- a lot of the complexity can be dropped, e.g. custom kref and iio_buffer_block_put_atomic() are not needed anymore;
- it will be much easier to introduce an API to export these DMABUF objects to userspace in a following patch.
Signed-off-by: Paul Cercueil paul@crapouillou.net
Hi Paul,
I'm a bit rusty on dma mappings, but you seem to have a mixture of streaming and coherent mappings going on in here.
That's OK, so am I. What do you call "streaming mappings"?
Is it the case that the current code is using the coherent mappings and a potential 'other user' of the dma buffer might need streaming mappings?
Something like that. There are two different things; in both cases, userspace needs to create a DMABUF with IIO_BUFFER_DMABUF_ALLOC_IOCTL, and the backing memory is allocated with dma_alloc_coherent().
- For the userspace interface, you then have a "cpu access" IOCTL (DMA_BUF_IOCTL_SYNC), that allows userspace to inform when it will start/finish to process the buffer in user-space (which will sync/invalidate the data cache if needed). A buffer can then be enqueued for DMA processing (TX or RX) with the new IIO_BUFFER_DMABUF_ENQUEUE_IOCTL.
- When the DMABUF created via the IIO core is sent to another driver through the driver's custom DMABUF import function, this driver will call dma_buf_attach(), which will call iio_buffer_dma_buf_map(). Since it has to return a "struct sg_table *", this function then simply creates a sgtable with one entry that points to the backing memory.
Note that I added the iio_buffer_dma_buf_map() / _unmap() functions because the dma-buf core would WARN() if these were not provided. But since this code doesn't yet support importing/exporting DMABUFs to other drivers, these are never called, and I should probably just make them return an ERR_PTR() unconditionally.
Cheers, -Paul
On Mon, Mar 28, 2022 at 11:30 PM Paul Cercueil paul@crapouillou.net wrote:
Le lun., mars 28 2022 at 18:54:25 +0100, Jonathan Cameron jic23@kernel.org a écrit :
On Mon, 7 Feb 2022 12:59:28 +0000 Paul Cercueil paul@crapouillou.net wrote:
Enhance the current fileio code by using DMABUF objects instead of custom buffers.
This adds more code than it removes, but:
- a lot of the complexity can be dropped, e.g. custom kref and iio_buffer_block_put_atomic() are not needed anymore;
- it will be much easier to introduce an API to export these DMABUF objects to userspace in a following patch.
I'm a bit rusty on dma mappings, but you seem to have a mixture of streaming and coherent mappings going on in here.
That's OK, so am I. What do you call "streaming mappings"?
dma_*_coherent() is for coherent mappings: you usually allocate and map once, and cache coherency is then guaranteed for both device and CPU accesses to that memory. dma_map_*() is for streaming mappings, where arbitrary pages are mapped for the duration of each transfer; this is typically used when you want to keep the previous data around while handling newly arriving data, or when the new data is supplied at a different virtual address and therefore has to be mapped for DMA each time.
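Concretely, the two API families look like this (a non-compilable kernel-side sketch; dev, buf and size are placeholders):

```c
/*
 * Coherent: allocate once; CPU and device can both access the memory for
 * the lifetime of the mapping, and coherency is handled for you.
 */
vaddr = dma_alloc_coherent(dev, size, &dma_handle, GFP_KERNEL);
/* ... CPU and device may both use vaddr / dma_handle ... */
dma_free_coherent(dev, size, vaddr, dma_handle);

/*
 * Streaming: map an existing buffer around each transfer, with an
 * explicit ownership handoff (while mapped, the device owns it).
 */
addr = dma_map_single(dev, buf, size, DMA_FROM_DEVICE);
if (dma_mapping_error(dev, addr))
	return -ENOMEM;
/* ... DMA runs here; the CPU must not touch buf ... */
dma_unmap_single(dev, addr, size, DMA_FROM_DEVICE);
```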
Is it the case that the current code is using the coherent mappings and a potential 'other user' of the dma buffer might need streaming mappings?
Something like that. There are two different things; on both cases, userspace needs to create a DMABUF with IIO_BUFFER_DMABUF_ALLOC_IOCTL, and the backing memory is allocated with dma_alloc_coherent().
- For the userspace interface, you then have a "cpu access" IOCTL
(DMA_BUF_IOCTL_SYNC), that allows userspace to inform when it will start/finish to process the buffer in user-space (which will sync/invalidate the data cache if needed). A buffer can then be enqueued for DMA processing (TX or RX) with the new IIO_BUFFER_DMABUF_ENQUEUE_IOCTL.
- When the DMABUF created via the IIO core is sent to another driver
through the driver's custom DMABUF import function, this driver will call dma_buf_attach(), which will call iio_buffer_dma_buf_map(). Since it has to return a "struct sg_table *", this function then simply creates a sgtable with one entry that points to the backing memory.
...
+	ret = dma_map_sgtable(at->dev, &dba->sg_table, dma_dir, 0);
+	if (ret) {
+		kfree(dba);
+		return ERR_PTR(ret);
+	}
Missed DMA mapping error check.
+	return &dba->sg_table;
+}
...
-	/* Must not be accessed outside the core. */
-	struct kref kref;

+	struct dma_buf *dmabuf;
Is it okay to access this outside the core? If not, why did you remove the comment rather than updating it?
Implement the two functions iio_dma_buffer_alloc_dmabuf() and iio_dma_buffer_enqueue_dmabuf(), as well as all the necessary bits to enable userspace access to the DMABUF objects.
These two functions are exported as GPL symbols so that IIO buffer implementations can support the new DMABUF based userspace API.
Signed-off-by: Paul Cercueil paul@crapouillou.net
---
 drivers/iio/buffer/industrialio-buffer-dma.c | 260 ++++++++++++++++++-
 include/linux/iio/buffer-dma.h               |  13 +
 2 files changed, 266 insertions(+), 7 deletions(-)
diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c
index 54e6000cd2ee..b9c3b01c5ea0 100644
--- a/drivers/iio/buffer/industrialio-buffer-dma.c
+++ b/drivers/iio/buffer/industrialio-buffer-dma.c
@@ -15,7 +15,9 @@
 #include <linux/iio/buffer_impl.h>
 #include <linux/iio/buffer-dma.h>
 #include <linux/dma-buf.h>
+#include <linux/dma-fence.h>
 #include <linux/dma-mapping.h>
+#include <linux/dma-resv.h>
 #include <linux/sizes.h>

 /*
@@ -97,6 +99,18 @@ struct iio_buffer_dma_buf_attachment {
 	struct iio_dma_buffer_block *block;
 };

+struct iio_buffer_dma_fence {
+	struct dma_fence base;
+	struct iio_dma_buffer_block *block;
+	spinlock_t lock;
+};
+
+static struct iio_buffer_dma_fence *
+to_iio_buffer_dma_fence(struct dma_fence *fence)
+{
+	return container_of(fence, struct iio_buffer_dma_fence, base);
+}
+
 static struct iio_dma_buffer_queue *iio_buffer_to_queue(struct iio_buffer *buf)
 {
 	return container_of(buf, struct iio_dma_buffer_queue, buffer);
@@ -118,6 +132,48 @@ static void iio_buffer_block_put(struct iio_dma_buffer_block *block)
 	dma_buf_put(block->dmabuf);
 }

+static const char *
+iio_buffer_dma_fence_get_driver_name(struct dma_fence *fence)
+{
+	struct iio_buffer_dma_fence *iio_fence = to_iio_buffer_dma_fence(fence);
+
+	return dev_name(iio_fence->block->queue->dev);
+}
+
+static void iio_buffer_dma_fence_release(struct dma_fence *fence)
+{
+	struct iio_buffer_dma_fence *iio_fence = to_iio_buffer_dma_fence(fence);
+
+	kfree(iio_fence);
+}
+
+static const struct dma_fence_ops iio_buffer_dma_fence_ops = {
+	.get_driver_name = iio_buffer_dma_fence_get_driver_name,
+	.get_timeline_name = iio_buffer_dma_fence_get_driver_name,
+	.release = iio_buffer_dma_fence_release,
+};
+
+static struct dma_fence *
+iio_dma_buffer_create_dma_fence(struct iio_dma_buffer_block *block)
+{
+	struct iio_buffer_dma_fence *fence;
+	u64 ctx;
+
+	fence = kzalloc(sizeof(*fence), GFP_KERNEL);
+	if (!fence)
+		return ERR_PTR(-ENOMEM);
+
+	fence->block = block;
+	spin_lock_init(&fence->lock);
+
+	ctx = dma_fence_context_alloc(1);
+
+	dma_fence_init(&fence->base, &iio_buffer_dma_fence_ops,
+		       &fence->lock, ctx, 0);
+
+	return &fence->base;
+}
+
 static int iio_buffer_dma_buf_attach(struct dma_buf *dbuf,
 				     struct dma_buf_attachment *at)
 {
@@ -162,10 +218,26 @@ static void iio_buffer_dma_buf_unmap(struct dma_buf_attachment *at,
 	kfree(dba);
 }

+static int iio_buffer_dma_buf_mmap(struct dma_buf *dbuf,
+				   struct vm_area_struct *vma)
+{
+	struct iio_dma_buffer_block *block = dbuf->priv;
+	struct device *dev = block->queue->dev;
+
+	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
+
+	if (vma->vm_ops->open)
+		vma->vm_ops->open(vma);
+
+	return dma_mmap_coherent(dev, vma, block->vaddr, block->phys_addr,
+				 vma->vm_end - vma->vm_start);
+}
+
 static void iio_buffer_dma_buf_release(struct dma_buf *dbuf)
 {
 	struct iio_dma_buffer_block *block = dbuf->priv;
 	struct iio_dma_buffer_queue *queue = block->queue;
+	bool is_fileio = block->fileio;
WARN_ON(block->state != IIO_BLOCK_STATE_DEAD);
@@ -175,6 +247,9 @@ static void iio_buffer_dma_buf_release(struct dma_buf *dbuf) block->vaddr, block->phys_addr); kfree(block);
+ queue->num_blocks--; + if (is_fileio) + queue->num_fileio_blocks--; mutex_unlock(&queue->lock); iio_buffer_put(&queue->buffer); } @@ -183,11 +258,12 @@ static const struct dma_buf_ops iio_dma_buffer_dmabuf_ops = { .attach = iio_buffer_dma_buf_attach, .map_dma_buf = iio_buffer_dma_buf_map, .unmap_dma_buf = iio_buffer_dma_buf_unmap, + .mmap = iio_buffer_dma_buf_mmap, .release = iio_buffer_dma_buf_release, };
static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block( - struct iio_dma_buffer_queue *queue, size_t size) + struct iio_dma_buffer_queue *queue, size_t size, bool fileio) { struct iio_dma_buffer_block *block; DEFINE_DMA_BUF_EXPORT_INFO(einfo); @@ -218,10 +294,15 @@ static struct iio_dma_buffer_block *iio_dma_buffer_alloc_block( block->size = size; block->state = IIO_BLOCK_STATE_DONE; block->queue = queue; + block->fileio = fileio; INIT_LIST_HEAD(&block->head);
iio_buffer_get(&queue->buffer);
+	queue->num_blocks++;
+	if (fileio)
+		queue->num_fileio_blocks++;
+
 	return block;
err_free_dma: @@ -260,14 +341,23 @@ static void iio_dma_buffer_queue_wake(struct iio_dma_buffer_queue *queue) void iio_dma_buffer_block_done(struct iio_dma_buffer_block *block) { struct iio_dma_buffer_queue *queue = block->queue; + struct dma_resv *resv = block->dmabuf->resv; + struct dma_fence *fence; unsigned long flags;
spin_lock_irqsave(&queue->list_lock, flags); _iio_dma_buffer_block_done(block); spin_unlock_irqrestore(&queue->list_lock, flags);
+	fence = dma_resv_excl_fence(resv);
+	if (fence)
+		dma_fence_signal(fence);
+	dma_resv_unlock(resv);
+
 	iio_buffer_block_put(block);
-	iio_dma_buffer_queue_wake(queue);
+
+	if (queue->fileio.enabled)
+		iio_dma_buffer_queue_wake(queue);
 }
 EXPORT_SYMBOL_GPL(iio_dma_buffer_block_done);
@@ -293,6 +383,9 @@ void iio_dma_buffer_block_list_abort(struct iio_dma_buffer_queue *queue, block->bytes_used = 0; _iio_dma_buffer_block_done(block);
+ if (dma_resv_is_locked(block->dmabuf->resv)) + dma_resv_unlock(block->dmabuf->resv); + iio_buffer_block_put(block); } spin_unlock_irqrestore(&queue->list_lock, flags); @@ -317,6 +410,12 @@ static bool iio_dma_block_reusable(struct iio_dma_buffer_block *block) } }
+static bool iio_dma_buffer_fileio_mode(struct iio_dma_buffer_queue *queue) +{ + return queue->fileio.enabled || + queue->num_blocks == queue->num_fileio_blocks; +} + /** * iio_dma_buffer_request_update() - DMA buffer request_update callback * @buffer: The buffer which to request an update @@ -343,6 +442,12 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer)
mutex_lock(&queue->lock);
+ queue->fileio.enabled = iio_dma_buffer_fileio_mode(queue); + + /* If DMABUFs were created, disable fileio interface */ + if (!queue->fileio.enabled) + goto out_unlock; + /* Allocations are page aligned */ if (PAGE_ALIGN(queue->fileio.block_size) == PAGE_ALIGN(size)) try_reuse = true; @@ -383,7 +488,7 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer) }
if (!block) { - block = iio_dma_buffer_alloc_block(queue, size); + block = iio_dma_buffer_alloc_block(queue, size, true); if (IS_ERR(block)) { ret = PTR_ERR(block); goto out_unlock; @@ -403,12 +508,10 @@ int iio_dma_buffer_request_update(struct iio_buffer *buffer) * iio_dma_buffer_enable() will submit it. Otherwise mark it as * done, which means it's ready to be dequeued. */ - if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN) { + if (queue->buffer.direction == IIO_BUFFER_DIRECTION_IN) block->state = IIO_BLOCK_STATE_QUEUED; - list_add_tail(&block->head, &queue->incoming); - } else { + else block->state = IIO_BLOCK_STATE_DONE; - } }
out_unlock: @@ -456,6 +559,8 @@ static void iio_dma_buffer_submit_block(struct iio_dma_buffer_queue *queue,
block->state = IIO_BLOCK_STATE_ACTIVE; iio_buffer_block_get(block); + dma_resv_lock(block->dmabuf->resv, NULL); + ret = queue->ops->submit(queue, block); if (ret) { /* @@ -487,12 +592,31 @@ int iio_dma_buffer_enable(struct iio_buffer *buffer, { struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buffer); struct iio_dma_buffer_block *block, *_block; + unsigned int i;
mutex_lock(&queue->lock); queue->active = true; + queue->fileio.next_dequeue = 0; + queue->fileio.enabled = iio_dma_buffer_fileio_mode(queue); + + dev_dbg(queue->dev, "Buffer enabled in %s mode\n", + queue->fileio.enabled ? "fileio" : "dmabuf"); + + if (queue->fileio.enabled) { + for (i = 0; i < ARRAY_SIZE(queue->fileio.blocks); i++) { + block = queue->fileio.blocks[i]; + + if (block->state == IIO_BLOCK_STATE_QUEUED) { + iio_buffer_block_get(block); + list_add_tail(&block->head, &queue->incoming); + } + } + } + list_for_each_entry_safe(block, _block, &queue->incoming, head) { list_del(&block->head); iio_dma_buffer_submit_block(queue, block); + iio_buffer_block_put(block); } mutex_unlock(&queue->lock);
@@ -514,6 +638,7 @@ int iio_dma_buffer_disable(struct iio_buffer *buffer, struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buffer);
mutex_lock(&queue->lock); + queue->fileio.enabled = false; queue->active = false;
if (queue->ops && queue->ops->abort) @@ -533,6 +658,7 @@ static void iio_dma_buffer_enqueue(struct iio_dma_buffer_queue *queue, iio_dma_buffer_submit_block(queue, block); } else { block->state = IIO_BLOCK_STATE_QUEUED; + iio_buffer_block_get(block); list_add_tail(&block->head, &queue->incoming); } } @@ -573,6 +699,11 @@ static int iio_dma_buffer_io(struct iio_buffer *buffer,
mutex_lock(&queue->lock);
+ if (!queue->fileio.enabled) { + ret = -EBUSY; + goto out_unlock; + } + if (!queue->fileio.active_block) { block = iio_dma_buffer_dequeue(queue); if (block == NULL) { @@ -688,6 +819,121 @@ size_t iio_dma_buffer_data_available(struct iio_buffer *buf) } EXPORT_SYMBOL_GPL(iio_dma_buffer_data_available);
+int iio_dma_buffer_alloc_dmabuf(struct iio_buffer *buffer, + struct iio_dmabuf_alloc_req *req) +{ + struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buffer); + struct iio_dma_buffer_block *block; + int ret = 0; + + mutex_lock(&queue->lock); + + /* + * If the buffer is enabled and in fileio mode new blocks can't be + * allocated. + */ + if (queue->fileio.enabled) { + ret = -EBUSY; + goto out_unlock; + } + + if (!req->size || req->size > SIZE_MAX) { + ret = -EINVAL; + goto out_unlock; + } + + /* Free memory that might be in use for fileio mode */ + iio_dma_buffer_fileio_free(queue); + + block = iio_dma_buffer_alloc_block(queue, req->size, false); + if (IS_ERR(block)) { + ret = PTR_ERR(block); + goto out_unlock; + } + + ret = dma_buf_fd(block->dmabuf, O_CLOEXEC); + if (ret < 0) { + dma_buf_put(block->dmabuf); + goto out_unlock; + } + +out_unlock: + mutex_unlock(&queue->lock); + + return ret; +} +EXPORT_SYMBOL_GPL(iio_dma_buffer_alloc_dmabuf); + +int iio_dma_buffer_enqueue_dmabuf(struct iio_buffer *buffer, + struct iio_dmabuf *iio_dmabuf) +{ + struct iio_dma_buffer_queue *queue = iio_buffer_to_queue(buffer); + struct iio_dma_buffer_block *dma_block; + struct dma_fence *fence; + struct dma_buf *dmabuf; + int ret = 0; + + mutex_lock(&queue->lock); + + /* If in fileio mode buffers can't be enqueued. 
*/ + if (queue->fileio.enabled) { + ret = -EBUSY; + goto out_unlock; + } + + dmabuf = dma_buf_get(iio_dmabuf->fd); + if (IS_ERR(dmabuf)) { + ret = PTR_ERR(dmabuf); + goto out_unlock; + } + + if (dmabuf->ops != &iio_dma_buffer_dmabuf_ops) { + dev_err(queue->dev, "importing DMABUFs from other drivers is not yet supported.\n"); + ret = -EINVAL; + goto out_dma_buf_put; + } + + dma_block = dmabuf->priv; + + if (iio_dmabuf->bytes_used > dma_block->size) { + ret = -EINVAL; + goto out_dma_buf_put; + } + + dma_block->bytes_used = iio_dmabuf->bytes_used ?: dma_block->size; + + switch (dma_block->state) { + case IIO_BLOCK_STATE_QUEUED: + /* Nothing to do */ + goto out_unlock; + case IIO_BLOCK_STATE_DONE: + break; + default: + ret = -EBUSY; + goto out_dma_buf_put; + } + + fence = iio_dma_buffer_create_dma_fence(dma_block); + if (IS_ERR(fence)) { + ret = PTR_ERR(fence); + goto out_dma_buf_put; + } + + dma_resv_lock(dmabuf->resv, NULL); + dma_resv_add_excl_fence(dmabuf->resv, fence); + dma_resv_unlock(dmabuf->resv); + + iio_dma_buffer_enqueue(queue, dma_block); + +out_dma_buf_put: + dma_buf_put(dmabuf); +out_unlock: + mutex_unlock(&queue->lock); + + return ret; +} +EXPORT_SYMBOL_GPL(iio_dma_buffer_enqueue_dmabuf); + /** * iio_dma_buffer_set_bytes_per_datum() - DMA buffer set_bytes_per_datum callback * @buffer: Buffer to set the bytes-per-datum for diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h index 6b3fa7d2124b..5bd687132355 100644 --- a/include/linux/iio/buffer-dma.h +++ b/include/linux/iio/buffer-dma.h @@ -40,6 +40,7 @@ enum iio_block_state { * @phys_addr: Physical address of the blocks memory * @queue: Parent DMA buffer queue * @state: Current state of the block + * @fileio: True if this buffer is used for fileio mode * @dmabuf: Underlying DMABUF object */ struct iio_dma_buffer_block { @@ -62,6 +63,7 @@ struct iio_dma_buffer_block { */ enum iio_block_state state;
+	bool fileio;
 	struct dma_buf *dmabuf;
 };
@@ -72,6 +74,7 @@ struct iio_dma_buffer_block { * @pos: Read offset in the active block * @block_size: Size of each block * @next_dequeue: index of next block that will be dequeued + * @enabled: Whether the buffer is operating in fileio mode */ struct iio_dma_buffer_queue_fileio { struct iio_dma_buffer_block *blocks[2]; @@ -80,6 +83,7 @@ struct iio_dma_buffer_queue_fileio { size_t block_size;
 	unsigned int next_dequeue;
+	bool enabled;
 };
/** @@ -95,6 +99,8 @@ struct iio_dma_buffer_queue_fileio { * the DMA controller * @incoming: List of buffers on the incoming queue * @active: Whether the buffer is currently active + * @num_blocks: Total number of blocks in the queue + * @num_fileio_blocks: Number of blocks used for fileio interface * @fileio: FileIO state */ struct iio_dma_buffer_queue { @@ -107,6 +113,8 @@ struct iio_dma_buffer_queue { struct list_head incoming;
 	bool active;
+	unsigned int num_blocks;
+	unsigned int num_fileio_blocks;
struct iio_dma_buffer_queue_fileio fileio; }; @@ -149,4 +157,9 @@ static inline size_t iio_dma_buffer_space_available(struct iio_buffer *buffer) return iio_dma_buffer_data_available(buffer); }
+int iio_dma_buffer_alloc_dmabuf(struct iio_buffer *buffer,
+				struct iio_dmabuf_alloc_req *req);
+int iio_dma_buffer_enqueue_dmabuf(struct iio_buffer *buffer,
+				  struct iio_dmabuf *dmabuf);
+
 #endif
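From userspace, the counterpart of iio_dma_buffer_alloc_dmabuf() is the alloc ioctl followed by an mmap() of the returned DMABUF fd. A hedged sketch follows; the request layout and ioctl number are assumptions taken from this series' uapi header, not verified values.

```c
#include <stdint.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/* Assumed uapi layout from this series. */
struct iio_dmabuf_alloc_req {
	uint64_t size;
	uint64_t resv;
};
/* Hypothetical ioctl number -- check include/uapi/linux/iio/buffer.h. */
#define IIO_BUFFER_DMABUF_ALLOC_IOCTL _IOW('i', 0x92, struct iio_dmabuf_alloc_req)

/* Allocate one DMABUF on the buffer fd and map it into the process.
 * Returns the mapping (or MAP_FAILED) and stores the new fd in *dmabuf_fd. */
static void *alloc_and_map_block(int buf_fd, size_t size, int *dmabuf_fd)
{
	struct iio_dmabuf_alloc_req req = { .size = size };
	int fd;

	/* On success the ioctl returns the new DMABUF's file descriptor. */
	fd = ioctl(buf_fd, IIO_BUFFER_DMABUF_ALLOC_IOCTL, &req);
	if (fd < 0)
		return MAP_FAILED;

	*dmabuf_fd = fd;
	return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

Closing the DMABUF fd (after munmap) is what eventually drops the block, per the release path in the patch above.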
Use the functions provided by the buffer-dma core to implement the DMABUF userspace API in the buffer-dmaengine IIO buffer implementation.
Signed-off-by: Paul Cercueil paul@crapouillou.net --- drivers/iio/buffer/industrialio-buffer-dmaengine.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/iio/buffer/industrialio-buffer-dmaengine.c b/drivers/iio/buffer/industrialio-buffer-dmaengine.c index 5cde8fd81c7f..57a8b2e4ba3c 100644 --- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c +++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c @@ -133,6 +133,9 @@ static const struct iio_buffer_access_funcs iio_dmaengine_buffer_ops = { .space_available = iio_dma_buffer_space_available, .release = iio_dmaengine_buffer_release,
+	.alloc_dmabuf = iio_dma_buffer_alloc_dmabuf,
+	.enqueue_dmabuf = iio_dma_buffer_enqueue_dmabuf,
+
 	.modes = INDIO_BUFFER_HARDWARE,
 	.flags = INDIO_BUFFER_FLAG_FIXED_WATERMARK,
 };
Introduce a new flag IIO_BUFFER_DMABUF_CYCLIC in the "flags" field of the iio_dmabuf uapi structure.
When set, the DMABUF enqueued with the enqueue ioctl will be endlessly repeated on the TX output, until the buffer is disabled.
Signed-off-by: Paul Cercueil paul@crapouillou.net Reviewed-by: Alexandru Ardelean ardeleanalex@gmail.com --- drivers/iio/industrialio-buffer.c | 5 +++++ include/uapi/linux/iio/buffer.h | 3 ++- 2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c index 72f333a519bc..85331cedaad8 100644 --- a/drivers/iio/industrialio-buffer.c +++ b/drivers/iio/industrialio-buffer.c @@ -1535,6 +1535,11 @@ static int iio_buffer_enqueue_dmabuf(struct iio_buffer *buffer, if (dmabuf.flags & ~IIO_BUFFER_DMABUF_SUPPORTED_FLAGS) return -EINVAL;
+	/* Cyclic flag is only supported on output buffers */
+	if ((dmabuf.flags & IIO_BUFFER_DMABUF_CYCLIC) &&
+	    buffer->direction != IIO_BUFFER_DIRECTION_OUT)
+		return -EINVAL;
+
 	return buffer->access->enqueue_dmabuf(buffer, &dmabuf);
 }
diff --git a/include/uapi/linux/iio/buffer.h b/include/uapi/linux/iio/buffer.h index e4621b926262..2d541d038c02 100644 --- a/include/uapi/linux/iio/buffer.h +++ b/include/uapi/linux/iio/buffer.h @@ -7,7 +7,8 @@
#include <linux/types.h>
-#define IIO_BUFFER_DMABUF_SUPPORTED_FLAGS	0x00000000
+#define IIO_BUFFER_DMABUF_CYCLIC		(1 << 0)
+#define IIO_BUFFER_DMABUF_SUPPORTED_FLAGS	0x00000001
/** * struct iio_dmabuf_alloc_req - Descriptor for allocating IIO DMABUFs
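As an illustration of how userspace might request cyclic playback on a TX buffer with the new flag (the struct layout and ioctl number are assumptions based on this series' uapi header):

```c
#include <stdint.h>
#include <sys/ioctl.h>

#define IIO_BUFFER_DMABUF_CYCLIC (1 << 0)

/* Assumed uapi layout from this series. */
struct iio_dmabuf {
	uint32_t fd;
	uint32_t flags;
	uint64_t bytes_used;
};
/* Hypothetical ioctl number -- check include/uapi/linux/iio/buffer.h. */
#define IIO_BUFFER_DMABUF_ENQUEUE_IOCTL _IOW('i', 0x93, struct iio_dmabuf)

/* Enqueue one block that the DMA engine repeats until the buffer is
 * disabled; only valid for output (TX) buffers per the core check. */
static int enqueue_cyclic(int buf_fd, int dmabuf_fd, uint64_t bytes_used)
{
	struct iio_dmabuf req = {
		.fd = dmabuf_fd,
		.flags = IIO_BUFFER_DMABUF_CYCLIC,
		.bytes_used = bytes_used,	/* 0 means "the whole block" */
	};

	return ioctl(buf_fd, IIO_BUFFER_DMABUF_ENQUEUE_IOCTL, &req);
}
```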
Handle the IIO_BUFFER_DMABUF_CYCLIC flag to support cyclic buffers.
Signed-off-by: Paul Cercueil paul@crapouillou.net Reviewed-by: Alexandru Ardelean ardeleanalex@gmail.com --- drivers/iio/buffer/industrialio-buffer-dma.c | 1 + .../iio/buffer/industrialio-buffer-dmaengine.c | 15 ++++++++++++--- include/linux/iio/buffer-dma.h | 3 +++ 3 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/drivers/iio/buffer/industrialio-buffer-dma.c b/drivers/iio/buffer/industrialio-buffer-dma.c index b9c3b01c5ea0..6185af2f33f0 100644 --- a/drivers/iio/buffer/industrialio-buffer-dma.c +++ b/drivers/iio/buffer/industrialio-buffer-dma.c @@ -901,6 +901,7 @@ int iio_dma_buffer_enqueue_dmabuf(struct iio_buffer *buffer, }
 	dma_block->bytes_used = iio_dmabuf->bytes_used ?: dma_block->size;
+	dma_block->cyclic = iio_dmabuf->flags & IIO_BUFFER_DMABUF_CYCLIC;
switch (dma_block->state) { case IIO_BLOCK_STATE_QUEUED: diff --git a/drivers/iio/buffer/industrialio-buffer-dmaengine.c b/drivers/iio/buffer/industrialio-buffer-dmaengine.c index 57a8b2e4ba3c..952e2160a11e 100644 --- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c +++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c @@ -81,9 +81,18 @@ static int iio_dmaengine_buffer_submit_block(struct iio_dma_buffer_queue *queue, if (!block->bytes_used || block->bytes_used > max_size) return -EINVAL;
-	desc = dmaengine_prep_slave_single(dmaengine_buffer->chan,
-		block->phys_addr, block->bytes_used, dma_dir,
-		DMA_PREP_INTERRUPT);
+	if (block->cyclic) {
+		desc = dmaengine_prep_dma_cyclic(dmaengine_buffer->chan,
+						 block->phys_addr,
+						 block->size,
+						 block->bytes_used,
+						 dma_dir, 0);
+	} else {
+		desc = dmaengine_prep_slave_single(dmaengine_buffer->chan,
+						   block->phys_addr,
+						   block->bytes_used, dma_dir,
+						   DMA_PREP_INTERRUPT);
+	}
 	if (!desc)
 		return -ENOMEM;
diff --git a/include/linux/iio/buffer-dma.h b/include/linux/iio/buffer-dma.h index 5bd687132355..3a5d9169e573 100644 --- a/include/linux/iio/buffer-dma.h +++ b/include/linux/iio/buffer-dma.h @@ -40,6 +40,7 @@ enum iio_block_state { * @phys_addr: Physical address of the blocks memory * @queue: Parent DMA buffer queue * @state: Current state of the block + * @cyclic: True if this is a cyclic buffer * @fileio: True if this buffer is used for fileio mode * @dmabuf: Underlying DMABUF object */ @@ -63,6 +64,8 @@ struct iio_dma_buffer_block { */ enum iio_block_state state;
+	bool cyclic;
+
 	bool fileio;
 	struct dma_buf *dmabuf;
 };
Document the new DMABUF based API.
v2: - Explicitly state that the new interface is optional and is not implemented by all drivers. - The IOCTLs can now only be called on the buffer FD returned by IIO_BUFFER_GET_FD_IOCTL. - Move the page up a bit in the index since it is core stuff and not driver-specific.
Signed-off-by: Paul Cercueil paul@crapouillou.net --- Documentation/driver-api/dma-buf.rst | 2 + Documentation/iio/dmabuf_api.rst | 94 ++++++++++++++++++++++++++++ Documentation/iio/index.rst | 2 + 3 files changed, 98 insertions(+) create mode 100644 Documentation/iio/dmabuf_api.rst
diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst index 2cd7db82d9fe..d3c9b58d2706 100644 --- a/Documentation/driver-api/dma-buf.rst +++ b/Documentation/driver-api/dma-buf.rst @@ -1,3 +1,5 @@ +.. _dma-buf: + Buffer Sharing and Synchronization ==================================
diff --git a/Documentation/iio/dmabuf_api.rst b/Documentation/iio/dmabuf_api.rst new file mode 100644 index 000000000000..43bb2c1b9fdc --- /dev/null +++ b/Documentation/iio/dmabuf_api.rst @@ -0,0 +1,94 @@ +=================================== +High-speed DMABUF interface for IIO +=================================== + +1. Overview +=========== + +The Industrial I/O subsystem supports access to buffers through a file-based +interface, with read() and write() access calls through the IIO device's dev +node. + +It additionally supports a DMABUF based interface, where the userspace +application can allocate and append DMABUF objects to the buffer's queue. +This interface is however optional and is not available in all drivers. + +The advantage of this DMABUF based interface vs. the read() +interface, is that it avoids an extra copy of the data between the +kernel and userspace. This is particularly useful for high-speed +devices which produce several megabytes or even gigabytes of data per +second. + +The data in this DMABUF interface is managed at the granularity of +DMABUF objects. Reducing the granularity from byte level to block level +is done to reduce the userspace-kernelspace synchronization overhead +since performing syscalls for each byte at a few Mbps is just not +feasible. + +This of course leads to a slightly increased latency. For this reason an +application can choose the size of the DMABUFs as well as how many it +allocates. E.g. two DMABUFs would be a traditional double buffering +scheme. But using a higher number might be necessary to avoid +underflow/overflow situations in the presence of scheduling latencies. + +2. User API +=========== + +``IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req *)`` +---------------------------------------------------------------- + +Each call will allocate a new DMABUF object. The return value (if not +a negative errno value as error) will be the file descriptor of the new +DMABUF. 
+ +``IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *)`` +-------------------------------------------------------- + +Place the DMABUF object into the queue pending for hardware process. + +These two IOCTLs have to be performed on the IIO buffer's file +descriptor, obtained using the `IIO_BUFFER_GET_FD_IOCTL` ioctl. + +3. Usage +======== + +To access the data stored in a block by userspace the block must be +mapped to the process's memory. This is done by calling mmap() on the +DMABUF's file descriptor. + +Before accessing the data through the map, you must use the +DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl, with the +DMA_BUF_SYNC_START flag, to make sure that the data is available. +This call may block until the hardware is done with this block. Once +you are done reading or writing the data, you must use this ioctl again +with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF to the +kernel's queue. + +If you need to know when the hardware is done with a DMABUF, you can +poll its file descriptor for the EPOLLOUT event. + +Finally, to destroy a DMABUF object, simply call close() on its file +descriptor. + +For more information about manipulating DMABUF objects, see: :ref:`dma-buf`. + +A typical workflow for the new interface is: + + for block in blocks: + DMABUF_ALLOC block + mmap block + + enable buffer + + while !done + for block in blocks: + DMABUF_ENQUEUE block + + DMABUF_SYNC_START block + process data + DMABUF_SYNC_END block + + disable buffer + + for block in blocks: + close block diff --git a/Documentation/iio/index.rst b/Documentation/iio/index.rst index 58b7a4ebac51..669deb67ddee 100644 --- a/Documentation/iio/index.rst +++ b/Documentation/iio/index.rst @@ -9,4 +9,6 @@ Industrial I/O
iio_configfs
+ dmabuf_api + ep93xx_adc
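The "wait until the hardware is done" step in the documented workflow (polling the DMABUF fd for EPOLLOUT) can be sketched as a small helper; this is generic poll(2) usage, not code from the patch itself:

```c
#include <poll.h>

/* Block until the given fd reports POLLOUT -- for an IIO DMABUF this
 * means the DMA transfer has completed and the block can be reused.
 * Returns 0 on success, -1 on error. */
static int wait_for_block(int dmabuf_fd)
{
	struct pollfd pfd = {
		.fd = dmabuf_fd,
		.events = POLLOUT,
	};

	if (poll(&pfd, 1, -1) < 0)
		return -1;

	return (pfd.revents & POLLOUT) ? 0 : -1;
}
```

An application that never touches the data with the CPU (pure device-to-device use) could rely on this instead of the DMA_BUF_IOCTL_SYNC round-trip.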
On Mon, Feb 07, 2022 at 01:01:40PM +0000, Paul Cercueil wrote:
Document the new DMABUF based API.
v2: - Explicitly state that the new interface is optional and is not implemented by all drivers. - The IOCTLs can now only be called on the buffer FD returned by IIO_BUFFER_GET_FD_IOCTL. - Move the page up a bit in the index since it is core stuff and not driver-specific.
Signed-off-by: Paul Cercueil paul@crapouillou.net
Documentation/driver-api/dma-buf.rst | 2 + Documentation/iio/dmabuf_api.rst | 94 ++++++++++++++++++++++++++++ Documentation/iio/index.rst | 2 + 3 files changed, 98 insertions(+) create mode 100644 Documentation/iio/dmabuf_api.rst
diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst index 2cd7db82d9fe..d3c9b58d2706 100644 --- a/Documentation/driver-api/dma-buf.rst +++ b/Documentation/driver-api/dma-buf.rst @@ -1,3 +1,5 @@ +.. _dma-buf:
Buffer Sharing and Synchronization
diff --git a/Documentation/iio/dmabuf_api.rst b/Documentation/iio/dmabuf_api.rst new file mode 100644 index 000000000000..43bb2c1b9fdc --- /dev/null +++ b/Documentation/iio/dmabuf_api.rst @@ -0,0 +1,94 @@ +=================================== +High-speed DMABUF interface for IIO +===================================
+1. Overview +===========
+The Industrial I/O subsystem supports access to buffers through a file-based +interface, with read() and write() access calls through the IIO device's dev +node.
+It additionally supports a DMABUF based interface, where the userspace +application can allocate and append DMABUF objects to the buffer's queue. +This interface is however optional and is not available in all drivers.
+The advantage of this DMABUF based interface vs. the read() +interface, is that it avoids an extra copy of the data between the +kernel and userspace. This is particularly useful for high-speed +devices which produce several megabytes or even gigabytes of data per +second.
+The data in this DMABUF interface is managed at the granularity of +DMABUF objects. Reducing the granularity from byte level to block level +is done to reduce the userspace-kernelspace synchronization overhead +since performing syscalls for each byte at a few Mbps is just not +feasible.
+This of course leads to a slightly increased latency. For this reason an +application can choose the size of the DMABUFs as well as how many it +allocates. E.g. two DMABUFs would be a traditional double buffering +scheme. But using a higher number might be necessary to avoid +underflow/overflow situations in the presence of scheduling latencies.
So this reads a lot like reinventing io-uring with pre-registered O_DIRECT memory ranges. Except it's using dma-buf and hand-rolling a lot of pieces instead of io-uring and O_DIRECT.
At least if the entire justification for dma-buf support is zero-copy support between the driver and userspace it's _really_ not the right tool for the job. dma-buf is for zero-copy between devices, with cpu access from userpace (or kernel fwiw) being very much the exception (and often flat-out not supported at all). -Daniel
Hi Daniel,
Le mar., mars 29 2022 at 10:54:43 +0200, Daniel Vetter daniel@ffwll.ch a écrit :
On Mon, Feb 07, 2022 at 01:01:40PM +0000, Paul Cercueil wrote:
Document the new DMABUF based API.
v2: - Explicitly state that the new interface is optional and is not implemented by all drivers. - The IOCTLs can now only be called on the buffer FD returned by IIO_BUFFER_GET_FD_IOCTL. - Move the page up a bit in the index since it is core stuff and not driver-specific.
Signed-off-by: Paul Cercueil paul@crapouillou.net
Documentation/driver-api/dma-buf.rst | 2 + Documentation/iio/dmabuf_api.rst | 94 ++++++++++++++++++++++++++++ Documentation/iio/index.rst | 2 + 3 files changed, 98 insertions(+) create mode 100644 Documentation/iio/dmabuf_api.rst
diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst index 2cd7db82d9fe..d3c9b58d2706 100644 --- a/Documentation/driver-api/dma-buf.rst +++ b/Documentation/driver-api/dma-buf.rst @@ -1,3 +1,5 @@ +.. _dma-buf:
Buffer Sharing and Synchronization
diff --git a/Documentation/iio/dmabuf_api.rst b/Documentation/iio/dmabuf_api.rst new file mode 100644 index 000000000000..43bb2c1b9fdc --- /dev/null +++ b/Documentation/iio/dmabuf_api.rst @@ -0,0 +1,94 @@ +=================================== +High-speed DMABUF interface for IIO +===================================
+1. Overview +===========
+The Industrial I/O subsystem supports access to buffers through a file-based +interface, with read() and write() access calls through the IIO device's dev +node.
+It additionally supports a DMABUF based interface, where the userspace +application can allocate and append DMABUF objects to the buffer's queue. +This interface is however optional and is not available in all drivers.
+The advantage of this DMABUF based interface vs. the read() +interface, is that it avoids an extra copy of the data between the +kernel and userspace. This is particularly useful for high-speed +devices which produce several megabytes or even gigabytes of data per +second.
+The data in this DMABUF interface is managed at the granularity of +DMABUF objects. Reducing the granularity from byte level to block level +is done to reduce the userspace-kernelspace synchronization overhead +since performing syscalls for each byte at a few Mbps is just not +feasible.
+This of course leads to a slightly increased latency. For this reason an +application can choose the size of the DMABUFs as well as how many it +allocates. E.g. two DMABUFs would be a traditional double buffering +scheme. But using a higher number might be necessary to avoid +underflow/overflow situations in the presence of scheduling latencies.
So this reads a lot like reinventing io-uring with pre-registered O_DIRECT memory ranges. Except it's using dma-buf and hand-rolling a lot of pieces instead of io-uring and O_DIRECT.
I don't see how io_uring would help us. It's an async I/O framework; does it allow us to access a kernel buffer without copying the data? Does it allow us to zero-copy the data to a network interface?
At least if the entire justification for dma-buf support is zero-copy support between the driver and userspace it's _really_ not the right tool for the job. dma-buf is for zero-copy between devices, with cpu access from userpace (or kernel fwiw) being very much the exception (and often flat-out not supported at all).
We want both. Using dma-bufs for the driver/userspace interface is a convenience, as we then have a single API instead of two distinct ones.
Why should CPU access from userspace be the exception? It works fine for IIO dma-bufs. You keep warning about this being a terrible design, but I simply don't see it.
Cheers, -Paul
+2. User API +===========
+``IIO_BUFFER_DMABUF_ALLOC_IOCTL(struct iio_dmabuf_alloc_req *)`` +----------------------------------------------------------------
+Each call will allocate a new DMABUF object. The return value (if not +a negative errno value as error) will be the file descriptor of the new +DMABUF.
+``IIO_BUFFER_DMABUF_ENQUEUE_IOCTL(struct iio_dmabuf *)`` +--------------------------------------------------------
+Place the DMABUF object into the queue pending for hardware process.
+These two IOCTLs have to be performed on the IIO buffer's file +descriptor, obtained using the `IIO_BUFFER_GET_FD_IOCTL` ioctl.
+3. Usage +========
+To access the data stored in a block by userspace the block must be +mapped to the process's memory. This is done by calling mmap() on the +DMABUF's file descriptor.
+Before accessing the data through the map, you must use the DMA_BUF_IOCTL_SYNC(struct dma_buf_sync *) ioctl with the DMA_BUF_SYNC_START flag, to make sure that the data is available. This call may block until the hardware is done with this block. Once you are done reading or writing the data, you must use this ioctl again with the DMA_BUF_SYNC_END flag, before enqueueing the DMABUF to the kernel's queue.
+If you need to know when the hardware is done with a DMABUF, you can +poll its file descriptor for the EPOLLOUT event.
+Finally, to destroy a DMABUF object, simply call close() on its file +descriptor.
+For more information about manipulating DMABUF objects, see: :ref:`dma-buf`.
+A typical workflow for the new interface is:
for block in blocks:
    DMABUF_ALLOC block
    mmap block

enable buffer

while !done:
    for block in blocks:
        DMABUF_ENQUEUE block

        DMABUF_SYNC_START block
        process data
        DMABUF_SYNC_END block

disable buffer

for block in blocks:
    close block
diff --git a/Documentation/iio/index.rst b/Documentation/iio/index.rst index 58b7a4ebac51..669deb67ddee 100644 --- a/Documentation/iio/index.rst +++ b/Documentation/iio/index.rst @@ -9,4 +9,6 @@ Industrial I/O
iio_configfs
+   dmabuf_api
    ep93xx_adc
-- 2.34.1
-- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
On Tue, Mar 29, 2022 at 10:47:23AM +0100, Paul Cercueil wrote:
Hi Daniel,
On Tue, Mar 29 2022 at 10:54:43 +0200, Daniel Vetter daniel@ffwll.ch wrote:
On Mon, Feb 07, 2022 at 01:01:40PM +0000, Paul Cercueil wrote:
Document the new DMABUF based API.
v2: - Explicitly state that the new interface is optional and is not implemented by all drivers. - The IOCTLs can now only be called on the buffer FD returned by IIO_BUFFER_GET_FD_IOCTL. - Move the page up a bit in the index since it is core stuff and not driver-specific.
Signed-off-by: Paul Cercueil paul@crapouillou.net
Documentation/driver-api/dma-buf.rst | 2 + Documentation/iio/dmabuf_api.rst | 94 ++++++++++++++++++++++++++++ Documentation/iio/index.rst | 2 + 3 files changed, 98 insertions(+) create mode 100644 Documentation/iio/dmabuf_api.rst
diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst index 2cd7db82d9fe..d3c9b58d2706 100644 --- a/Documentation/driver-api/dma-buf.rst +++ b/Documentation/driver-api/dma-buf.rst @@ -1,3 +1,5 @@ +.. _dma-buf:
Buffer Sharing and Synchronization
diff --git a/Documentation/iio/dmabuf_api.rst b/Documentation/iio/dmabuf_api.rst new file mode 100644 index 000000000000..43bb2c1b9fdc --- /dev/null +++ b/Documentation/iio/dmabuf_api.rst @@ -0,0 +1,94 @@
+===================================
+High-speed DMABUF interface for IIO
+===================================
+1. Overview
+===========
+The Industrial I/O subsystem supports access to buffers through a file-based interface, with read() and write() access calls through the IIO device's dev node.

+It additionally supports a DMABUF based interface, where the userspace application can allocate and append DMABUF objects to the buffer's queue. This interface is however optional and is not available in all drivers.

+The advantage of this DMABUF based interface over the read() interface is that it avoids an extra copy of the data between the kernel and userspace. This is particularly useful for high-speed devices which produce several megabytes or even gigabytes of data per second.

+The data in this DMABUF interface is managed at the granularity of DMABUF objects. Reducing the granularity from byte level to block level is done to reduce the userspace-kernelspace synchronization overhead, since performing a syscall for each byte at even a few Mbps is simply not feasible.

+This of course leads to a slightly increased latency. For this reason an application can choose the size of the DMABUFs as well as how many it allocates. E.g. two DMABUFs would be a traditional double buffering scheme. But using a higher number might be necessary to avoid underflow/overflow situations in the presence of scheduling latencies.
So this reads a lot like reinventing io-uring with pre-registered O_DIRECT memory ranges. Except it's using dma-buf and hand-rolling a lot of pieces instead of io-uring and O_DIRECT.
I don't see how io_uring would help us. It's an async I/O framework, does it allow us to access a kernel buffer without copying the data? Does it allow us to zero-copy the data to a network interface?
With networking, do you mean rdma, or some other kind of networking? Anything else than rdma doesn't support dma-buf, and I don't think it will likely ever do so. Similar it's really tricky to glue dma-buf support into the block layer.
Wrt io_uring, yes it's async, but that's not the point. The point is that with io_uring you pre-register ranges for reads and writes to target, which in combination with O_DIRECT, makes it effectively (and efficient!) zero-copy. Plus it has full integration with both networking and normal file io, which dma-buf just doesn't have.
Like you _cannot_ do zero copy from a dma-buf into a normal file. You absolutely can do the same with io_uring.
At least if the entire justification for dma-buf support is zero-copy support between the driver and userspace it's _really_ not the right tool for the job. dma-buf is for zero-copy between devices, with cpu access from userspace (or kernel fwiw) being very much the exception (and often flat-out not supported at all).
We want both. Using dma-bufs for the driver/userspace interface is a convenience as we then have a unique API instead of two distinct ones.
Why should CPU access from userspace be the exception? It works fine for IIO dma-bufs. You keep warning about this being a terrible design, but I simply don't see it.
It depends really on what you're trying to do, and there's extremely high chances it will simply not work.
Unless you want to do zero copy with a gpu, or something which is in that ecosystem of accelerators and devices, then dma-buf is probably not what you're looking for. -Daniel
Cheers, -Paul
On Tue, Mar 29 2022 at 16:07:21 +0200, Daniel Vetter daniel@ffwll.ch wrote:
On Tue, Mar 29, 2022 at 10:47:23AM +0100, Paul Cercueil wrote:
Hi Daniel,
On Tue, Mar 29 2022 at 10:54:43 +0200, Daniel Vetter daniel@ffwll.ch wrote:
On Mon, Feb 07, 2022 at 01:01:40PM +0000, Paul Cercueil wrote:
Document the new DMABUF based API.
So this reads a lot like reinventing io-uring with pre-registered O_DIRECT memory ranges. Except it's using dma-buf and hand-rolling a lot of pieces instead of io-uring and O_DIRECT.
I don't see how io_uring would help us. It's an async I/O framework, does it allow us to access a kernel buffer without copying the data? Does it allow us to zero-copy the data to a network interface?
With networking, do you mean rdma, or some other kind of networking? Anything else than rdma doesn't support dma-buf, and I don't think it will likely ever do so. Similar it's really tricky to glue dma-buf support into the block layer.
By networking I mean standard sockets. If I'm not mistaken, Jonathan Lemon's work on zctap was to add dma-buf import/export support to standard sockets.
Wrt io_uring, yes it's async, but that's not the point. The point is that with io_uring you pre-register ranges for reads and writes to target, which in combination with O_DIRECT, makes it effectively (and efficient!) zero-copy. Plus it has full integration with both networking and normal file io, which dma-buf just doesn't have.
Like you _cannot_ do zero copy from a dma-buf into a normal file. You absolutely can do the same with io_uring.
I believe io_uring does zero-copy the same way as splice(), by duplicating/moving pages? Because that wouldn't work with DMA coherent memory, which is contiguous and not backed by pages.
At least if the entire justification for dma-buf support is zero-copy support between the driver and userspace it's _really_ not the right tool for the job. dma-buf is for zero-copy between devices, with cpu access from userspace (or kernel fwiw) being very much the exception (and often flat-out not supported at all).
We want both. Using dma-bufs for the driver/userspace interface is a convenience as we then have a unique API instead of two distinct ones.
Why should CPU access from userspace be the exception? It works fine for IIO dma-bufs. You keep warning about this being a terrible design, but I simply don't see it.
It depends really on what you're trying to do, and there's extremely high chances it will simply not work.
Well it does work though. The userspace interface is stupidly simple here - one dma-buf, backed by DMA coherent memory, is enqueued for processing by the DMA. Userspace calling the "sync" ioctl on the dma-buf will block until the transfer is complete, and then userspace can access it again.
Unless you want to do zero copy with a gpu, or something which is in that ecosystem of accelerators and devices, then dma-buf is probably not what you're looking for. -Daniel
I want to do zero-copy between an IIO device and the network/USB, and right now there is absolutely nothing in place that allows me to do that. So I have to get creative.
Cheers, -Paul
On Tue, Mar 29, 2022 at 06:34:58PM +0100, Paul Cercueil wrote:
On Tue, Mar 29 2022 at 16:07:21 +0200, Daniel Vetter daniel@ffwll.ch wrote:
On Tue, Mar 29, 2022 at 10:47:23AM +0100, Paul Cercueil wrote:
Hi Daniel,
On Tue, Mar 29 2022 at 10:54:43 +0200, Daniel Vetter daniel@ffwll.ch wrote:
On Mon, Feb 07, 2022 at 01:01:40PM +0000, Paul Cercueil wrote:
Document the new DMABUF based API.
So this reads a lot like reinventing io-uring with pre-registered O_DIRECT memory ranges. Except it's using dma-buf and hand-rolling a lot of pieces instead of io-uring and O_DIRECT.
I don't see how io_uring would help us. It's an async I/O framework, does it allow us to access a kernel buffer without copying the data? Does it allow us to zero-copy the data to a network interface?
With networking, do you mean rdma, or some other kind of networking? Anything else than rdma doesn't support dma-buf, and I don't think it will likely ever do so. Similar it's really tricky to glue dma-buf support into the block layer.
By networking I mean standard sockets. If I'm not mistaken, Jonathan Lemon's work on zctap was to add dma-buf import/export support to standard sockets.
Wrt io_uring, yes it's async, but that's not the point. The point is that with io_uring you pre-register ranges for reads and writes to target, which in combination with O_DIRECT, makes it effectively (and efficient!) zero-copy. Plus it has full integration with both networking and normal file io, which dma-buf just doesn't have.
Like you _cannot_ do zero copy from a dma-buf into a normal file. You absolutely can do the same with io_uring.
I believe io_uring does zero-copy the same way as splice(), by duplicating/moving pages? Because that wouldn't work with DMA coherent memory, which is contiguous and not backed by pages.
Yeah if your memory has to be contig and/or write-combined/uncached for dma reasons, then we're much more firmly into dma-buf territory. But also that means we really need dma-buf support in the networking stack, and that might be a supreme challenge.
E.g. dma-buf is all about pre-registering memory (dma_buf_attach is potentially very expensive) for a specific device. With fully general networking, none of this is possible, since until you make the dynamic decision to send stuff out, you might not even know the device the packets go out on.
Also with filtering and everything cpu access is pretty much assumed, and doing that with dma-buf is a bit a challenge. -Daniel
On Mon, 7 Feb 2022 12:59:21 +0000 Paul Cercueil paul@crapouillou.net wrote:
Hi Jonathan,
This is the V2 of my patchset that introduces a new userspace interface based on DMABUF objects to complement the fileio API, and adds write() support to the existing fileio API.
Hi Paul,
It's been a little while. Perhaps you could summarize the various viewpoints around the appropriateness of using DMABUF for this? I appreciate it is a tricky topic to distil into a brief summary, but I know I would find it useful even if no one else does!
Thanks,
Jonathan
Changes since v1:
- the patches that were merged in v1 have been (obviously) dropped from this patchset;
- the patch that was setting the write-combine cache setting has been dropped as well, as it was simply not useful.
- [01/12]:
- Only remove the outgoing queue, and keep the incoming queue, as we want the buffer to start streaming data as soon as it is enabled.
- Remove IIO_BLOCK_STATE_DEQUEUED, since it is now functionally the same as IIO_BLOCK_STATE_DONE.
- [02/12]:
- Fix block->state not being reset in iio_dma_buffer_request_update() for output buffers.
- Only update block->bytes_used once and add a comment about why we update it.
- Add a comment about why we're setting a different state for output buffers in iio_dma_buffer_request_update()
- Remove useless cast to bool (!!) in iio_dma_buffer_io()
- [05/12]: Only allow the new IOCTLs on the buffer FD created with IIO_BUFFER_GET_FD_IOCTL().
- [12/12]:
- Explicitly state that the new interface is optional and is not implemented by all drivers.
- The IOCTLs can now only be called on the buffer FD returned by IIO_BUFFER_GET_FD_IOCTL.
- Move the page up a bit in the index since it is core stuff and not driver-specific.
The patches not listed here have not been modified since v1.
Cheers, -Paul
Alexandru Ardelean (1): iio: buffer-dma: split iio_dma_buffer_fileio_free() function
Paul Cercueil (11): iio: buffer-dma: Get rid of outgoing queue iio: buffer-dma: Enable buffer write support iio: buffer-dmaengine: Support specifying buffer direction iio: buffer-dmaengine: Enable write support iio: core: Add new DMABUF interface infrastructure iio: buffer-dma: Use DMABUFs instead of custom solution iio: buffer-dma: Implement new DMABUF based userspace API iio: buffer-dmaengine: Support new DMABUF based userspace API iio: core: Add support for cyclic buffers iio: buffer-dmaengine: Add support for cyclic buffers Documentation: iio: Document high-speed DMABUF based API
Documentation/driver-api/dma-buf.rst | 2 + Documentation/iio/dmabuf_api.rst | 94 +++ Documentation/iio/index.rst | 2 + drivers/iio/adc/adi-axi-adc.c | 3 +- drivers/iio/buffer/industrialio-buffer-dma.c | 610 ++++++++++++++---- .../buffer/industrialio-buffer-dmaengine.c | 42 +- drivers/iio/industrialio-buffer.c | 60 ++ include/linux/iio/buffer-dma.h | 38 +- include/linux/iio/buffer-dmaengine.h | 5 +- include/linux/iio/buffer_impl.h | 8 + include/uapi/linux/iio/buffer.h | 30 + 11 files changed, 749 insertions(+), 145 deletions(-) create mode 100644 Documentation/iio/dmabuf_api.rst
Hi Jonathan,
On Sun, Feb 13 2022 at 18:46:16 +0000, Jonathan Cameron jic23@kernel.org wrote:
On Mon, 7 Feb 2022 12:59:21 +0000 Paul Cercueil paul@crapouillou.net wrote:
Hi Jonathan,
This is the V2 of my patchset that introduces a new userspace interface based on DMABUF objects to complement the fileio API, and adds write() support to the existing fileio API.
Hi Paul,
It's been a little while. Perhaps you could summarize the various view points around the appropriateness of using DMABUF for this? I appreciate it is a tricky topic to distil into a brief summary but I know I would find it useful even if no one else does!
So we want to have a high-speed interface where buffers of samples are passed around between IIO devices and other devices (e.g. USB or network), or made available to userspace without copying the data.
DMABUF is, at least in theory, exactly what we need. Quoting the documentation (https://www.kernel.org/doc/html/v5.15/driver-api/dma-buf.html): "The dma-buf subsystem provides the framework for sharing buffers for hardware (DMA) access across multiple device drivers and subsystems, and for synchronizing asynchronous hardware access. This is used, for example, by drm “prime” multi-GPU support, but is of course not limited to GPU use cases."
The problem is that right now DMABUF is only really used by DRM, and to quote Daniel, "dma-buf looks like something super generic and useful, until you realize that there's a metric ton of gpu/accelerator baggage piled in".
Still, it seems to be the only viable option. We could add a custom buffer-passing interface, but that would mean implementing the same buffer-passing interface in the network and USB stacks, and before we know it we would have re-invented DMABUF.
Cheers, -Paul
On Tue, Feb 15, 2022 at 05:43:35PM +0000, Paul Cercueil wrote:
Hi Jonathan,
On Sun, Feb 13 2022 at 18:46:16 +0000, Jonathan Cameron jic23@kernel.org wrote:
On Mon, 7 Feb 2022 12:59:21 +0000 Paul Cercueil paul@crapouillou.net wrote:
Hi Jonathan,
This is the V2 of my patchset that introduces a new userspace interface based on DMABUF objects to complement the fileio API, and adds write() support to the existing fileio API.
Hi Paul,
It's been a little while. Perhaps you could summarize the various viewpoints around the appropriateness of using DMABUF for this? I appreciate it is a tricky topic to distil into a brief summary but I know I would find it useful even if no one else does!
So we want to have a high-speed interface where buffers of samples are passed around between IIO devices and other devices (e.g. USB or network), or made available to userspace without copying the data.
DMABUF is, at least in theory, exactly what we need. Quoting the documentation (https://www.kernel.org/doc/html/v5.15/driver-api/dma-buf.html): "The dma-buf subsystem provides the framework for sharing buffers for hardware (DMA) access across multiple device drivers and subsystems, and for synchronizing asynchronous hardware access. This is used, for example, by drm “prime” multi-GPU support, but is of course not limited to GPU use cases."
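To make the exporter role in that description concrete: a driver that owns DMA-capable memory shares it by filling a struct dma_buf_ops and calling dma_buf_export(). The following is a rough kernel-side sketch, not taken from this series; the my_* names, the block structure, and the callbacks are all placeholders, while the dma-buf calls themselves are the real framework API.

```c
/* Hypothetical exporter sketch -- my_* names are placeholders. */
#include <linux/dma-buf.h>

static const struct dma_buf_ops my_dmabuf_ops = {
	.map_dma_buf	= my_map_dma_buf,	/* placeholder callbacks */
	.unmap_dma_buf	= my_unmap_dma_buf,
	.release	= my_release,
	.mmap		= my_mmap,
};

static int my_export_block(struct my_block *block)
{
	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
	struct dma_buf *dmabuf;

	exp_info.ops   = &my_dmabuf_ops;
	exp_info.size  = block->size;
	exp_info.flags = O_CLOEXEC;
	exp_info.priv  = block;

	dmabuf = dma_buf_export(&exp_info);
	if (IS_ERR(dmabuf))
		return PTR_ERR(dmabuf);

	/* Hand the buffer to userspace as a file descriptor. */
	return dma_buf_fd(dmabuf, O_CLOEXEC);
}
```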
The problem is that right now DMABUF is only really used by DRM, and to quote Daniel, "dma-buf looks like something super generic and useful, until you realize that there's a metric ton of gpu/accelerator baggage piled in".
Still, it seems to be the only viable option. We could add a custom buffer-passing interface, but that would mean implementing the same buffer-passing interface on the network and USB stacks, and before we know it we re-invented DMABUFs.
dma-buf also doesn't support sharing with network and usb stacks, so I'm a bit confused why exactly this is useful?
So yeah unless there's some sharing going on with gpu stuff (for data processing maybe) I'm not sure this makes a lot of sense really. Or at least some zero-copy sharing between drivers, but even that would minimally require a dma-buf import ioctl of some sorts. Which I either missed or doesn't exist.
If there's none of that then just hand-roll your buffer handling code (xarray is cheap to use in terms of code for this), you can always add dma-buf import/export later on when the need arises.
Scrolling through patches you only have dma-buf export, but no importing, so the use-case that works is with one of the existing subsystems that support dma-buf importing.
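For comparison, the importing side being referred to would look roughly like this in a would-be consumer driver. This is a sketch under assumed names (my_import() and its surroundings are hypothetical; none of this is in the patchset), but the dma-buf attach/map calls are the real framework API.

```c
/* Hypothetical importer sketch -- my_import() is a placeholder. */
#include <linux/dma-buf.h>

static int my_import(struct device *dev, int fd)
{
	struct dma_buf *dmabuf;
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;
	int ret = 0;

	dmabuf = dma_buf_get(fd);	/* take a ref on the FD's buffer */
	if (IS_ERR(dmabuf))
		return PTR_ERR(dmabuf);

	attach = dma_buf_attach(dmabuf, dev);
	if (IS_ERR(attach)) {
		ret = PTR_ERR(attach);
		goto out_put;
	}

	sgt = dma_buf_map_attachment(attach, DMA_TO_DEVICE);
	if (IS_ERR(sgt)) {
		ret = PTR_ERR(sgt);
		goto out_detach;
	}

	/* ... program the DMA engine with the scatterlist in sgt ... */

	dma_buf_unmap_attachment(attach, sgt, DMA_TO_DEVICE);
out_detach:
	dma_buf_detach(dmabuf, attach);
out_put:
	dma_buf_put(dmabuf);
	return ret;
}
```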
I think minimally we need the use-case (in form of code) that needs the buffer sharing here. -Daniel
Cheers, -Paul
Hi Daniel,
Le mar., mars 29 2022 at 10:33:32 +0200, Daniel Vetter daniel@ffwll.ch a écrit :
On Tue, Feb 15, 2022 at 05:43:35PM +0000, Paul Cercueil wrote:
Hi Jonathan,
Le dim., févr. 13 2022 at 18:46:16 +0000, Jonathan Cameron jic23@kernel.org a écrit :
On Mon, 7 Feb 2022 12:59:21 +0000 Paul Cercueil paul@crapouillou.net wrote:
Hi Jonathan,
This is the V2 of my patchset that introduces a new userspace interface based on DMABUF objects to complement the fileio API, and adds write() support to the existing fileio API.
Hi Paul,
It's been a little while. Perhaps you could summarize the various viewpoints around the appropriateness of using DMABUF for this? I appreciate it is a tricky topic to distil into a brief summary but I know I would find it useful even if no one else does!
So we want to have a high-speed interface where buffers of samples are passed around between IIO devices and other devices (e.g. USB or network), or made available to userspace without copying the data.
DMABUF is, at least in theory, exactly what we need. Quoting the documentation (https://www.kernel.org/doc/html/v5.15/driver-api/dma-buf.html): "The dma-buf subsystem provides the framework for sharing buffers for hardware (DMA) access across multiple device drivers and subsystems, and for synchronizing asynchronous hardware access. This is used, for example, by drm “prime” multi-GPU support, but is of course not limited to GPU use cases."
The problem is that right now DMABUF is only really used by DRM, and to quote Daniel, "dma-buf looks like something super generic and useful, until you realize that there's a metric ton of gpu/accelerator baggage piled in".
Still, it seems to be the only viable option. We could add a custom buffer-passing interface, but that would mean implementing the same buffer-passing interface on the network and USB stacks, and before we know it we re-invented DMABUFs.
dma-buf also doesn't support sharing with network and usb stacks, so I'm a bit confused why exactly this is useful?
There is an attempt to get dma-buf support in the network stack, called "zctap". The last patchset was sent last November. The USB stack does not support dma-buf, but we can add it later I guess.
So yeah unless there's some sharing going on with gpu stuff (for data processing maybe) I'm not sure this makes a lot of sense really. Or at least some zero-copy sharing between drivers, but even that would minimally require a dma-buf import ioctl of some sorts. Which I either missed or doesn't exist.
We do want zero-copy between drivers, the network stack, and the USB stack. It's not just about having a userspace interface.
If there's none of that then just hand-roll your buffer handling code (xarray is cheap to use in terms of code for this), you can always add dma-buf import/export later on when the need arises.
Scrolling through patches you only have dma-buf export, but no importing, so the use-case that works is with one of the existing subsystems that support dma-buf importing.
I think minimally we need the use-case (in form of code) that needs the buffer sharing here.
I'll try with zctap and report back.
Cheers, -Paul
On Tue, Mar 29, 2022 at 10:11:14AM +0100, Paul Cercueil wrote:
Hi Daniel,
Le mar., mars 29 2022 at 10:33:32 +0200, Daniel Vetter daniel@ffwll.ch a écrit :
On Tue, Feb 15, 2022 at 05:43:35PM +0000, Paul Cercueil wrote:
Hi Jonathan,
Le dim., févr. 13 2022 at 18:46:16 +0000, Jonathan Cameron jic23@kernel.org a écrit :
On Mon, 7 Feb 2022 12:59:21 +0000 Paul Cercueil paul@crapouillou.net wrote:
Hi Jonathan,
This is the V2 of my patchset that introduces a new userspace interface based on DMABUF objects to complement the fileio API, and adds write() support to the existing fileio API.
Hi Paul,
It's been a little while. Perhaps you could summarize the various viewpoints around the appropriateness of using DMABUF for this? I appreciate it is a tricky topic to distil into a brief summary but I know I would find it useful even if no one else does!
So we want to have a high-speed interface where buffers of samples are passed around between IIO devices and other devices (e.g. USB or network), or made available to userspace without copying the data.
DMABUF is, at least in theory, exactly what we need. Quoting the documentation (https://www.kernel.org/doc/html/v5.15/driver-api/dma-buf.html): "The dma-buf subsystem provides the framework for sharing buffers for hardware (DMA) access across multiple device drivers and subsystems, and for synchronizing asynchronous hardware access. This is used, for example, by drm “prime” multi-GPU support, but is of course not limited to GPU use cases."
The problem is that right now DMABUF is only really used by DRM, and to quote Daniel, "dma-buf looks like something super generic and useful, until you realize that there's a metric ton of gpu/accelerator baggage piled in".
Still, it seems to be the only viable option. We could add a custom buffer-passing interface, but that would mean implementing the same buffer-passing interface on the network and USB stacks, and before we know it we re-invented DMABUFs.
dma-buf also doesn't support sharing with network and usb stacks, so I'm a bit confused why exactly this is useful?
There is an attempt to get dma-buf support in the network stack, called "zctap". The last patchset was sent last November. The USB stack does not support dma-buf, but we can add it later I guess.
So yeah unless there's some sharing going on with gpu stuff (for data processing maybe) I'm not sure this makes a lot of sense really. Or at least some zero-copy sharing between drivers, but even that would minimally require a dma-buf import ioctl of some sorts. Which I either missed or doesn't exist.
We do want zero-copy between drivers, the network stack, and the USB stack. It's not just about having a userspace interface.
I think in that case we need these other pieces too. And we need acks from relevant subsystems that these other pieces are a) ready for upstream merging and also that the dma-buf side of things actually makes sense.
If there's none of that then just hand-roll your buffer handling code (xarray is cheap to use in terms of code for this), you can always add dma-buf import/export later on when the need arises.
Scrolling through patches you only have dma-buf export, but no importing, so the use-case that works is with one of the existing subsystems that support dma-buf importing.
I think minimally we need the use-case (in form of code) that needs the buffer sharing here.
I'll try with zctap and report back.
Do you have a link for this? I just checked dri-devel on lore, and it's not there. Nor anywhere else.
We really need all the pieces, and if the block layer reaction is anything to judge by, dma-buf won't happen for networking either. There's some really nasty and fairly fundamental issues with locking and memory reclaim that make this utter pain or outright impossible. -Daniel
Cheers, -Paul
-- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Hi Daniel,
Le mar., mars 29 2022 at 16:10:44 +0200, Daniel Vetter daniel@ffwll.ch a écrit :
On Tue, Mar 29, 2022 at 10:11:14AM +0100, Paul Cercueil wrote:
Hi Daniel,
Le mar., mars 29 2022 at 10:33:32 +0200, Daniel Vetter daniel@ffwll.ch a écrit :
On Tue, Feb 15, 2022 at 05:43:35PM +0000, Paul Cercueil wrote:
Hi Jonathan,
Le dim., févr. 13 2022 at 18:46:16 +0000, Jonathan Cameron jic23@kernel.org a écrit :
On Mon, 7 Feb 2022 12:59:21 +0000 Paul Cercueil paul@crapouillou.net wrote:
Hi Jonathan,
This is the V2 of my patchset that introduces a new userspace interface based on DMABUF objects to complement the fileio API, and adds write() support to the existing fileio API.
Hi Paul,
It's been a little while. Perhaps you could summarize the various viewpoints around the appropriateness of using DMABUF for this? I appreciate it is a tricky topic to distil into a brief summary but I know I would find it useful even if no one else does!
So we want to have a high-speed interface where buffers of samples are passed around between IIO devices and other devices (e.g. USB or network), or made available to userspace without copying the data.
DMABUF is, at least in theory, exactly what we need. Quoting the documentation (https://www.kernel.org/doc/html/v5.15/driver-api/dma-buf.html): "The dma-buf subsystem provides the framework for sharing buffers for hardware (DMA) access across multiple device drivers and subsystems, and for synchronizing asynchronous hardware access. This is used, for example, by drm “prime” multi-GPU support, but is of course not limited to GPU use cases."
The problem is that right now DMABUF is only really used by DRM, and to quote Daniel, "dma-buf looks like something super generic and useful, until you realize that there's a metric ton of gpu/accelerator baggage piled in".
Still, it seems to be the only viable option. We could add a custom buffer-passing interface, but that would mean implementing the same buffer-passing interface on the network and USB stacks, and before we know it we re-invented DMABUFs.
dma-buf also doesn't support sharing with network and usb stacks, so I'm a bit confused why exactly this is useful?
There is an attempt to get dma-buf support in the network stack, called "zctap". The last patchset was sent last November. The USB stack does not support dma-buf, but we can add it later I guess.
So yeah unless there's some sharing going on with gpu stuff (for data processing maybe) I'm not sure this makes a lot of sense really. Or at least some zero-copy sharing between drivers, but even that would minimally require a dma-buf import ioctl of some sorts. Which I either missed or doesn't exist.
We do want zero-copy between drivers, the network stack, and the USB stack. It's not just about having a userspace interface.
I think in that case we need these other pieces too. And we need acks from relevant subsystems that these other pieces are a) ready for upstream merging and also that the dma-buf side of things actually makes sense.
Ok...
If there's none of that then just hand-roll your buffer handling code (xarray is cheap to use in terms of code for this), you can always add dma-buf import/export later on when the need arises.
Scrolling through patches you only have dma-buf export, but no importing, so the use-case that works is with one of the existing subsystems that support dma-buf importing.
I think minimally we need the use-case (in form of code) that needs the buffer sharing here.
I'll try with zctap and report back.
Do you have a link for this? I just checked dri-devel on lore, and it's not there. Nor anywhere else.
The code is here: https://github.com/jlemon/zctap_kernel
I know Jonathan Lemon (Cc'd) was working on upstreaming it, I saw a few patchsets.
Cheers, -Paul
We really need all the pieces, and if the block layer reaction is anything to judge by, dma-buf won't happen for networking either. There's some really nasty and fairly fundamental issues with locking and memory reclaim that make this utter pain or outright impossible. -Daniel
Cheers, -Paul
On Tue, Mar 29, 2022 at 06:16:56PM +0100, Paul Cercueil wrote:
Hi Daniel,
Le mar., mars 29 2022 at 16:10:44 +0200, Daniel Vetter daniel@ffwll.ch a écrit :
On Tue, Mar 29, 2022 at 10:11:14AM +0100, Paul Cercueil wrote:
Hi Daniel,
Le mar., mars 29 2022 at 10:33:32 +0200, Daniel Vetter daniel@ffwll.ch a écrit :
On Tue, Feb 15, 2022 at 05:43:35PM +0000, Paul Cercueil wrote:
Hi Jonathan,
Le dim., févr. 13 2022 at 18:46:16 +0000, Jonathan Cameron jic23@kernel.org a écrit :
On Mon, 7 Feb 2022 12:59:21 +0000 Paul Cercueil paul@crapouillou.net wrote:
Hi Jonathan,
This is the V2 of my patchset that introduces a new userspace interface based on DMABUF objects to complement the fileio API, and adds write() support to the existing fileio API.
Hi Paul,
It's been a little while. Perhaps you could summarize the various viewpoints around the appropriateness of using DMABUF for this? I appreciate it is a tricky topic to distil into a brief summary but I know I would find it useful even if no one else does!
So we want to have a high-speed interface where buffers of samples are passed around between IIO devices and other devices (e.g. USB or network), or made available to userspace without copying the data.
DMABUF is, at least in theory, exactly what we need. Quoting the documentation (https://www.kernel.org/doc/html/v5.15/driver-api/dma-buf.html): "The dma-buf subsystem provides the framework for sharing buffers for hardware (DMA) access across multiple device drivers and subsystems, and for synchronizing asynchronous hardware access. This is used, for example, by drm “prime” multi-GPU support, but is of course not limited to GPU use cases."
The problem is that right now DMABUF is only really used by DRM, and to quote Daniel, "dma-buf looks like something super generic and useful, until you realize that there's a metric ton of gpu/accelerator baggage piled in".
Still, it seems to be the only viable option. We could add a custom buffer-passing interface, but that would mean implementing the same buffer-passing interface on the network and USB stacks, and before we know it we re-invented DMABUFs.
dma-buf also doesn't support sharing with network and usb stacks, so I'm a bit confused why exactly this is useful?
There is an attempt to get dma-buf support in the network stack, called "zctap". The last patchset was sent last November. The USB stack does not support dma-buf, but we can add it later I guess.
So yeah unless there's some sharing going on with gpu stuff (for data processing maybe) I'm not sure this makes a lot of sense really. Or at least some zero-copy sharing between drivers, but even that would minimally require a dma-buf import ioctl of some sorts. Which I either missed or doesn't exist.
We do want zero-copy between drivers, the network stack, and the USB stack. It's not just about having a userspace interface.
I think in that case we need these other pieces too. And we need acks from relevant subsystems that these other pieces are a) ready for upstream merging and also that the dma-buf side of things actually makes sense.
Ok...
If there's none of that then just hand-roll your buffer handling code (xarray is cheap to use in terms of code for this), you can always add dma-buf import/export later on when the need arises.
Scrolling through patches you only have dma-buf export, but no importing, so the use-case that works is with one of the existing subsystems that support dma-buf importing.
I think minimally we need the use-case (in form of code) that needs the buffer sharing here.
I'll try with zctap and report back.
Do you have a link for this? I just checked dri-devel on lore, and it's not there. Nor anywhere else.
The code is here: https://github.com/jlemon/zctap_kernel
I know Jonathan Lemon (Cc'd) was working on upstreaming it, I saw a few patchsets.
Yeah if the goal here is to zero-copy from iio to network sockets, then I think we really need the full picture first, at least as a prototype.
And also a rough consensus among all involved subsystems that this is the right approach and that there's no fundamental issues. I really have no clue about network to make a call there.
I'm bringing this up because a few folks wanted to look into zero-copy between gpu and nvme, using dma-buf. And after lots of head-banging-against-solid-concrete-walls, at least my conclusion is that due to locking issues it's really not possible without huge changes to the block i/o. And those are not on the table. -Daniel
Cheers, -Paul
We really need all the pieces, and if the block layer reaction is anything to judge by, dma-buf won't happen for networking either. There's some really nasty and fairly fundamental issues with locking and memory reclaim that make this utter pain or outright impossible. -Daniel
Cheers, -Paul
-- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch