Hi,
This is v2 of the attempt to remove misuse of the cache APIs from Ion. The previous version[1] attempted to pull the cache APIs into Ion itself. Review indicated that this was the wrong approach and that real APIs should be created instead.
The APIs created are kernel_force_cache_clean and kernel_force_cache_invalidate. They force a clean and an invalidate of the cache, respectively. The aim was to take the semantics of dma_sync and turn them into something that isn't dma_sync. This series includes a nominal implementation for arm/arm64, mostly for demonstration purposes.
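For illustration, this is roughly how a driver that shares pages with a device outside the DMA API would be expected to use the new calls (a minimal sketch only, not part of the series; the example_* functions and call sites are hypothetical):

/* Sketch of intended usage; example_* functions are made up for illustration. */
#include <linux/cacheflush.h>
#include <linux/mm_types.h>

static void example_hand_pages_to_device(struct page *page, size_t size)
{
        /* Write dirty CPU cache lines back so the device observes current data. */
        kernel_force_cache_clean(page, size);
        /* ...start device access to the underlying physical pages... */
}

static void example_take_pages_from_device(struct page *page, size_t size)
{
        /*
         * Drop any (possibly speculatively allocated) cache lines before the
         * CPU reads data the device wrote.
         */
        kernel_force_cache_invalidate(page, size);
        /* ...CPU may now read the buffer... */
}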
Before writing this, I reviewed whether Ion could just use the DMA mapping APIs to achieve the necessary coherency. The conclusion I came to was that trying to force the Ion code into the DMA model would create more problems than it solved, since there isn't anything like DMA going on. Similarly, none of the existing cache APIs did exactly what was needed either; the closest matches were designed for file cache pages, not for drivers.
The series includes the conversion of Ion to the new APIs. There are a few other drivers that call either arch-specific APIs or flush_dcache_page and could be converted as well. The i915 driver could potentially be converted with the addition of an x86 implementation.
Feedback appreciated as always.
Thanks, Laura
[1] https://lkml.kernel.org/g/1464205684-5587-1-git-send-email-labbott@redhat.com
Laura Abbott (5):
  Documentation: Introduce kernel_force_cache_* APIs
  arm: Implement ARCH_HAS_FORCE_CACHE
  arm64: Implement ARCH_HAS_FORCE_CACHE
  staging: android: ion: Convert to the kernel_force_cache APIs
  staging: ion: Add support for syncing with DMA_BUF_IOCTL_SYNC
 Documentation/cachetlb.txt                      |  18 +++-
 arch/arm/include/asm/cacheflush.h               |   4 +
 arch/arm/mm/dma-mapping.c                       | 119 ------------------------
 arch/arm/mm/flush.c                             | 115 +++++++++++++++++++++++
 arch/arm/mm/mm.h                                |   8 ++
 arch/arm64/include/asm/cacheflush.h             |   5 +
 arch/arm64/mm/flush.c                           |  11 +++
 drivers/staging/android/ion/ion.c               |  53 +++++++----
 drivers/staging/android/ion/ion_carveout_heap.c |   8 +-
 drivers/staging/android/ion/ion_chunk_heap.c    |  12 ++-
 drivers/staging/android/ion/ion_page_pool.c     |   6 +-
 drivers/staging/android/ion/ion_priv.h          |  11 ---
 drivers/staging/android/ion/ion_system_heap.c   |   6 +-
 include/linux/cacheflush.h                      |  11 +++
 14 files changed, 225 insertions(+), 162 deletions(-)
 create mode 100644 include/linux/cacheflush.h
Some frameworks (e.g. Ion) may need to do explicit cache management to meet performance/correctness requirements. Rather than piggy-back on another API and hope the semantics don't change, introduce a set of APIs to force a page to be cleaned/invalidated in the cache.
Signed-off-by: Laura Abbott <labbott@redhat.com>
---
 Documentation/cachetlb.txt | 18 +++++++++++++++++-
 include/linux/cacheflush.h | 11 +++++++++++
 2 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/cacheflush.h
diff --git a/Documentation/cachetlb.txt b/Documentation/cachetlb.txt
index 3f9f808..18eec7c 100644
--- a/Documentation/cachetlb.txt
+++ b/Documentation/cachetlb.txt
@@ -378,7 +378,7 @@ maps this page at its virtual address.
        flush_dcache_page and update_mmu_cache. In the future, the hope
        is to remove this interface completely.
 
-The final category of APIs is for I/O to deliberately aliased address
+Another set of APIs is for I/O to deliberately aliased address
 ranges inside the kernel. Such aliases are set up by use of the
 vmap/vmalloc API. Since kernel I/O goes via physical pages, the I/O
 subsystem assumes that the user mapping and kernel offset mapping are
@@ -401,3 +401,19 @@ I/O and invalidating it after the I/O returns.
 speculatively reading data while the I/O was occurring to the
 physical pages. This is only necessary for data reads into the
 vmap area.
+
+Nearly all drivers can handle cache management using the existing DMA model.
+There may be limited circumstances when a driver or framework needs to
+explicitly manage the cache; trying to force cache management into the DMA
+framework may lead to performance loss or unnecessary work. These APIs may
+be used to provide explicit coherency for memory that does not fall into
+any of the above categories. Implementers of this API must assume the
+address can be aliased. Any cache operations shall not be delayed and must
+be completed by the time the call returns.
+
+  void kernel_force_cache_clean(struct page *page, size_t size);
+       Ensures that any data in the cache by the page is written back
+       and visible across all aliases.
+
+  void kernel_force_cache_invalidate(struct page *page, size_t size);
+       Invalidates the cache for the given page.
diff --git a/include/linux/cacheflush.h b/include/linux/cacheflush.h
new file mode 100644
index 0000000..4388846
--- /dev/null
+++ b/include/linux/cacheflush.h
@@ -0,0 +1,11 @@
+#ifndef CACHEFLUSH_H
+#define CACHEFLUSH_H
+
+#include <asm/cacheflush.h>
+
+#ifndef ARCH_HAS_FORCE_CACHE
+static inline void kernel_force_cache_clean(struct page *page, size_t size) { }
+static inline void kernel_force_cache_invalidate(struct page *page, size_t size) { }
+#endif
+
+#endif
arm may need the kernel_force_cache APIs to guarantee data consistency. Implement versions of these APIs based on the DMA APIs.
Signed-off-by: Laura Abbott <labbott@redhat.com>
---
 arch/arm/include/asm/cacheflush.h |   4 ++
 arch/arm/mm/dma-mapping.c         | 119 --------------------------------------
 arch/arm/mm/flush.c               | 115 ++++++++++++++++++++++++++++++++++++
 arch/arm/mm/mm.h                  |   8 +++
 4 files changed, 127 insertions(+), 119 deletions(-)
diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index 9156fc3..78eb011 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -518,4 +518,8 @@ static inline void secure_flush_area(const void *addr, size_t size)
        outer_flush_range(phys, phys + size);
 }
 
+#define ARCH_HAS_FORCE_CACHE 1
+void kernel_force_cache_clean(struct page *page, size_t size);
+void kernel_force_cache_invalidate(struct page *page, size_t size);
+
 #endif
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index c6834c0..8c9296d 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -95,23 +95,6 @@ static struct arm_dma_buffer *arm_dma_buffer_find(void *virt)
        return found;
 }
 
-/*
- * The DMA API is built upon the notion of "buffer ownership". A buffer
- * is either exclusively owned by the CPU (and therefore may be accessed
- * by it) or exclusively owned by the DMA device. These helper functions
- * represent the transitions between these two ownership states.
- *
- * Note, however, that on later ARMs, this notion does not work due to
- * speculative prefetches. We model our approach on the assumption that
- * the CPU does do speculative prefetches, which means we clean caches
- * before transfers and delay cache invalidation until transfer completion.
- *
- */
-static void __dma_page_cpu_to_dev(struct page *, unsigned long,
-               size_t, enum dma_data_direction);
-static void __dma_page_dev_to_cpu(struct page *, unsigned long,
-               size_t, enum dma_data_direction);
-
 /**
  * arm_dma_map_page - map a portion of a page for streaming DMA
  * @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
@@ -945,108 +928,6 @@ int arm_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
        return 0;
 }
 
-static void dma_cache_maint_page(struct page *page, unsigned long offset,
-       size_t size, enum dma_data_direction dir,
-       void (*op)(const void *, size_t, int))
-{
-       unsigned long pfn;
-       size_t left = size;
-
-       pfn = page_to_pfn(page) + offset / PAGE_SIZE;
-       offset %= PAGE_SIZE;
-
-       /*
-        * A single sg entry may refer to multiple physically contiguous
-        * pages. But we still need to process highmem pages individually.
-        * If highmem is not configured then the bulk of this loop gets
-        * optimized out.
-        */
-       do {
-               size_t len = left;
-               void *vaddr;
-
-               page = pfn_to_page(pfn);
-
-               if (PageHighMem(page)) {
-                       if (len + offset > PAGE_SIZE)
-                               len = PAGE_SIZE - offset;
-
-                       if (cache_is_vipt_nonaliasing()) {
-                               vaddr = kmap_atomic(page);
-                               op(vaddr + offset, len, dir);
-                               kunmap_atomic(vaddr);
-                       } else {
-                               vaddr = kmap_high_get(page);
-                               if (vaddr) {
-                                       op(vaddr + offset, len, dir);
-                                       kunmap_high(page);
-                               }
-                       }
-               } else {
-                       vaddr = page_address(page) + offset;
-                       op(vaddr, len, dir);
-               }
-               offset = 0;
-               pfn++;
-               left -= len;
-       } while (left);
-}
-
-/*
- * Make an area consistent for devices.
- * Note: Drivers should NOT use this function directly, as it will break
- * platforms with CONFIG_DMABOUNCE.
- * Use the driver DMA support - see dma-mapping.h (dma_sync_*)
- */
-static void __dma_page_cpu_to_dev(struct page *page, unsigned long off,
-       size_t size, enum dma_data_direction dir)
-{
-       phys_addr_t paddr;
-
-       dma_cache_maint_page(page, off, size, dir, dmac_map_area);
-
-       paddr = page_to_phys(page) + off;
-       if (dir == DMA_FROM_DEVICE) {
-               outer_inv_range(paddr, paddr + size);
-       } else {
-               outer_clean_range(paddr, paddr + size);
-       }
-       /* FIXME: non-speculating: flush on bidirectional mappings? */
-}
-
-static void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
-       size_t size, enum dma_data_direction dir)
-{
-       phys_addr_t paddr = page_to_phys(page) + off;
-
-       /* FIXME: non-speculating: not required */
-       /* in any case, don't bother invalidating if DMA to device */
-       if (dir != DMA_TO_DEVICE) {
-               outer_inv_range(paddr, paddr + size);
-
-               dma_cache_maint_page(page, off, size, dir, dmac_unmap_area);
-       }
-
-       /*
-        * Mark the D-cache clean for these pages to avoid extra flushing.
-        */
-       if (dir != DMA_TO_DEVICE && size >= PAGE_SIZE) {
-               unsigned long pfn;
-               size_t left = size;
-
-               pfn = page_to_pfn(page) + off / PAGE_SIZE;
-               off %= PAGE_SIZE;
-               if (off) {
-                       pfn++;
-                       left -= PAGE_SIZE - off;
-               }
-               while (left >= PAGE_SIZE) {
-                       page = pfn_to_page(pfn++);
-                       set_bit(PG_dcache_clean, &page->flags);
-                       left -= PAGE_SIZE;
-               }
-       }
-}
-
 /**
  * arm_dma_map_sg - map a set of SG buffers for streaming mode DMA
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 3cced84..2b8b705 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -11,6 +11,7 @@
 #include <linux/mm.h>
 #include <linux/pagemap.h>
 #include <linux/highmem.h>
+#include <linux/dma-mapping.h>
 
 #include <asm/cacheflush.h>
 #include <asm/cachetype.h>
@@ -20,6 +21,7 @@
 #include <linux/hugetlb.h>
 
 #include "mm.h"
+#include "dma.h"
 
 #ifdef CONFIG_ARM_HEAVY_MB
 void (*soc_mb)(void);
@@ -415,3 +417,116 @@ void __flush_anon_page(struct vm_area_struct *vma, struct page *page, unsigned l
         */
        __cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
 }
+
+static void dma_cache_maint_page(struct page *page, unsigned long offset,
+       size_t size, enum dma_data_direction dir,
+       void (*op)(const void *, size_t, int))
+{
+       unsigned long pfn;
+       size_t left = size;
+
+       pfn = page_to_pfn(page) + offset / PAGE_SIZE;
+       offset %= PAGE_SIZE;
+
+       /*
+        * A single sg entry may refer to multiple physically contiguous
+        * pages. But we still need to process highmem pages individually.
+        * If highmem is not configured then the bulk of this loop gets
+        * optimized out.
+        */
+       do {
+               size_t len = left;
+               void *vaddr;
+
+               page = pfn_to_page(pfn);
+
+               if (PageHighMem(page)) {
+                       if (len + offset > PAGE_SIZE)
+                               len = PAGE_SIZE - offset;
+
+                       if (cache_is_vipt_nonaliasing()) {
+                               vaddr = kmap_atomic(page);
+                               op(vaddr + offset, len, dir);
+                               kunmap_atomic(vaddr);
+                       } else {
+                               vaddr = kmap_high_get(page);
+                               if (vaddr) {
+                                       op(vaddr + offset, len, dir);
+                                       kunmap_high(page);
+                               }
+                       }
+               } else {
+                       vaddr = page_address(page) + offset;
+                       op(vaddr, len, dir);
+               }
+               offset = 0;
+               pfn++;
+               left -= len;
+       } while (left);
+}
+
+/*
+ * Make an area consistent for devices.
+ * Note: Drivers should NOT use this function directly, as it will break
+ * platforms with CONFIG_DMABOUNCE.
+ * Use the driver DMA support - see dma-mapping.h (dma_sync_*)
+ */
+void __dma_page_cpu_to_dev(struct page *page, unsigned long off,
+       size_t size, enum dma_data_direction dir)
+{
+       phys_addr_t paddr;
+
+       dma_cache_maint_page(page, off, size, dir, dmac_map_area);
+
+       paddr = page_to_phys(page) + off;
+       if (dir == DMA_FROM_DEVICE) {
+               outer_inv_range(paddr, paddr + size);
+       } else {
+               outer_clean_range(paddr, paddr + size);
+       }
+       /* FIXME: non-speculating: flush on bidirectional mappings? */
+}
+
+void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
+       size_t size, enum dma_data_direction dir)
+{
+       phys_addr_t paddr = page_to_phys(page) + off;
+
+       /* FIXME: non-speculating: not required */
+       /* in any case, don't bother invalidating if DMA to device */
+       if (dir != DMA_TO_DEVICE) {
+               outer_inv_range(paddr, paddr + size);
+
+               dma_cache_maint_page(page, off, size, dir, dmac_unmap_area);
+       }
+
+       /*
+        * Mark the D-cache clean for these pages to avoid extra flushing.
+        */
+       if (dir != DMA_TO_DEVICE && size >= PAGE_SIZE) {
+               unsigned long pfn;
+               size_t left = size;
+
+               pfn = page_to_pfn(page) + off / PAGE_SIZE;
+               off %= PAGE_SIZE;
+               if (off) {
+                       pfn++;
+                       left -= PAGE_SIZE - off;
+               }
+               while (left >= PAGE_SIZE) {
+                       page = pfn_to_page(pfn++);
+                       set_bit(PG_dcache_clean, &page->flags);
+                       left -= PAGE_SIZE;
+               }
+       }
+}
+
+void kernel_force_cache_clean(struct page *page, size_t size)
+{
+       __dma_page_cpu_to_dev(page, 0, size, DMA_BIDIRECTIONAL);
+}
+
+void kernel_force_cache_invalidate(struct page *page, size_t size)
+{
+       __dma_page_dev_to_cpu(page, 0, size, DMA_BIDIRECTIONAL);
+}
diff --git a/arch/arm/mm/mm.h b/arch/arm/mm/mm.h
index ce727d4..9b853ac 100644
--- a/arch/arm/mm/mm.h
+++ b/arch/arm/mm/mm.h
@@ -1,9 +1,11 @@
 #ifdef CONFIG_MMU
 #include <linux/list.h>
 #include <linux/vmalloc.h>
+#include <linux/dma-mapping.h>
 
 #include <asm/pgtable.h>
 
+
 /* the upper-most page table pointer */
 extern pmd_t *top_pmd;
@@ -97,3 +99,9 @@ void arm_mm_memblock_reserve(void);
 void dma_contiguous_remap(void);
 
 unsigned long __clear_cr(unsigned long mask);
+
+void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
+       size_t size, enum dma_data_direction dir);
+
+void __dma_page_cpu_to_dev(struct page *page, unsigned long off,
+       size_t size, enum dma_data_direction dir);
On Mon, Aug 08, 2016 at 10:49:34AM -0700, Laura Abbott wrote:
> +/*
> + * Make an area consistent for devices.
> + * Note: Drivers should NOT use this function directly, as it will break
> + * platforms with CONFIG_DMABOUNCE.
> + * Use the driver DMA support - see dma-mapping.h (dma_sync_*)
> + */
> +void __dma_page_cpu_to_dev(struct page *page, unsigned long off,
> +	size_t size, enum dma_data_direction dir)
> +{
> +	phys_addr_t paddr;
> +
> +	dma_cache_maint_page(page, off, size, dir, dmac_map_area);
> +
> +	paddr = page_to_phys(page) + off;
> +	if (dir == DMA_FROM_DEVICE) {
> +		outer_inv_range(paddr, paddr + size);
> +	} else {
> +		outer_clean_range(paddr, paddr + size);
> +	}
> +	/* FIXME: non-speculating: flush on bidirectional mappings? */
> +}
> +
> +void __dma_page_dev_to_cpu(struct page *page, unsigned long off,
> +	size_t size, enum dma_data_direction dir)
> +{
> +	phys_addr_t paddr = page_to_phys(page) + off;
> +
> +	/* FIXME: non-speculating: not required */
> +	/* in any case, don't bother invalidating if DMA to device */
> +	if (dir != DMA_TO_DEVICE) {
> +		outer_inv_range(paddr, paddr + size);
> +
> +		dma_cache_maint_page(page, off, size, dir, dmac_unmap_area);
> +	}
> +
> +	/*
> +	 * Mark the D-cache clean for these pages to avoid extra flushing.
> +	 */
> +	if (dir != DMA_TO_DEVICE && size >= PAGE_SIZE) {
> +		unsigned long pfn;
> +		size_t left = size;
> +
> +		pfn = page_to_pfn(page) + off / PAGE_SIZE;
> +		off %= PAGE_SIZE;
> +		if (off) {
> +			pfn++;
> +			left -= PAGE_SIZE - off;
> +		}
> +		while (left >= PAGE_SIZE) {
> +			page = pfn_to_page(pfn++);
> +			set_bit(PG_dcache_clean, &page->flags);
> +			left -= PAGE_SIZE;
> +		}
> +	}
> +}
I _really_ don't want these exposed in any shape or form to driver code. I've seen too many hacks out there where people have gone under the cover of the APIs they should be using, and headed straight for the low-level functionality - adding function prototypes to get at stuff they have no business doing. Moving this here is just asking for it to be abused.
> +void kernel_force_cache_clean(struct page *page, size_t size)
> +{
> +	__dma_page_cpu_to_dev(page, 0, size, DMA_BIDIRECTIONAL);
> +}
> +
> +void kernel_force_cache_invalidate(struct page *page, size_t size)
> +{
> +	__dma_page_dev_to_cpu(page, 0, size, DMA_BIDIRECTIONAL);
> +}
Nothing in our implementation of these DMA operations guarantees that those mean "clean" and "invalidate". The DMA operations are there so that CPUs can implement whatever they need at the map and unmap times - and I've been very careful not to specify which cache operations are involved.
For example, on older CPUs, __dma_page_dev_to_cpu() is almost always a no-op.
If you want something that does something specific, then we need something designed to do something specific. Please don't re-use what you think will fit.
On 08/10/2016 04:22 PM, Russell King - ARM Linux wrote:
> On Mon, Aug 08, 2016 at 10:49:34AM -0700, Laura Abbott wrote:
>> [...]
>
> I _really_ don't want these exposed in any shape or form to driver
> code. I've seen too many hacks out there where people have gone under
> the cover of the APIs they should be using, and headed straight for
> the low-level functionality - adding function prototypes to get at
> stuff they have no business doing. Moving this here is just asking
> for it to be abused.
>
>> +void kernel_force_cache_clean(struct page *page, size_t size)
>> +{
>> +	__dma_page_cpu_to_dev(page, 0, size, DMA_BIDIRECTIONAL);
>> +}
>> +
>> +void kernel_force_cache_invalidate(struct page *page, size_t size)
>> +{
>> +	__dma_page_dev_to_cpu(page, 0, size, DMA_BIDIRECTIONAL);
>> +}
>
> Nothing in our implementation of these DMA operations guarantees that
> those mean "clean" and "invalidate". The DMA operations are there so
> that CPUs can implement whatever they need at the map and unmap times -
> and I've been very careful not to specify which cache operations are
> involved.
>
> For example, on older CPUs, __dma_page_dev_to_cpu() is almost always
> a no-op.
>
> If you want something that does something specific, then we need
> something designed to do something specific. Please don't re-use what
> you think will fit.
I see what you are saying. What I really wanted was to re-use some of the code that dma_cache_maint_page was doing for highmem handling but it looks like I picked the wrong layer to make common. I'll give this some thought.
Thanks, Laura
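For reference, the highmem handling in dma_cache_maint_page() that the reply above refers to boils down to a per-page walk of roughly this shape (an illustrative sketch only, not part of the posted series; force_cache_maint() and cache_op_t are hypothetical names):

#include <linux/highmem.h>
#include <linux/mm.h>

typedef void (*cache_op_t)(const void *start, size_t len);

static void force_cache_maint(struct page *page, size_t size, cache_op_t op)
{
        unsigned long pfn = page_to_pfn(page);
        size_t left = size;

        while (left) {
                size_t len = min_t(size_t, left, PAGE_SIZE);
                void *vaddr;

                page = pfn_to_page(pfn);
                if (PageHighMem(page)) {
                        /* Highmem pages need a temporary kernel mapping. */
                        vaddr = kmap_atomic(page);
                        op(vaddr, len);
                        kunmap_atomic(vaddr);
                } else {
                        /* Lowmem pages are always in the linear mapping. */
                        op(page_address(page), len);
                }
                pfn++;
                left -= len;
        }
}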
arm64 may need to guarantee the caches are synced. Implement versions of the kernel_force_cache API based on the DMA APIs.
Signed-off-by: Laura Abbott <labbott@redhat.com>
---
 arch/arm64/include/asm/cacheflush.h |  5 +++++
 arch/arm64/mm/flush.c               | 11 +++++++++++
 2 files changed, 16 insertions(+)
diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
index c64268d..9980dd8 100644
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -149,4 +149,9 @@ int set_memory_rw(unsigned long addr, int numpages);
 int set_memory_x(unsigned long addr, int numpages);
 int set_memory_nx(unsigned long addr, int numpages);
 
+#define ARCH_HAS_FORCE_CACHE 1
+
+void kernel_force_cache_clean(struct page *page, size_t size);
+void kernel_force_cache_invalidate(struct page *page, size_t size);
+
 #endif
diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
index 43a76b0..0af78ab 100644
--- a/arch/arm64/mm/flush.c
+++ b/arch/arm64/mm/flush.c
@@ -20,6 +20,7 @@
 #include <linux/export.h>
 #include <linux/mm.h>
 #include <linux/pagemap.h>
+#include <linux/dma-mapping.h>
 
 #include <asm/cacheflush.h>
 #include <asm/cachetype.h>
@@ -94,3 +95,13 @@ EXPORT_SYMBOL(flush_dcache_page);
  * Additional functions defined in assembly.
  */
 EXPORT_SYMBOL(flush_icache_range);
+
+void kernel_force_cache_clean(struct page *page, size_t size)
+{
+       __dma_map_area(page_address(page), size, DMA_BIDIRECTIONAL);
+}
+
+void kernel_force_cache_invalidate(struct page *page, size_t size)
+{
+       __dma_unmap_area(page_address(page), size, DMA_BIDIRECTIONAL);
+}
Now that a proper set of cache sync APIs exists, move away from dma_sync and do fewer bad things.
Signed-off-by: Laura Abbott <labbott@redhat.com>
---
 drivers/staging/android/ion/ion.c               | 22 ++++------------------
 drivers/staging/android/ion/ion_carveout_heap.c |  8 +++++---
 drivers/staging/android/ion/ion_chunk_heap.c    | 12 +++++++-----
 drivers/staging/android/ion/ion_page_pool.c     |  6 ++++--
 drivers/staging/android/ion/ion_priv.h          | 11 -----------
 drivers/staging/android/ion/ion_system_heap.c   |  6 +++---
 6 files changed, 23 insertions(+), 42 deletions(-)
diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
index a2cf93b..5cbe22e 100644
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -37,6 +37,8 @@
 #include <linux/dma-buf.h>
 #include <linux/idr.h>
 
+#include <linux/cacheflush.h>
+
 #include "ion.h"
 #include "ion_priv.h"
 #include "compat_ion.h"
@@ -957,22 +959,6 @@ static void ion_unmap_dma_buf(struct dma_buf_attachment *attachment,
 {
 }
 
-void ion_pages_sync_for_device(struct device *dev, struct page *page,
-               size_t size, enum dma_data_direction dir)
-{
-       struct scatterlist sg;
-
-       sg_init_table(&sg, 1);
-       sg_set_page(&sg, page, size, 0);
-       /*
-        * This is not correct - sg_dma_address needs a dma_addr_t that is valid
-        * for the targeted device, but this works on the currently targeted
-        * hardware.
-        */
-       sg_dma_address(&sg) = page_to_phys(page);
-       dma_sync_sg_for_device(dev, &sg, 1, dir);
-}
-
 struct ion_vma_list {
        struct list_head list;
        struct vm_area_struct *vma;
@@ -997,8 +983,8 @@ static void ion_buffer_sync_for_device(struct ion_buffer *buffer,
                struct page *page = buffer->pages[i];
 
                if (ion_buffer_page_is_dirty(page))
-                       ion_pages_sync_for_device(dev, ion_buffer_page(page),
-                                                       PAGE_SIZE, dir);
+                       kernel_force_cache_clean(ion_buffer_page(page),
+                                                       PAGE_SIZE);
 
                ion_buffer_page_clean(buffer->pages + i);
        }
diff --git a/drivers/staging/android/ion/ion_carveout_heap.c b/drivers/staging/android/ion/ion_carveout_heap.c
index 1fb0d81..34c38b0 100644
--- a/drivers/staging/android/ion/ion_carveout_heap.c
+++ b/drivers/staging/android/ion/ion_carveout_heap.c
@@ -22,6 +22,9 @@
 #include <linux/scatterlist.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
+
+#include <linux/cacheflush.h>
+
 #include "ion.h"
 #include "ion_priv.h"
@@ -116,8 +119,7 @@ static void ion_carveout_heap_free(struct ion_buffer *buffer)
        ion_heap_buffer_zero(buffer);
 
        if (ion_buffer_cached(buffer))
-               dma_sync_sg_for_device(NULL, table->sgl, table->nents,
-                                       DMA_BIDIRECTIONAL);
+               kernel_force_cache_clean(page, buffer->size);
 
        ion_carveout_free(heap, paddr, buffer->size);
        sg_free_table(table);
@@ -157,7 +159,7 @@ struct ion_heap *ion_carveout_heap_create(struct ion_platform_heap *heap_data)
        page = pfn_to_page(PFN_DOWN(heap_data->base));
        size = heap_data->size;
 
-       ion_pages_sync_for_device(NULL, page, size, DMA_BIDIRECTIONAL);
+       kernel_force_cache_clean(page, size);
 
        ret = ion_heap_pages_zero(page, size, pgprot_writecombine(PAGE_KERNEL));
        if (ret)
diff --git a/drivers/staging/android/ion/ion_chunk_heap.c b/drivers/staging/android/ion/ion_chunk_heap.c
index e0553fe..dde14f3 100644
--- a/drivers/staging/android/ion/ion_chunk_heap.c
+++ b/drivers/staging/android/ion/ion_chunk_heap.c
@@ -21,6 +21,9 @@
 #include <linux/scatterlist.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
+
+#include <linux/cacheflush.h>
+
 #include "ion.h"
 #include "ion_priv.h"
@@ -104,11 +107,10 @@ static void ion_chunk_heap_free(struct ion_buffer *buffer)
ion_heap_buffer_zero(buffer);
 
-       if (ion_buffer_cached(buffer))
-               dma_sync_sg_for_device(NULL, table->sgl, table->nents,
-                                       DMA_BIDIRECTIONAL);
-
        for_each_sg(table->sgl, sg, table->nents, i) {
+               if (ion_buffer_cached(buffer))
+                       kernel_force_cache_clean(sg_page(table->sgl),
+                                               sg->length);
                gen_pool_free(chunk_heap->pool, page_to_phys(sg_page(sg)),
                              sg->length);
        }
@@ -148,7 +150,7 @@ struct ion_heap *ion_chunk_heap_create(struct ion_platform_heap *heap_data)
        page = pfn_to_page(PFN_DOWN(heap_data->base));
        size = heap_data->size;
 
-       ion_pages_sync_for_device(NULL, page, size, DMA_BIDIRECTIONAL);
+       kernel_force_cache_clean(page, size);
 
        ret = ion_heap_pages_zero(page, size, pgprot_writecombine(PAGE_KERNEL));
        if (ret)
diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
index 1fe8016..51805d2 100644
--- a/drivers/staging/android/ion/ion_page_pool.c
+++ b/drivers/staging/android/ion/ion_page_pool.c
@@ -22,6 +22,9 @@
 #include <linux/init.h>
 #include <linux/slab.h>
 #include <linux/swap.h>
+
+#include <linux/cacheflush.h>
+
 #include "ion_priv.h"
 
 static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
@@ -30,8 +33,7 @@ static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
 
        if (!page)
                return NULL;
-       ion_pages_sync_for_device(NULL, page, PAGE_SIZE << pool->order,
-                                 DMA_BIDIRECTIONAL);
+       kernel_force_cache_clean(page, PAGE_SIZE << pool->order);
        return page;
 }
 
diff --git a/drivers/staging/android/ion/ion_priv.h b/drivers/staging/android/ion/ion_priv.h
index 0239883..5828738 100644
--- a/drivers/staging/android/ion/ion_priv.h
+++ b/drivers/staging/android/ion/ion_priv.h
@@ -392,15 +392,4 @@ void ion_page_pool_free(struct ion_page_pool *, struct page *);
 int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask,
                          int nr_to_scan);
 
-/**
- * ion_pages_sync_for_device - cache flush pages for use with the specified
- *                             device
- * @dev:        the device the pages will be used with
- * @page:       the first page to be flushed
- * @size:       size in bytes of region to be flushed
- * @dir:        direction of dma transfer
- */
-void ion_pages_sync_for_device(struct device *dev, struct page *page,
-               size_t size, enum dma_data_direction dir);
-
 #endif /* _ION_PRIV_H */
diff --git a/drivers/staging/android/ion/ion_system_heap.c b/drivers/staging/android/ion/ion_system_heap.c
index b69dfc7..04955f4 100644
--- a/drivers/staging/android/ion/ion_system_heap.c
+++ b/drivers/staging/android/ion/ion_system_heap.c
@@ -23,6 +23,7 @@
 #include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
+#include <linux/cacheflush.h>
 #include "ion.h"
 #include "ion_priv.h"
@@ -70,8 +71,7 @@ static struct page *alloc_buffer_page(struct ion_system_heap *heap,
                page = alloc_pages(gfp_flags | __GFP_COMP, order);
                if (!page)
                        return NULL;
-               ion_pages_sync_for_device(NULL, page, PAGE_SIZE << order,
-                                               DMA_BIDIRECTIONAL);
+               kernel_force_cache_clean(page, PAGE_SIZE << order);
        }
 
        return page;
@@ -360,7 +360,7 @@ static int ion_system_contig_heap_allocate(struct ion_heap *heap,
buffer->priv_virt = table;
 
-       ion_pages_sync_for_device(NULL, page, len, DMA_BIDIRECTIONAL);
+       kernel_force_cache_clean(page, len);
return 0;
From: Laura Abbott <labbott@fedoraproject.org>
dma_buf added support for a userspace syncing ioctl. It is implemented by calling dma_buf_begin_cpu_access and dma_buf_end_cpu_access. Ion currently lacks cache operations on this code path. Add them for compatibility with the dma_buf ioctl.
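For reference, the userspace side of that ioctl looks roughly like this (illustrative only; error handling is omitted and the fd is assumed to be a dma-buf fd exported by Ion):

#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/* Bracket a CPU write to an mmap'ed dma-buf with the sync ioctl. */
static void cpu_write_window(int dmabuf_fd)
{
        struct dma_buf_sync sync = { 0 };

        /* Maps to dma_buf_begin_cpu_access() in the kernel. */
        sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_WRITE;
        ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);

        /* ...CPU writes to the mmap'ed buffer go here... */

        /* Maps to dma_buf_end_cpu_access() in the kernel. */
        sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_WRITE;
        ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
}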
Signed-off-by: Laura Abbott <labbott@redhat.com>
---
 drivers/staging/android/ion/ion.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
index 5cbe22e..8153af3 100644
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -1109,6 +1109,24 @@ static void ion_dma_buf_kunmap(struct dma_buf *dmabuf, unsigned long offset,
 {
 }
 
+static void ion_clean_buffer(struct ion_buffer *buffer)
+{
+       struct scatterlist *sg;
+       int i;
+
+       for_each_sg(buffer->sg_table->sgl, sg, buffer->sg_table->orig_nents, i)
+               kernel_force_cache_clean(sg_page(sg), sg->length);
+}
+
+static void ion_invalidate_buffer(struct ion_buffer *buffer)
+{
+       struct scatterlist *sg;
+       int i;
+
+       for_each_sg(buffer->sg_table->sgl, sg, buffer->sg_table->orig_nents, i)
+               kernel_force_cache_invalidate(sg_page(sg), sg->length);
+}
+
 static int ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
                                        enum dma_data_direction direction)
 {
@@ -1124,6 +1142,11 @@ static int ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
        mutex_lock(&buffer->lock);
        vaddr = ion_buffer_kmap_get(buffer);
        mutex_unlock(&buffer->lock);
+
+       if (direction != DMA_TO_DEVICE) {
+               ion_invalidate_buffer(buffer);
+       }
+
        return PTR_ERR_OR_ZERO(vaddr);
 }
@@ -1136,6 +1159,12 @@ static int ion_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
        ion_buffer_kmap_put(buffer);
        mutex_unlock(&buffer->lock);
 
+       if (direction == DMA_FROM_DEVICE) {
+               ion_invalidate_buffer(buffer);
+       } else {
+               ion_clean_buffer(buffer);
+       }
+
        return 0;
 }
@@ -1266,6 +1295,8 @@ static int ion_sync_for_device(struct ion_client *client, int fd)
        struct dma_buf *dmabuf;
        struct ion_buffer *buffer;
 
+       WARN_ONCE(1, "This API is deprecated in favor of the dma_buf ioctl\n");
+
        dmabuf = dma_buf_get(fd);
        if (IS_ERR(dmabuf))
                return PTR_ERR(dmabuf);