Hi Kiran,
On 24 March 2014 02:35, kiran kiranchandramohan@gmail.com wrote:
Hi,
I was looking at some code (given below) which seems to perform very badly when attachments and detachments are used to simulate cache coherency.
Care to give more details on your use case, e.g. which processor(s) you are targeting? To me, using the attach/detach mechanism to 'simulate' cache coherency sounds fairly wrong: ideally, cache coherency management should be done by the exporter in your system. I presume you have had a chance to read through the Documentation/dma-buf-sharing.txt file, which gives some good details on what to expect from dma-bufs?
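To illustrate what "coherency handled by the exporter" typically looks like, here is a minimal sketch of exporter map/unmap callbacks that let the streaming DMA API do the cache maintenance when an importer maps the buffer. Everything here (my_buffer, the ops names, the single shared sg_table, the omitted callbacks) is hypothetical and only meant as a sketch, not code from an actual driver:

#include <linux/err.h>
#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

/* Hypothetical exporter-private data hung off dma_buf->priv. */
struct my_buffer {
	struct sg_table sgt;		/* pages backing the buffer */
};

static struct sg_table *my_map_dma_buf(struct dma_buf_attachment *attach,
				       enum dma_data_direction dir)
{
	struct my_buffer *buf = attach->dmabuf->priv;

	/* dma_map_sg() performs the CPU cache maintenance needed before
	 * the importing device may touch the memory on a non-coherent
	 * system, so the importer never has to care. */
	if (!dma_map_sg(attach->dev, buf->sgt.sgl, buf->sgt.nents, dir))
		return ERR_PTR(-ENOMEM);

	return &buf->sgt;
}

static void my_unmap_dma_buf(struct dma_buf_attachment *attach,
			     struct sg_table *sgt,
			     enum dma_data_direction dir)
{
	dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, dir);
}

static const struct dma_buf_ops my_dma_buf_ops = {
	.map_dma_buf	= my_map_dma_buf,
	.unmap_dma_buf	= my_unmap_dma_buf,
	/* .release, .mmap, .begin_cpu_access, ... omitted for brevity */
};

With something like this, an importer that keeps its attachment mapped only pays the maintenance cost when the buffer actually changes ownership, not on every loop iteration.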
Best regards, ~Sumit.
In the code below, when remote_attach is false (i.e. no remote processors), using just the two A9 cores the loop runs in 8.8 seconds. But when remote_attach is true, even though other cores are also executing and sharing the workload, the same loop takes 52.7 seconds. This shows that detach and attach are very heavy for this kind of code. (The detach system call performs dma_buf_unmap_attachment and dma_buf_detach; the attach system call performs dma_buf_attach and dma_buf_map_attachment.)
for (k = 0; k < N; k++) {
    if (remote_attach) {
        detach(path);
        attach(path);
    }

    for (i = start_indx; i < end_indx; i++) {
        for (j = 0; j < N; j++) {
            if (path[i][j] < (path[i][k] + path[k][j])) {
                path[i][j] = path[i][k] + path[k][j];
            }
        }
    }
}
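For reference, what the attach and detach system calls reduce to on the kernel side is roughly the standard importer sequence below; my_attach/my_detach and the error handling are a sketch rather than the exact driver code:

#include <linux/err.h>
#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>

/* 'attach' path (sketch): attach the device and map the buffer.
 * dma_buf_map_attachment() typically ends up cleaning/invalidating the
 * CPU caches for the whole buffer, which is why doing this on every k
 * iteration is so expensive. */
static int my_attach(struct device *dev, struct dma_buf *dmabuf,
		     struct dma_buf_attachment **attachp,
		     struct sg_table **sgtp)
{
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;

	attach = dma_buf_attach(dmabuf, dev);
	if (IS_ERR(attach))
		return PTR_ERR(attach);

	sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
	if (IS_ERR(sgt)) {
		dma_buf_detach(dmabuf, attach);
		return PTR_ERR(sgt);
	}

	*attachp = attach;
	*sgtp = sgt;
	return 0;
}

/* 'detach' path (sketch): undo the above. */
static void my_detach(struct dma_buf *dmabuf,
		      struct dma_buf_attachment *attach,
		      struct sg_table *sgt)
{
	dma_buf_unmap_attachment(attach, sgt, DMA_BIDIRECTIONAL);
	dma_buf_detach(dmabuf, attach);
}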
I would like to manage the cache explicitly and flush cache lines rather than whole pages to reduce overhead. I also want to access these buffers from userspace. I can change some kernel code for this. Where should I start?
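To make it concrete, the kind of finer-grained maintenance I have in mind would be roughly the following, syncing only the rows a remote core has touched via the streaming DMA API instead of remapping the whole buffer. dev, dma_handle and row_size are placeholders for whatever the driver actually has, and the buffer is assumed to be streaming-mapped DMA_BIDIRECTIONAL:

#include <linux/dma-mapping.h>

/* Sketch: hand only the rows just written by the remote cores back to
 * the CPU, instead of unmapping/remapping (and flushing) everything. */
static void sync_rows_for_cpu(struct device *dev, dma_addr_t dma_handle,
			      size_t row_size, int start_row, int nr_rows)
{
	dma_sync_single_range_for_cpu(dev, dma_handle,
				      start_row * row_size,
				      nr_rows * row_size,
				      DMA_BIDIRECTIONAL);
}

/* Give the same rows back to the remote cores before the next pass. */
static void sync_rows_for_device(struct device *dev, dma_addr_t dma_handle,
				 size_t row_size, int start_row, int nr_rows)
{
	dma_sync_single_range_for_device(dev, dma_handle,
					 start_row * row_size,
					 nr_rows * row_size,
					 DMA_BIDIRECTIONAL);
}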
Thanks in advance.
--Kiran