Hi,
On Wed, Apr 20, 2011 at 02:52:56PM -0700, Rebecca Schultz Zavin wrote:
> The Android team's graphics folks, and their counterparts at Intel, Imagination Tech, NVIDIA, Qualcomm and ARM have all told me that they need a method for mapping buffers uncached to userspace. The common case for this is to write vertices, textures, etc. to these buffers once and never touch them again. This may happen several (or even several tens or more) times per frame. My experience with cache flushes on ARM architectures matches Marek's. Typically write-combining makes streaming writes really, really fast, and on several SoCs we've found it cheaper to flush the whole cache than to flush by line. Clearly this impacts the rest of system performance, not to mention that with a couple of large textures you've totally blown your caches for the rest of the system.
This was exactly our experience with OMAP hardware (the Nokia N900, running an OMAP3430); we made heavy use of uncached regions for exactly this reason, and measured a fairly dramatic performance win in both artificial and realistic use cases.
(This was with a Cortex-A8, so happy days.)
Cheers, Daniel