As proposed yesterday, here are the Android sync driver patches for staging.
I've preserved the commit history, but moved all the changes over to be against the staging directory (instead of drivers/base).
The goal of submitting this driver to staging is to try to get more collaboration, as there are some similar efforts going on in the community with dmabuf-fences. My email from yesterday with more details for how I hope this goes is here: http://comments.gmane.org/gmane.linux.kernel/1448420
Erik also provided a nice background on the patch set in his reply yesterday, which I'll quote here:
"In Honeycomb we introduced the Hardware Composer HAL. This is a userspace layer that allows composition acceleration on a per-platform basis. Different SoC vendors have implemented this using overlays, 2d blitters, a combination of both, or other clever/disgusting means. Along with the HWC we consolidated a lot of our camera and media pipeline to allow their input to be fed into the GPU or display (overlay). In order to exploit parallelism in the graphics pipeline, this introduced lots of implicit synchronization dependencies. After a couple of years of working with many different SoC vendors, we found that it was really difficult to communicate our system's expectations of the implicit contract and it was difficult for the SoC vendors to properly implement the implicit contract in each of their IP blocks (display, gpu, camera, video codecs). It was also incredibly difficult to debug when problems/deadlocks arose.
In an effort to clean up the situation we decided to create a set of simple synchronization primitives and have our compositor (SurfaceFlinger) manage the synchronization contract explicitly. We designed these primitives so that they can be passed across processes (much like ion/dma_buf handles), can be backed by hardware synchronization primitives, and can be combined with other sync dependencies in a heterogeneous manner. We also added enough debugging information to make pinpointing a synchronization deadlock bug easier. There are also OpenGL extensions added (which I believe have been ratified by Khronos) to convert a "native" sync object to a gl fence object and vice versa.
So far we have shipped this system on two products (the Nexus 10 and 4) with two different SoCs (Samsung Exynos5250 and Qualcomm MSM8064). On these two projects it was much easier to work out the kinks in the graphics/compositing pipelines. In addition, we were able to use the telemetry and tracing features to track down the causes of dropped frames aka "jank."
As for the implementation, I started with having the main driver op primitive be a wait() op. I quickly noticed that most of the tricky race-condition-prone code was ending up in the driver's wait() op. It also made handling asynchronous waits of more than one type of sync_pt difficult to manage. In the end I opted for something roughly like poll() where all the heavy lifting is done at the high level and the drivers only need to implement a simple check function."
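To make the poll()-like split Erik describes a bit more concrete, here is a minimal userspace sketch. It is only an illustration under stated assumptions: every name in it (sketch_timeline, sketch_pt, sketch_fence, counter_has_signaled and so on) is hypothetical and is not the API in the patches. The "driver" supplies nothing but a non-blocking check function, and the generic code walks the points of a fence and decides whether the whole fence has signaled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the API from the patch set. */

struct sketch_pt;

/* The only thing a driver supplies: a non-blocking check, no wait() op. */
struct sketch_timeline_ops {
	bool (*has_signaled)(const struct sketch_pt *pt);
};

struct sketch_timeline {
	const struct sketch_timeline_ops *ops;
	unsigned int value;	/* monotonically increasing counter */
};

struct sketch_pt {
	struct sketch_timeline *timeline;
	unsigned int target;	/* signaled once timeline->value reaches this */
};

#define SKETCH_MAX_PTS 8

/* A fence is a collection of points, possibly from different timelines. */
struct sketch_fence {
	const struct sketch_pt *pts[SKETCH_MAX_PTS];
	size_t num_pts;
};

/* "Driver" side: a CPU-counter timeline, the simplest possible check. */
static bool counter_has_signaled(const struct sketch_pt *pt)
{
	return pt->timeline->value >= pt->target;
}

static const struct sketch_timeline_ops counter_ops = {
	.has_signaled = counter_has_signaled,
};

/* Generic side: add a point to a fence. */
static void sketch_fence_add(struct sketch_fence *fence,
			     const struct sketch_pt *pt)
{
	assert(fence->num_pts < SKETCH_MAX_PTS);
	fence->pts[fence->num_pts++] = pt;
}

/* Generic side: the fence has signaled only when every point has. */
static bool sketch_fence_signaled(const struct sketch_fence *fence)
{
	for (size_t i = 0; i < fence->num_pts; i++) {
		const struct sketch_pt *pt = fence->pts[i];

		if (!pt->timeline->ops->has_signaled(pt))
			return false;
	}
	return true;
}

/* Advance a timeline; generic code would re-run the check afterwards. */
static void sketch_timeline_inc(struct sketch_timeline *tl, unsigned int n)
{
	tl->value += n;
}
```

With two timelines and one point on each, a fence built from both points reports signaled only after both timelines have advanced past their targets. The driver-side code never blocks or sleeps, which is the property that keeps the race-prone logic in one shared place.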
Anyway, let me know what you think of the patches, and hopefully this is something that could be considered for staging for 3.10.
thanks -john
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Erik Gilling <konkers@android.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Rob Clark <robclark@gmail.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Android Kernel Team <kernel-team@android.com>
Erik Gilling (26):
  staging: sync: Add synchronization framework
  staging: sw_sync: Add cpu based sync driver
  staging: sync: Add timestamps to sync_pts
  staging: sync: Add debugfs support
  staging: sw_sync: Add debug support
  staging: sync: Add ioctl to get fence data
  staging: sw_sync: Add fill_driver_data support
  staging: sync: Add poll support
  staging: sync: Allow async waits to be canceled
  staging: sync: Export sync API symbols
  staging: sw_sync: Export sw_sync API
  staging: sync: Reorder sync_fence_release
  staging: sync: Optimize fence merges
  staging: sync: Add internal refcounting to fences
  staging: sync: Add reference counting to timelines
  staging: sync: Change wait timeout to mirror poll semantics
  staging: sync: Dump sync state to console on timeout
  staging: sync: Improve timeout dump messages
  staging: sync: Dump sync state on fence errors
  staging: sync: Protect unlocked access to fence status
  staging: sync: Update new fence status with sync_fence_signal_pt
  staging: sync: Use proper barriers when waiting indefinitely
  staging: sync: Refactor sync debug printing
  staging: sw_sync: Convert to use new value_str debug ops
  staging: sync: Add tracepoint support
  staging: sync: Don't log wait timeouts when timeout = 0

Jamie Gennis (1):
  staging: sync: Fix timeout = 0 wait behavior

Rebecca Schultz Zavin (2):
  staging: sync: Fix error paths
  staging: sw_sync: Fix error paths

Ørjan Eide (1):
  staging: sync: Fix race condition between merge and signal
 drivers/staging/android/Kconfig      |   27 +
 drivers/staging/android/Makefile     |    2 +
 drivers/staging/android/sw_sync.c    |  263 +++++++++
 drivers/staging/android/sw_sync.h    |   58 ++
 drivers/staging/android/sync.c       | 1016 ++++++++++++++++++++++++++++++++++
 drivers/staging/android/sync.h       |  426 ++++++++++++++
 drivers/staging/android/trace/sync.h |   82 +++
 7 files changed, 1874 insertions(+)
 create mode 100644 drivers/staging/android/sw_sync.c
 create mode 100644 drivers/staging/android/sw_sync.h
 create mode 100644 drivers/staging/android/sync.c
 create mode 100644 drivers/staging/android/sync.h
 create mode 100644 drivers/staging/android/trace/sync.h
On Thu, Feb 28, 2013 at 04:42:56PM -0800, John Stultz wrote:
As proposed yesterday, here are the Android sync driver patches for staging.
I've preserved the commit history, but moved all the changes over to be against the staging directory (instead of drivers/base).
The goal of submitting this driver to staging is to try to get more collaboration, as there are some similar efforts going on in the community with dmabuf-fences. My email from yesterday with more details for how I hope this goes is here: http://comments.gmane.org/gmane.linux.kernel/1448420
Erik also provided a nice background on the patch set in his reply yesterday, which I'll quote here:
<snip>
Mind if I put that in the 1/30 changelog body for future people to see?
Other than that, at first glance, I only have one minor question, which I'll make in the patch itself. Otherwise, if there are no objections, I'll queue these up in my tree after 3.9-rc1 is out.
thanks for doing this work,
greg k-h
On Thu, Feb 28, 2013 at 5:59 PM, Greg KH gregkh@linuxfoundation.org wrote:
On Thu, Feb 28, 2013 at 04:42:56PM -0800, John Stultz wrote:
Erik also provided a nice background on the patch set in his reply yesterday, which I'll quote here:
Mind if I put that in the 1/30 changelog body for future people to see?
Please do.
Other than that, at first glance, I only have one minor question, which I'll make in the patch itself. Otherwise, if there are no objections, I'll queue these up in my tree after 3.9-rc1 is out.
thanks for doing this work,
Indeed, thanks John
Cheers, Erik
On Fri, Mar 1, 2013 at 1:42 AM, John Stultz john.stultz@linaro.org wrote:
As proposed yesterday, here are the Android sync driver patches for staging.
I've preserved the commit history, but moved all the changes over to be against the staging directory (instead of drivers/base).
The goal of submitting this driver to staging is to try to get more collaboration, as there are some similar efforts going on in the community with dmabuf-fences. My email from yesterday with more details for how I hope this goes is here: http://comments.gmane.org/gmane.linux.kernel/1448420
I've been offline for a week of snowboarding, but I'll throw in my late Ack - I've discussed this a bit with John offline and I agree with his general plan for integrating Android sync points into mainline.
Erik also provided a nice background on the patch set in his reply yesterday, which I'll quote here:
"In Honeycomb we introduced the Hardware Composer HAL. This is a userspace layer that allows composition acceleration on a per-platform basis. Different SoC vendors have implemented this using overlays, 2d blitters, a combination of both, or other clever/disgusting means. Along with the HWC we consolidated a lot of our camera and media pipeline to allow their input to be fed into the GPU or display (overlay). In order to exploit parallelism in the graphics pipeline, this introduced lots of implicit synchronization dependencies. After a couple of years of working with many different SoC vendors, we found that it was really difficult to communicate our system's expectations of the implicit contract and it was difficult for the SoC vendors to properly implement the implicit contract in each of their IP blocks (display, gpu, camera, video codecs). It was also incredibly difficult to debug when problems/deadlocks arose.
dma_buf fences should be tons easier to debug thanks to integration with lockdep. Also, their design fundamentally excludes deadlock loops in the fences themselves. And I also think that we should be able to hide the complexity from most drivers in e.g. drm/ttm or the v4l core. So I'm still bullish on implicit fencing (and will keep on pushing that for all things Intel).
But I guess the simpler programming model that implicit fencing affords userspace isn't of much use to the Google guys now that they've pushed the effort to convert SurfaceFlinger to explicit fence handling ...
Cheers, Daniel