Short GEM Introduction - Linaro-mm-sig

7 May 2011


      Hi all,
A bit later than what I've hoped for, but here we go [Jesse and Dave,
please correct/clarify/extend where you see fit]:
The core idea of GEM is to identify graphic buffer objects with 32bit ids.
The reason being "X runs out of open fds" (KDE easily reaches a few
thousand).
The core design principle behind GEM is that the kernel is in full control
of the allocation of these buffer objects and is free to move the around
in any way it sees fit. This is to make concurrent rendering by multiple
processes possible while userspace can still assume that it is in sole
possession of the gpu - GEM means "graphics execution manager".
Below some more details on what GEM is and does, what it does
_not_ do and how it relates to other graphic subsystems.
GEM does ...
------------
- lifecycle management. Userspace references are associated with the drm
  fd and get reaped on close (in case userspace forgets about them).
- per-device global names to exchange buffers between processes (eg dri2).
  These names are again 32bit ids. These global ids do not count as
  userspace references and don't prevent a buffer from being reaped.
- it implements very few generic ioctls:
  * flink for creating a global name for a buffer object
  * open for getting a per-fd handle to a buffer object with a global name
  * close for dropping a per-fd handle.
- a little bit of kernel-internal helpers to facilitate mmap (by blending
  multiple buffer objects into the single drm device address space) and a
  few other things.
That's it, i.e. GEM is very much meant to be as simple as possible.
Driver-specific GEM ioctls
--------------------------
The generic GEM stuff is obviously not very useful. So drivers implement
quite a bit driver-specific ioctls, like:
- buffer creation. In recent kernels there is some support to create dumb
  scanout objects for KMS. But they're only really useful for boot-splashs
  and unaccelerated dumb KMS drivers. Creating buffers usable for
  rendering is only possible with driver specific ioctls.
- command submission. An important part is mapping abstract buffer ids to
  actual gpu address (and rewriting batchbuffers with these). In the
  future, with support for virtual gpu address spaces this might change.
- tiling management. The kernel needs to know this to correctly
  tile/detile buffers when moving them around (e.g. evicting from vram).
- command completion signalling and gpu/cpu synchronization.
There are currently two approaches for implementing a GEM driver:
- roll-your-own, used by drm/i915 (and sometimes getting flaked for NIH).
- ttm-base: radeon & nouveau.
GEM does not ...
----------------
This still leaves out a few things that I've seen mentioned as
ideas/requirements here and elsewhere:
- cross-device buffer sharing and namespaces (see below) and
- buffer format handling and mediation between different users (except
  tiling as mentioned above). The reason here is that gpus are a mess
  and one of the worst parts is format handling. Better keep that out
  of the kernel ...
KMS (kernel mode setting)
-------------------------
KMS is essentially just a port of the xrandr api to the kernel as an ioctl
interface:
- crtcs feed (possible multiple) outputs and get their data from a
  framebuffer object. A major part of KMS is also the support for
  vsynced-pageflipping of framebuffers.
- Internally there's some support infrastructure to simplify drivers (all
  the drm_*_helper.c code).
- framebuffers are created from a opaque driver-specific 32bit id and a
  format description. For GEM drivers these ids name GEM objects, but that
  need not be: The recently merged qemu kms driver does not implement gem
  and has one unique buffer object with id 0.
- as mentioned above there newly is a generic ioctl to create an object
  suitable as a dumb scanout (plus some support to mmap it).
- currently KMS has no generic support for overlays (there are
  driver-specific ioctls in i915 and vmgfx, though). Jesse Barnes has
  posted an RFC to remedy this:
http://www.mail-archive.com/dri-devel@lists.freedesktop.org/msg10415.html
GEM and PRIME
-------------
PRIME is a proof-of-concept implementation from Dave Airlie for sharing
GEM objects between drivers/devices: Buffer sharing is done with a list of
struct page pointers. While being shared, buffers can't be moved anymore.
No further buffer description is passed along in the kernel, format/layout
mediation is to be handled in userspace.
Blog-post describing the initial design for sharing buffers between an
integrated Intel igd and a discrete ATI gpu:
http://airlied.livejournal.com/71734.html
Other code using the same framework to render on an Intel igd and display
the framebuffer on an usb-connected displayport:
http://git.kernel.org/?p=linux/kernel/git/airlied/drm-testing.git%3Ba=shortl...
GEM/KMS and fbdev
-----------------
There's some minimal support to emulate an fbdev with a gem/kms driver.
Resolution can't be changed and it's unaccelerated. There's been some
muttering once in a while to better integrate this with either a kms
kernel console driver or by routing fbdev resolution changes to kms.
But the main use case is to display a kernel oops, which works. For
everything else there's X (or an EGL client that understands kms).
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48