New subject: [PATCH 01/13] dma-buf/dma-fence: Add dma_fence_init_noref()

12 Mar 2023


      From: Rob Clark robdclark@chromium.org
Inspired by https://lore.kernel.org/dri-devel/20200604081224.863494-10-daniel.vetter@ffw...
it seemed like a good idea to get rid of memory allocation in job_run()
and use lockdep annotations to yell at us about anything that could
deadlock against shrinker/reclaim.  Anything that can trigger reclaim,
or block on any other thread that has triggered reclaim, can block the
GPU shrinker from releasing memory if it is waiting the job to complete,
causing deadlock.
The first two patches avoid memory allocation for the hw_fence by
embedding it in the already allocated submit object.  The next three
decouple various allocations that were done in the hw_init path, but
only the first time, to let lockdep see that they won't happen in the
job_run() path.  (The hw_init() path re-initializes the GPU after runpm
resume, etc, which can happen in the job_run() path.)
The remaining patches clean up locking issues in various corners of PM
and interconnect which happen in the runpm path.  These fixes can be
picked up independently by the various maintainers.  In all cases I've
added lockdep annotations to help keep the runpm resume path deadlock-
free vs reclaim, but I've broken those out into their own patches.. it
is possible that these might find issues in other code-paths not hit on
the hw I have.  (It is a bit tricky because of locks held across call-
backs, such as devfreq->lock held across devfreq_dev_profile callbacks.
I've audited these and other callbacks in icc, etc, to look for problems
and fixed one I found in smd-rpm.  But that took me through a number of
drivers and subsystems that I am not familiar with so it is entirely
possible that I overlooked some problematic allocations.)
There is one remaining issue to resolve before we can enable the job_run
annotations, but it is entirely self contained in drm/msm/gem.  So it
should not block review of these patches.  So I figured it best to send
out what I have so far.
Rob Clark (13):
  dma-buf/dma-fence: Add dma_fence_init_noref()
  drm/msm: Embed the hw_fence in msm_gem_submit
  drm/msm/gpu: Move fw loading out of hw_init() path
  drm/msm/gpu: Move BO allocation out of hw_init
  drm/msm/a6xx: Move ioremap out of hw_init path
  PM / devfreq: Drop unneed locking to appease lockdep
  PM / devfreq: Teach lockdep about locking order
  PM / QoS: Fix constraints alloc vs reclaim locking
  PM / QoS: Decouple request alloc from dev_pm_qos_mtx
  PM / QoS: Teach lockdep about dev_pm_qos_mtx locking order
  soc: qcom: smd-rpm: Use GFP_ATOMIC in write path
  interconnect: Fix locking for runpm vs reclaim
  interconnect: Teach lockdep about icc_bw_lock order
drivers/base/power/qos.c                   | 83 ++++++++++++++++------
 drivers/devfreq/devfreq.c                  | 52 +++++++-------
 drivers/dma-buf/dma-fence.c                | 43 ++++++++---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c      | 48 ++++++-------
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c      | 18 +++--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c      | 46 ++++++------
 drivers/gpu/drm/msm/adreno/adreno_device.c |  6 ++
 drivers/gpu/drm/msm/adreno/adreno_gpu.c    |  9 +--
 drivers/gpu/drm/msm/msm_fence.c            | 43 +++++------
 drivers/gpu/drm/msm/msm_fence.h            |  2 +-
 drivers/gpu/drm/msm/msm_gem.h              | 10 +--
 drivers/gpu/drm/msm/msm_gem_submit.c       |  8 +--
 drivers/gpu/drm/msm/msm_gpu.c              |  4 +-
 drivers/gpu/drm/msm/msm_gpu.h              |  6 ++
 drivers/gpu/drm/msm/msm_ringbuffer.c       |  4 +-
 drivers/interconnect/core.c                | 18 ++++-
 drivers/soc/qcom/smd-rpm.c                 |  2 +-
 include/linux/dma-fence.h                  |  2 +
 18 files changed, 237 insertions(+), 167 deletions(-)
-- 
2.39.2

[PATCH 00/13] drm/msm+PM+icc: Make job_run() reclaim-safe