From: Rob Clark <robdclark@chromium.org>
Inspired by https://lore.kernel.org/dri-devel/20200604081224.863494-10-daniel.vetter@ffw... it seemed like a good idea to get rid of memory allocation in job_run() and use lockdep annotations to yell at us about anything that could deadlock against shrinker/reclaim. Anything that can trigger reclaim, or block on any other thread that has triggered reclaim, can block the GPU shrinker from releasing memory if it is waiting for the job to complete, causing deadlock.
The first two patches avoid memory allocation for the hw_fence by embedding it in the already allocated submit object. The next three decouple various allocations that were done in the hw_init path, but only the first time, to let lockdep see that they won't happen in the job_run() path. (The hw_init() path re-initializes the GPU after runpm resume, etc, which can happen in the job_run() path.)
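The embedding idea described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: toy_fence, toy_submit, toy_job_run() and the other toy_* names are hypothetical stand-ins, not the kernel's struct dma_fence, msm_gem_submit, or any real msm API. The point it shows is that once the fence is a member of the already-allocated submit, the run path only initializes existing memory and never calls an allocator.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a fence; not the kernel's struct dma_fence. */
struct toy_fence {
    unsigned int seqno;
    int signaled;
};

/* Hypothetical stand-in for a submit object: the fence lives inside it. */
struct toy_submit {
    int ring_id;
    struct toy_fence hw_fence;   /* embedded, so no separate allocation */
};

/* Recover the enclosing submit from a pointer to its embedded fence. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The only allocation happens here, at submit creation time. */
struct toy_submit *toy_submit_create(int ring_id)
{
    struct toy_submit *submit = calloc(1, sizeof(*submit));

    if (!submit)
        return NULL;
    submit->ring_id = ring_id;
    return submit;
}

/* The run path touches only memory that already exists: no allocator call. */
struct toy_fence *toy_job_run(struct toy_submit *submit, unsigned int seqno)
{
    submit->hw_fence.seqno = seqno;
    submit->hw_fence.signaled = 0;
    return &submit->hw_fence;
}

/* Map a fence back to the submit that contains it. */
struct toy_submit *toy_fence_to_submit(struct toy_fence *fence)
{
    return toy_container_of(fence, struct toy_submit, hw_fence);
}
```

One consequence of this layout is that the fence and the submit share a lifetime, so whatever reference counting protects the fence also has to keep the enclosing object alive. The sketch does not model that part.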
The remaining patches clean up locking issues in various corners of PM and interconnect which happen in the runpm path. These fixes can be picked up independently by the various maintainers. In all cases I've added lockdep annotations to help keep the runpm resume path deadlock-free vs reclaim, but I've broken those out into their own patches. It is possible that these might find issues in other code-paths not hit on the hw I have. (It is a bit tricky because of locks held across callbacks, such as devfreq->lock held across devfreq_dev_profile callbacks. I've audited these and other callbacks in icc, etc, to look for problems and fixed one I found in smd-rpm. But that took me through a number of drivers and subsystems that I am not familiar with, so it is entirely possible that I overlooked some problematic allocations.)
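To illustrate why it helps to teach a lock-order checker the "reclaim, then driver lock" order up front, here is a toy userspace model. It is not lockdep and uses no kernel API; every toy_* name is hypothetical. (In the kernel, annotations of this kind are built from helpers such as fs_reclaim_acquire() and might_lock(), which the sketch does not attempt to reproduce.) The model records which lock class was taken while another was held and counts a violation when the opposite order shows up later.

```c
#include <stdbool.h>

/* Two hypothetical lock classes: reclaim context and some driver lock. */
enum toy_class { TOY_RECLAIM, TOY_DRIVER_LOCK, TOY_NCLASS };

struct toy_lockdep {
    bool after[TOY_NCLASS][TOY_NCLASS]; /* after[a][b]: b taken while a held */
    bool held[TOY_NCLASS];
    int violations;
};

/* Acquire class c, recording order edges and flagging inversions. */
void toy_acquire(struct toy_lockdep *ld, enum toy_class c)
{
    for (int h = 0; h < TOY_NCLASS; h++) {
        if (!ld->held[h] || h == (int)c)
            continue;
        if (ld->after[c][h])        /* the opposite order was seen before */
            ld->violations++;
        ld->after[h][c] = true;
    }
    ld->held[c] = true;
}

void toy_release(struct toy_lockdep *ld, enum toy_class c)
{
    ld->held[c] = false;
}

/* Prime the checker once: the driver lock may be taken from reclaim. */
void toy_prime_reclaim_order(struct toy_lockdep *ld)
{
    toy_acquire(ld, TOY_RECLAIM);
    toy_acquire(ld, TOY_DRIVER_LOCK);
    toy_release(ld, TOY_DRIVER_LOCK);
    toy_release(ld, TOY_RECLAIM);
}

/* Models a blocking allocation made while holding the driver lock. */
void toy_alloc_under_driver_lock(struct toy_lockdep *ld)
{
    toy_acquire(ld, TOY_DRIVER_LOCK);
    toy_acquire(ld, TOY_RECLAIM);   /* the allocation may enter reclaim */
    toy_release(ld, TOY_RECLAIM);
    toy_release(ld, TOY_DRIVER_LOCK);
}
```

Without priming, the checker only learns the problematic order if the reclaim path is actually exercised during testing. With priming, the first allocation under the driver lock is flagged immediately, even on hardware where reclaim never happens to hit that path.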
There is one remaining issue to resolve before we can enable the job_run annotations, but it is entirely self-contained in drm/msm/gem, so it should not block review of these patches. I figured it best to send out what I have so far.
Rob Clark (13):
  dma-buf/dma-fence: Add dma_fence_init_noref()
  drm/msm: Embed the hw_fence in msm_gem_submit
  drm/msm/gpu: Move fw loading out of hw_init() path
  drm/msm/gpu: Move BO allocation out of hw_init
  drm/msm/a6xx: Move ioremap out of hw_init path
  PM / devfreq: Drop unneed locking to appease lockdep
  PM / devfreq: Teach lockdep about locking order
  PM / QoS: Fix constraints alloc vs reclaim locking
  PM / QoS: Decouple request alloc from dev_pm_qos_mtx
  PM / QoS: Teach lockdep about dev_pm_qos_mtx locking order
  soc: qcom: smd-rpm: Use GFP_ATOMIC in write path
  interconnect: Fix locking for runpm vs reclaim
  interconnect: Teach lockdep about icc_bw_lock order
 drivers/base/power/qos.c                   | 83 ++++++++++++++++------
 drivers/devfreq/devfreq.c                  | 52 +++++++-------
 drivers/dma-buf/dma-fence.c                | 43 ++++++++---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c      | 48 ++++++-------
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c      | 18 +++--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c      | 46 ++++++------
 drivers/gpu/drm/msm/adreno/adreno_device.c |  6 ++
 drivers/gpu/drm/msm/adreno/adreno_gpu.c    |  9 +--
 drivers/gpu/drm/msm/msm_fence.c            | 43 +++++------
 drivers/gpu/drm/msm/msm_fence.h            |  2 +-
 drivers/gpu/drm/msm/msm_gem.h              | 10 +--
 drivers/gpu/drm/msm/msm_gem_submit.c       |  8 +--
 drivers/gpu/drm/msm/msm_gpu.c              |  4 +-
 drivers/gpu/drm/msm/msm_gpu.h              |  6 ++
 drivers/gpu/drm/msm/msm_ringbuffer.c       |  4 +-
 drivers/interconnect/core.c                | 18 ++++-
 drivers/soc/qcom/smd-rpm.c                 |  2 +-
 include/linux/dma-fence.h                  |  2 +
 18 files changed, 237 insertions(+), 167 deletions(-)