On Mon, Feb 28, 2022 at 06:24:53PM +0000, Deucher, Alexander wrote:
[Public]
-----Original Message-----
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sent: Monday, February 28, 2022 12:23 PM
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>; stable@vger.kernel.org; Paul Menzel <pmenzel@molgen.mpg.de>; Koenig, Christian <Christian.Koenig@amd.com>; Yu, Qiang <Qiang.Yu@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>
Subject: [PATCH 5.16 022/164] drm/amdgpu: check vm ready by amdgpu_vm->evicting flag

From: Qiang Yu <qiang.yu@amd.com>
commit c1a66c3bc425ff93774fb2f6eefa67b83170dd7e upstream.
Workstation application ANSA/META v21.1.4 gets this error in dmesg when running the CI test suite provided by ANSA/META:
[drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16)
This is caused by:
1. Create a 256MB buffer in invisible VRAM.
2. The CPU maps the buffer and accesses it, which causes a vm_fault and an attempt to move it to visible VRAM.
3. Force visible VRAM space and traverse all VRAM BOs to check whether evicting each BO is valuable.
4. When checking a VM BO (in invisible VRAM), amdgpu_vm_evictable() sets amdgpu_vm->evicting, but later, because the BO is not in visible VRAM, it is not really evicted and so is not added to amdgpu_vm->evicted.
5. Before the next CS clears amdgpu_vm->evicting, a user VM ops ioctl passes amdgpu_vm_ready() (which checks amdgpu_vm->evicted) but fails in amdgpu_vm_bo_update_mapping() (which checks amdgpu_vm->evicting) and produces this error log.
This error won't affect functionality, as the next CS will finish the waiting VM ops. But we'd better get rid of the error log by checking the amdgpu_vm->evicting flag in amdgpu_vm_ready(), so that amdgpu_vm_bo_update_mapping() is not called later.
Another reason is that the amdgpu_vm->evicted list holds all BOs (both user buffers and page tables), but only the eviction of page table BOs prevents VM ops. The amdgpu_vm->evicting flag is set only for page table BOs, so we should use the evicting flag instead of the evicted list in amdgpu_vm_ready().
The side effect of this change is: previously blocked VM op (user buffer in "evicted" list but no page table in it) gets done immediately.
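For readers following the thread, the two readiness checks described above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the driver code: struct model_vm, vm_ready_old(), vm_ready_new(), vm_update_mapping() and vm_op() are hypothetical stand-ins, locking is left out, and the "evicted" list is reduced to a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct amdgpu_vm. Only the two
 * pieces of state discussed in the commit message are modelled. */
struct model_vm {
	bool evicting;   /* set when a page table BO is considered for eviction */
	int nr_evicted;  /* number of BOs on the "evicted" list (user or page table) */
};

/* Old behaviour as described: ready means the evicted list is empty. */
static bool vm_ready_old(const struct model_vm *vm)
{
	return vm->nr_evicted == 0;
}

/* New behaviour as described: ready means the evicting flag is clear. */
static bool vm_ready_new(const struct model_vm *vm)
{
	return !vm->evicting;
}

/* Stand-in for the mapping update, which refuses to run while evicting. */
static int vm_update_mapping(const struct model_vm *vm)
{
	return vm->evicting ? -EBUSY : 0;
}

/* Stand-in for the VM ops ioctl path: skip the update when the VM is not
 * ready (the next CS would finish it), otherwise attempt the update. */
static int vm_op(const struct model_vm *vm,
		 bool (*ready)(const struct model_vm *))
{
	if (!ready(vm))
		return 0;
	return vm_update_mapping(vm);
}

/* Walks through the two situations from the commit message. */
void model_selftest(void)
{
	/* Steps 4-5: evicting is set, but nothing was added to "evicted". */
	struct model_vm flagged = { .evicting = true, .nr_evicted = 0 };
	assert(vm_op(&flagged, vm_ready_old) == -EBUSY); /* logged as an error */
	assert(vm_op(&flagged, vm_ready_new) == 0);      /* update is skipped */

	/* Side effect: a user buffer is on "evicted", no page table involved. */
	struct model_vm user_bo = { .evicting = false, .nr_evicted = 1 };
	assert(!vm_ready_old(&user_bo)); /* previously blocked */
	assert(vm_ready_new(&user_bo));  /* now proceeds immediately */
	assert(vm_op(&user_bo, vm_ready_new) == 0);
}
```

In this model the -EBUSY return corresponds to the (-16) in the error log on Linux, and the second case corresponds to the side effect noted in the commit message: a VM op that the old check would have held back now runs immediately.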
v2: update commit comments.
Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Qiang Yu <qiang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A regression was reported against this patch in 5.17. Please drop for now.
Dropped from 5.10.y, 5.15.y, and 5.16.y. Please feel free to send it to stable@vger.kernel.org once it is working correctly.
thanks,
greg k-h