On 11/28/25 11:10, Philipp Stanner wrote:
On Fri, 2025-11-28 at 11:06 +0100, Christian König wrote:
On 11/27/25 12:10, Philipp Stanner wrote:
On Thu, 2025-11-13 at 15:51 +0100, Christian König wrote:
This should allow amdkfd_fences to outlive the amdgpu module.
v2: implement Felix's suggestion to lock the fence while signaling it.
Signed-off-by: Christian König <christian.koenig@amd.com>
[…]
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index a085faac9fe1..8fac70b839ed 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1173,7 +1173,7 @@ static void kfd_process_wq_release(struct work_struct *work)
 	synchronize_rcu();
 	ef = rcu_access_pointer(p->ef);
 	if (ef)
-		dma_fence_signal(ef);
+		amdkfd_fence_signal(ef);
 
 	kfd_process_remove_sysfs(p);
 	kfd_debugfs_remove_process(p);
@@ -1990,7 +1990,6 @@ kfd_process_gpuid_from_node(struct kfd_process *p, struct kfd_node *node,
 static int signal_eviction_fence(struct kfd_process *p)
 {
 	struct dma_fence *ef;
-	int ret;
 
 	rcu_read_lock();
 	ef = dma_fence_get_rcu_safe(&p->ef);
@@ -1998,10 +1997,10 @@ static int signal_eviction_fence(struct kfd_process *p)
 	if (!ef)
 		return -EINVAL;
 
-	ret = dma_fence_signal(ef);
+	amdkfd_fence_signal(ef);
 	dma_fence_put(ef);
 
-	return ret;
+	return 0;
Oh wait, that's the code I'm also touching in my return code series!
https://lore.kernel.org/dri-devel/cef83fed-5994-4c77-962c-9c7aac9f7306@amd.c...
Does this series then solve the problem Felix pointed out in evict_process_worker()?
No, it doesn't. I wasn't aware that the higher-level code actually needs the status. After all, Felix is the maintainer of this part.
This patch here needs to be rebased on top of yours and changed accordingly to still return the fence status correctly.
But thanks for pointing that out.
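For illustration only, here is a rough user-space sketch of what "still return the fence status" could look like once the signaling helper itself no longer returns anything. Everything in it is an assumption: struct mock_fence and the mock_* helpers are hypothetical stand-ins, not kernel APIs, and the status convention (0 if this call signaled the fence, -EINVAL if there was no fence or it was already signaled) simply mirrors what dma_fence_signal() has historically returned. It is not the rebased patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dma_fence; NOT the kernel type. */
struct mock_fence {
	bool signaled;
	int refcount;
};

/* Hypothetical void signaling helper: it reports no status to the caller. */
static void mock_fence_signal(struct mock_fence *f)
{
	f->signaled = true;
}

/* Hypothetical reference drop, loosely modelled on dma_fence_put(). */
static void mock_fence_put(struct mock_fence *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/*
 * Sketch of a signal_eviction_fence()-like function that keeps reporting
 * a status: the signaled state is sampled before signaling.
 *
 * Assumption: in real kernel code the check and the signal would have to
 * happen under the fence lock to be race-free; this single-threaded mock
 * does not model locking or RCU at all.
 */
static int mock_signal_eviction_fence(struct mock_fence *ef)
{
	int ret;

	if (!ef)
		return -EINVAL;

	ret = ef->signaled ? -EINVAL : 0;
	mock_fence_signal(ef);
	mock_fence_put(ef);

	return ret;
}
```

With this shape a caller can still tell a first signal (0) apart from an already-signaled or missing fence (-EINVAL), which is the information the current patch drops by always returning 0.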
Alright, so my (repaired, v2) status-code-removal series shall enter drm-misc-next first, and then your series here. ACK?
Works for me. I just need both merged so that I can rebase the amdgpu patches on top.
Christian.
P.