On 7/21/20 10:55 AM, Christian König wrote:
On 21.07.20 at 10:47, Thomas Hellström (Intel) wrote:
On 7/21/20 9:45 AM, Christian König wrote:
On 21.07.20 at 09:41, Daniel Vetter wrote:
On Mon, Jul 20, 2020 at 01:15:17PM +0200, Thomas Hellström (Intel) wrote:
Hi,
On 7/9/20 2:33 PM, Daniel Vetter wrote:
Comes up every few years, gets somewhat tedious to discuss, let's write this down once and for all.
What I'm not sure about is whether the text should be more explicit in flat out mandating the amdkfd eviction fences for long running compute workloads or workloads where userspace fencing is allowed.
Although (in my humble opinion) it might be possible to completely untangle the kernel-introduced fences used for resource management from the dma-fences used for completion and dependency tracking, and so lift a lot of restrictions on the dma-fences, including the prohibition of infinite ones, I think it makes sense to describe the current state.
Yeah I think a future patch needs to type up how we want to make that happen (for some cross driver consistency) and what needs to be considered. Some of the necessary parts are already there (the preemption fences amdkfd has, for example), but I think some clear docs on what's required from hw, drivers and userspace would be really good.
I'm currently writing that up, but probably still need a few days for this.
Great! I put down some (very) initial thoughts a couple of weeks ago building on eviction fences for various hardware complexity levels here:
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.fre...
I don't think that this will ever be possible.
See, what Daniel describes in his text is that indefinite fences are a bad idea for memory management, and I think that this is a fixed fact.
In other words the whole concept of submitting work to the kernel which depends on some user space interaction doesn't work and never will.
Well the idea here is that memory management will *never* depend on indefinite fences: as soon as someone waits on a memory manager fence (be it eviction, shrinker or mmu notifier) it breaks out of any dma-fence dependencies and/or user-space interaction. The text tries to describe what's required to be able to do that (save for non-preemptible gpus where someone submits a forever-running shader).
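As a rough illustration of that idea, here is a toy model in plain C. It is not kernel code and does not use the dma-fence API: every name in it (toy_ctx, toy_preempt, toy_wait_eviction_fence) is hypothetical, and it simply assumes the hardware can preempt a context in bounded time. The point it tries to show is only that the memory-manager wait path never looks at the user-space fence.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; all names are hypothetical, none of this is a kernel API. */
struct toy_ctx {
	bool user_fence_signaled; /* set by user space, possibly never */
	bool running;             /* context is currently scheduled on the gpu */
	bool evicted;             /* memory manager has reclaimed its buffers */
};

/* Assumption: the hardware can preempt a context in bounded time. */
static void toy_preempt(struct toy_ctx *ctx)
{
	ctx->running = false;
}

/*
 * Memory-manager side wait (eviction, shrinker, mmu notifier).
 * It forces the context off the hardware instead of waiting for the
 * workload, so it never reads user_fence_signaled.
 */
static bool toy_wait_eviction_fence(struct toy_ctx *ctx)
{
	assert(ctx);

	if (ctx->running)
		toy_preempt(ctx);

	ctx->evicted = true;
	return true; /* the eviction fence counts as signaled */
}
```

Calling toy_wait_eviction_fence() on a running context whose user fence is still unsignaled returns immediately with the context preempted and evicted. On a non-preemptible gpu the bounded-time assumption behind toy_preempt() would not hold, which is the exception mentioned above.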
So while I think this is possible (until someone comes up with a case where it wouldn't work of course), I guess Daniel has a point in that it won't happen because of inertia and there might be better options.
/Thomas