On Tue, Jul 6, 2021 at 2:46 PM Oded Gabbay <oded.gabbay@gmail.com> wrote:
On Tue, Jul 6, 2021 at 3:23 PM Daniel Vetter <daniel@ffwll.ch> wrote:
On Tue, Jul 06, 2021 at 02:21:10PM +0200, Christoph Hellwig wrote:
On Tue, Jul 06, 2021 at 10:40:37AM +0200, Daniel Vetter wrote:
Greg, I hope this will be good enough for you to merge this code.
So we're officially going to use dri-devel for technical details review and then Greg for merging so we don't have to deal with other merge criteria dri-devel folks have?
I don't expect anything less by now, but it does make the original claim that drivers/misc will not step all over accelerators folks a complete farce under the totally-not-a-gpu banner.
This essentially means that for any other accelerator stack that doesn't fit the dri-devel merge criteria, even if it's acting like a gpu and uses other gpu driver stuff, you can just send it to Greg and it's good to go.
There's quite a lot of these floating around actually (and many do have semi-open runtimes, like habanalabs have now too, just not open enough to be actually useful). It's going to be absolutely lovely having to explain to these companies in background chats why habanalabs gets away with their stack and they don't.
FYI, I fully agree with Daniel here. Habanalabs needs to open up their runtime if they want to push any additional feature in the kernel. The current situation is not sustainable.
Well, that's like, your opinion...
Before anyone replies: The runtime is open, the compiler is still closed. This has become the new default for accel driver submissions, I think mostly because all the interesting bits for non-3d accelerators are in the accel ISA, and no longer in the runtime. So vendors are fairly happy to throw in the runtime as a freebie.
It's still incomplete, and it's still useless if you want to actually hack on the driver stack.
-Daniel
I don't understand what's not sustainable here.
There is zero code inside the driver that communicates or interacts with our TPC code (TPC is the Tensor Processing Core). Even submitting work to the TPC is done via a generic queue interface. And that queue IP is common between all our engines (TPC/DMA/NIC). The driver provides all the specs of that queue IP, because the driver's code is handling that queue. But why is the TPC compiler code even relevant here?
Can I use the hw how it's intended to be used without it?
If the answer is no, then essentially what you're doing with your upstream driver is getting all the benefits of an upstream driver, while upstream gets nothing. We can't use your stack, not as-is. Sure we can use the queue, but we can't actually submit anything interesting. And I'm pretty sure the point of your hw is to do more than submit no-op packets to a queue.
This is all an "I want to have my cake and eat it too" approach to upstreaming, and it's a totally fine attitude to have, but if you don't see why there's maybe a different side to it then I don't get what you're arguing. Upstream isn't a free lunch for nothing.
Frankly I'm starting to assume you're arguing this all in bad faith just because habanalabs doesn't want to actually have an open driver stack, so any attack is good, no matter what. Which is also what everyone else does who submits their accel driver to upstream, and which gets us back to the starting point of this sub-thread of me really appreciating how this will improve background discussions going forward for everyone.
Like if the requirement for accel drivers truly is that you can submit a dummy command to the queues then I have about 5-10 drivers at least I could merge instantly. For something like the intel gpu driver it would be about 50 lines of code (including all the structure boilerplate the ioctls require) in userspace to submit a dummy queue command. GPU and accel vendors would really love that, because it would allow them to freeload on upstream and do essentially nothing in return.
And we'd end up with an unmaintainable disaster of a gpu or well accelerator subsystem because there's nothing you can change or improve because all the really useful bits of the stack are closed. And ofc that's not any company's problem anymore, so ofc you with the habanalabs hat on don't care and call this *extreme*.
btw, you can today see our TPC code at https://github.com/HabanaAI/Habana_Custom_Kernel There is a link there to the TPC user guide and link to download the LLVM compiler.
I got stuck clicking links before I found the source for that llvm compiler. Can you give me a direct link to the repo with source code instead please?
Thanks, Daniel
linaro-mm-sig@lists.linaro.org