On Thu, Jan 10, 2019 at 11:36:41AM +0000, Carsten Haitzler wrote:
On 09/01/2019 19:07, Ard Biesheuvel wrote:
I can confirm that this change fixes all the issues I observed on AMD Seattle with HD5450 and HD7450 cards which use the Radeon driver (not the amdgpu one).
Hooray. Another happy user. :) I suspect Bero will report success too. At least this is at worst the "tip of the iceberg" of the problem and that patch fixes it with a sledgehammer. At best it's the exact right fix. :) that's another topic. I see another mail with some patch so I'll continue there.
Thanks.
So I will attempt to dig into this a bit further myself, and hopefully find something that carries over to amdgpu as well, so I may ask you to test something if I do.
It may not be perfect, but it is better than it was and other MIPS/PPC and even x86 32-bit systems already need this kind of fix. In the same way it seems ARM needs it too and no one to date has bothered upstream. I'd rather things improve for at least some set of people than they do not improve at all for an undefined amount of time. Note that working is an improvement over "fast but doesn't work" in my book. :) Don't get me wrong. Looking for a better fix in the meantime, if one could exist, is a positive thing. It's not something I can get stuck into as above.
I'd just like to see if we can fix this properly before we upstream a hack.
If we find a significantly better fix in short order - sure. If this is going to drag out into weeks and weeks of back and forth, I think we should consider getting a fix out until something better can be found. Just keep in mind, for every day no fix is available someone somewhere is yelling at some system that doesn't work and they don't know why. They may not know C or how to even compile things... but they are unhappy. :)
This patch perpetuates the unfounded accusation that the Arm architecture is fundamentally incompatible with write-combining and PCI. If we don't bother to diagnose the reported failures correctly, removing hacks such as this when we are forced to understand the problem properly tends to be considerably more effort in my experience, particularly if the same hack has been adopted by other drivers or subsystems.
So I don't think this patch is anything more than a short-term hack, which isn't something we should commit to maintaining upstream. I'm over the moon that it allows you to use your workstation effectively, but please let's try to root-cause this (as Ard is doing) before we rush something in that we're unable to reason about.
I know you're not a fan of rebooting, but I'd appreciate it if you could please help with testing (or throw me an AMD card for a few days so I can do it myself).
Thanks,
Will