On 09/01/2019 14:48, Ard Biesheuvel wrote:
On Wed, 9 Jan 2019 at 15:38, Carsten Haitzler Carsten.Haitzler@arm.com wrote:
On 09/01/2019 14:36, Ard Biesheuvel wrote:
On Wed, 9 Jan 2019 at 15:34, Carsten Haitzler Carsten.Haitzler@arm.com wrote:
On 09/01/2019 14:33, Bero Rosenkränzer wrote:
On Wed, 9 Jan 2019 at 14:55, Carsten Haitzler Carsten.Haitzler@arm.com wrote:
My understanding is that ARM can't do "WC" in a guaranteed way like x86, so turning it off is the right thing to do anyway,
My understanding too.
FWIW I've added the fix to the OpenMandriva distro kernel https://github.com/OpenMandrivaSoftware/linux/commit/657041c5665c681d4519cf8... Let's see if any user starts screaming ;)
ttyl bero
Let's see. I have put a patch into the internal kernel patch review before I send it off to dri-devel. It's exactly your patch there, just with a commit log explaining why.
So what exactly is it about x86 style wc that ARM cannot do?
From Pavel Shamis here at ARM:
"Short version.
x86 has well-defined behavior for WC memory: it combines multiple consecutive stores (which have to be aligned to the cache line) into 64B cache-line writes over PCIe.
On Arm, WC corresponds to Normal NC. Arm uarchs do not do combining to cache-line size. On some uarchs we do 16B combining, but not a full cache line.
The first uarch that will be doing cache line size combining is Aries.
It is important to note that WC is an opportunistic optimization and the software/hardware should not make an assumption that it always “combines” (true for x86 and arm)"
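To make the 64B versus 16B distinction above concrete, here is a minimal sketch in C. It is a toy model, not a description of what any particular CPU does: the function name `full_chunks_covered` is made up for illustration, and it only counts how many naturally aligned chunks a run of consecutive stores covers completely. As Pavel notes, real hardware combines opportunistically, so this is an upper bound at best.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy model (assumption, not hardware behaviour): count how many naturally
 * aligned chunks of `chunk` bytes are fully covered by a run of consecutive
 * stores spanning [addr, addr + len). Only a fully covered, aligned chunk
 * could be emitted as a single combined write of that size.
 * `chunk` must be a power of two (e.g. 64 for a cache line, 16 for the
 * smaller combining granule mentioned above).
 */
size_t full_chunks_covered(uint64_t addr, size_t len, size_t chunk)
{
    assert(chunk != 0 && (chunk & (chunk - 1)) == 0);

    uint64_t mask  = ~((uint64_t)chunk - 1);
    uint64_t first = (addr + chunk - 1) & mask; /* round start up   */
    uint64_t last  = (addr + len) & mask;       /* round end down   */

    return last > first ? (size_t)((last - first) / chunk) : 0;
}
```

Under this model, a 64-byte run starting at offset 8 covers no complete 64B line but does cover three complete 16B chunks, which is why alignment to the cache line matters for the x86-style case.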
OK, so that only means that ARM WC mappings may behave more like x86 uncached mappings than x86 WC mappings. It does not explain why things break if we use them.
The problem with using uncached mappings here is that it breaks use cases that expect memory semantics, such as unaligned accesses or DC ZVA instructions. At least VDPAU on nouveau breaks due to this, and likely many other use cases as well.
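As a rough illustration of the unaligned-access concern: assuming the uncached mapping ends up with a Device memory type on arm64, an access whose address is not a multiple of its size takes an alignment fault, whereas the same access to Normal memory is allowed. The helper names below are hypothetical and this is only a userspace sketch of the rule, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true when an access of `size` bytes at `addr` is naturally
 * aligned. `size` is expected to be a power of two (1, 2, 4, 8, 16).
 */
bool is_naturally_aligned(uintptr_t addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (size - 1)) == 0;
}

/*
 * Assumption for illustration: the mapping is Device-type, where any
 * access that is not naturally aligned generates an alignment fault.
 */
bool would_fault_on_device_memory(uintptr_t addr, size_t size)
{
    return !is_naturally_aligned(addr, size);
}
```

Code such as an optimized memcpy() may issue accesses like a 4-byte load at an address ending in 2, which this check flags; that kind of access is fine on a Normal NC mapping but not on a Device one.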
For amdgpu, though, it works, and this is an AMD/Radeon-only code path. At least it works on the only ARM system I have an AMD GPU plugged into. You need the same fix for SynQuacer. Getting a fix like this upstream will alleviate a reasonable amount of pain for end users, even if it is not perfect.
I do not plan on going any further with this patch because it's for my tx2, which is my ONLY workstation at work, and it takes about 10 minutes per reboot cycle. I have many things to do, and getting my gfx card to a working state was the primary focus. Spending days just rebooting to try things with something I am not familiar with (the ttm mappings) is not something I have time for. Looking at the history of other bugs that affect WC/UC mappings in radeon/amdgpu shows that this is precisely the kind of fix that has been done multiple times in the past for x86 and obviously some MIPS and PPC systems. There's mountains of precedent that this is a quick and simple fix that has been implemented many times in the past, so from that point of view I think it's a decent fix in and of itself when it comes to time vs. reward.
It may not be perfect, but it is better than it was, and other MIPS/PPC and even x86 32-bit systems already need this kind of fix. In the same way, it seems ARM needs it too, and no one to date has bothered to upstream it. I'd rather things improve for at least some set of people than not improve at all for an undefined amount of time. Note that "working" is an improvement over "fast but doesn't work" in my book. :) Don't get me wrong: looking for a better fix in the meantime, if one could exist, is a positive thing. It's just not something I can get stuck into, as explained above.
So *please*, do not send that patch to dri-devel. Let's instead fix the root cause, which may be related to the thing pointed out by Will, i.e., that ttm_set_pages_uc() is not implemented correctly.