On Fri, 2018-11-09 at 11:34 +0000, Marc Zyngier wrote:
On 08/11/18 19:49, Trent Piepho wrote:
On Thu, 2018-11-08 at 09:49 +0000, Marc Zyngier wrote:
Then that lasted four years until it was changed Aug 2017 in https://patchwork.kernel.org/patch/9893303/
That lasted just six months until someone tried to revert it, https://patchwork.kernel.org/patch/9893303/
Should be https://patchwork.kernel.org/patch/10208879/
Seems pretty clear the way it is now is much worse than the way it was before, even if the previous design may have had another flaw. Though I've yet to see anyone point out something that makes the previous design broken. Sub-optimal yes, but not broken.
This is not what was reported by the previous submitter. I guess they changed this for a reason, no? I'm prepared to admit this is an end-point driver bug, but we need to understand what is happening and stop changing this driver randomly.
See Vignesh's recent message about the last change. It was a mistaken attempt to fix a problem, which it didn't fix, and I think we all agree it's not right.
Reverting it is not "changing this driver randomly." And I take that as a personal offense. You imply I just applied patches randomly until something appeared to work. Maybe you think this is all over my head? Far from it. I traced every part of the interrupt path and thought through every race. Hamstrung by lack of docs, I still determined the behavior that was relevant through empirical analysis. I only discovered this was a recent regression and Vignesh's earlier attempt to revert it after I was done and was trying to determine how the code got this way in the first place.
It feels like you're using this bug to hold designware hostage in a broken kernel, and me along with them. I don't have the documentation, no one does, there's no way for me to give you what you want. But I've got hardware that doesn't work in the mainline kernel.
I take it as a personal offence that I'd be holding anything or anyone hostage. I think I have a long enough track record working with the Linux kernel not to take any of this nonsense. What's my interest in keeping anything in this sorry state? Think about it for a minute.
I'm sorry if you took it that way. I appreciate that there are still people who care about fixing things right and don't settle for whatever the easiest thing is that lets them say they're done, even if that just leaves time bombs for everyone who comes after.
So I thank you for taking a stand.
But I think it's clear that 8c934095fa2f was a mistake that causes serious bugs. That's not a random guess; it's well considered and well tested. Not reverting it now isn't helping anyone using stable kernels with this regression.