On 01/10/18 18:31, Daniel Mack wrote:
On 30/9/2018 11:10 PM, Chris Packham wrote:
With this in mind, I don't see why this
- st = readl_relaxed(nfc->regs + NDSR);
- if (st & (NDSR_RDY(0) | NDSR_RDY(1)))
complete(&nfc->complete);
Yeah, me neither. Chris, are you absolutely sure this is the reason? I'm asking because it took me several tries sometimes to trigger the bug, so is there a chance that you see an error at all times, regardless of whether my patch is applied?
It seems pretty consistent. Without this patch there seems to be no problem. With this patch it triggers pretty much straight away. I can't discount that there might be something wrong with my dts (the R/B configuration was missing initially).
I've also been able to run this on the DB-88F6820-AMC board with the same result (the dts for this is in the for-next branch of git://git.infradead.org/linux-mvebu.git).
The really odd thing is the following seems to avoid the problem
st = readl_relaxed(nfc->regs + NDSR);
udelay(1000);
if (st & (NDSR_RDY(0) | NDSR_RDY(1)))
complete(&nfc->complete);
Which is weird because the st value has already been read so the udelay should have no effect.
Erm, yes. That's totally weird. Which gcc are you using for this?
arm-softfloat-linux-gnueabi-gcc (crosstool-NG crosstool-ng-1.22.0) 4.9.3
Could you please try and use readl() here instead of readl_relaxed()? That will place a memory barrier after the read to enforce ordering.
I'd previously tried readl() based on the same hunch. No change.
I think my snippet above might be misleading. While a delay between readl_relaxed() and the if should not change the outcome, this is also a delay between marvell_nfc_enable_int() and marvell_nfc_disable_int() which is probably more significant. Sure enough if I move the delay to just before the marvell_nfc_disable_int() the error is not seen.
But if this is a problem, many other parts of that driver should be equally affected.
On 28/09/18 19:43, Daniel Mack wrote:
Also, does my .EALREADY approach (v1) make any difference?
The v1 of this patch doesn't show the problem.
That's also very strange because the condition it triggers on is exactly the same.
One difference is that by calling complete() interrupts will be disabled in the spinlock.
Thanks, Daniel