On 29 February 2016 at 21:05, Laszlo Ersek lersek@redhat.com wrote:
On 02/29/16 20:39, Ryan Harkin wrote:
Hi Ard/Leif/anyone who cares,
So I was trying to work out who broke MMC support in TC2 in the upstream EDK2 tree. It was difficult because the tree is borked on TC2 in so many interesting ways throughout history, but I eventually bisected down to this patch:
300fc77 2015-08-25 ArmPlatformPkg/PL180MciDxe: check PrimeCell ID before initializing [Ard Biesheuvel]
Basically, TC2 reads 0x02 for MCI_PERIPH_ID_REG3, when, according to the spec, the register is supposed to read 0x00 in all cases. So the driver doesn't probe and is never initialised. I guess this is an FPGA bug in TC2? It's probably known about, but not to me ;-)
Anyway, how to fix it??
We could mask off the "stuck" bit, we could skip the ID_REG3 check altogether, or there are other things we could do.
I decided to mask off the bit rather than discard the register check in my patch below, just to get things working.
But what would you like to do?
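To make the "mask off the stuck bit" option concrete, here is a minimal standalone sketch of the idea. It is not the patch referred to above and not the PL180MciDxe code; the function names and the TC2_STUCK_BIT_MASK macro are made up for illustration, and it assumes only the low 8 bits of the ID register are significant.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value MCI_PERIPH_ID_REG3 is supposed to read, per the discussion above. */
#define EXPECTED_PERIPH_ID3  0x00u

/* Hypothetical name: the bit TC2 reports as set (it reads 0x02, not 0x00). */
#define TC2_STUCK_BIT_MASK   0x02u

/* Strict check: only an exact match is accepted.
 * Assumption: only the low 8 bits of the register carry the ID. */
bool periph_id3_matches_strict(uint32_t reg_value)
{
    return (reg_value & 0xFFu) == EXPECTED_PERIPH_ID3;
}

/* Relaxed check: clear the stuck bit before comparing, so both
 * 0x00 and 0x02 are accepted while other unexpected bits still fail. */
bool periph_id3_matches_masked(uint32_t reg_value)
{
    return (reg_value & 0xFFu & ~TC2_STUCK_BIT_MASK) == EXPECTED_PERIPH_ID3;
}
```

The masked variant keeps the register check in place for every other bit, which is the trade-off against dropping the ID_REG3 check entirely.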
For extra points... this was extra fun to track down due to other problems. TC2 stopped booting when this patch was submitted:
d340ef7 2014-08-26 ArmPkg/ArmArchTimerLib: Remove non required [depex] and IoLib [Olivier Martin]
I've always carried a revert patch in my tree because I was previously told I was wrong and that it wasn't a problem, even though it clearly is. TC2 is spewing out a constant stream of this message:
IRQ Exception PC at 0xBFB74C20 CPSR 0x60000133
It wasn't fixed until Ard's patch that broke MMC support. Ugh!
I'm suspecting that the MMC support has a dependency on IoLib - for that is the part of the patch that broke TC2 in the first place. But I have yet to investigate that problem; I don't even know what IoLib is.
IoLib is a library class that lets you massage IO ports and MMIO registers.
MdePkg/Include/Library/IoLib.h
The patch you quoted does two things: it removes ArmArchTimerLib's build-time dependency on the IoLib class, and it removes the runtime (dispatch) dependency on EFI_CPU_ARCH_PROTOCOL of any module that is linked against ArmArchTimerLib (unless that module has the same dependency due to another library instance it links against, or due to its own explicit [depex] section).
Removing the library class dependency could introduce such a problem only if the actual library instance used for that dependency had a constructor function that is henceforth no longer called, and this function changed something related to interrupts. Very unlikely.
Removing the DXE dispatch dependency on EFI_CPU_ARCH_PROTOCOL is the likely culprit, in my opinion. The driver that provides said architectural protocol probably massages interrupt configuration on the CPU or the GIC in its entry point function, and ArmArchTimerLib silently depends on that setup without explicitly calling any EFI_CPU_ARCH_PROTOCOL member functions. With the depex removed, the DXE core may have reordered another driver (one that links against ArmArchTimerLib) relative to the driver providing EFI_CPU_ARCH_PROTOCOL -- for which reason the timerlib functions may now run without the necessary interrupt setup.
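The reordering described above can be illustrated with a toy model. This is only a sketch of depex-gated dispatch, not the DXE core's actual algorithm; the Driver struct, its fields, and dispatch_all() are hypothetical names invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a DXE driver; not an EDK2 type. */
typedef struct {
    const char *name;
    bool needs_cpu_arch;    /* depex is gEfiCpuArchProtocolGuid rather than TRUE */
    bool provides_cpu_arch; /* entry point installs EFI_CPU_ARCH_PROTOCOL */
    bool dispatched;
} Driver;

/* Dispatch drivers in discovery order, making repeated passes until no
 * further driver can be dispatched. The index of each dispatched driver
 * is written to order[]; the return value is how many were dispatched. */
size_t dispatch_all(Driver *drivers, size_t count, size_t *order)
{
    bool cpu_arch_installed = false;
    bool progress = true;
    size_t done = 0;

    while (progress) {
        progress = false;
        for (size_t i = 0; i < count; i++) {
            Driver *d = &drivers[i];
            if (d->dispatched) {
                continue;
            }
            /* An unsatisfied depex defers the driver to a later pass. */
            if (d->needs_cpu_arch && !cpu_arch_installed) {
                continue;
            }
            d->dispatched = true;
            order[done++] = i;
            if (d->provides_cpu_arch) {
                cpu_arch_installed = true;
            }
            progress = true;
        }
    }
    return done;
}
```

In this model, a timer-using driver that is discovered first and has needs_cpu_arch set is held back until the CPU driver has run; with the flag cleared (a depex of TRUE) it runs first, before any interrupt setup has happened.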
Instances of the TimerLib class have always been finicky. For example, in OvmfPkg we have three instances (for various module types & firmware phases). The two instances that get linked into early module types (SEC, and PEI_CORE, PEIM, DXE_CORE) massage chipset registers directly, because that was the only robust way to make sure that whichever of these module (types) needed the ACPI timer could actually utilize it. Through these library instances, every such "early" module (that needs TimerLib) looks at the chipset registers, and sets the needed bits if they are not in place yet.
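The "look at the register, set the needed bits if they are not in place yet" pattern can be sketched roughly as follows. This is not the OvmfPkg code; TIMER_ENABLE_BIT and ensure_timer_enabled() are hypothetical, and the register is modelled as a plain memory location.

```c
#include <stdint.h>

/* Hypothetical enable bit; the real bit and register depend on the chipset. */
#define TIMER_ENABLE_BIT  0x80u

/* Make sure the enable bit is set, writing the register only if needed.
 * Returns 1 if the register had to be written, 0 if it was already set.
 * Safe to call from every module that needs the timer, in any order. */
int ensure_timer_enabled(volatile uint32_t *reg)
{
    if ((*reg & TIMER_ENABLE_BIT) == TIMER_ENABLE_BIT) {
        return 0;
    }
    *reg |= TIMER_ENABLE_BIT;
    return 1;
}
```

Because the function is idempotent, no module has to rely on some other module having run first, which is the robustness property described above.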
Thanks for the analysis.
It appears that PL180MciDxe's dependency on gEfiCpuArchProtocolGuid was transitively fulfilled by its dependency on TimerLib, which is implemented by ArmArchTimerLib on TC2, and the patch removed that dependency from the depex.
Simply replacing TRUE with gEfiCpuArchProtocolGuid in the Depex section of PL180MciDxe.inf should do the trick.
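For reference, the resulting section of PL180MciDxe.inf would look roughly like this. Only the [Depex] section is shown; this is a sketch of the suggested change and has not been checked against the rest of the file.

```
[Depex]
  gEfiCpuArchProtocolGuid
```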
As far as the PrimeCell ID is concerned, let's just whitelist whatever TC2 exposes, even if in error.
Thanks, Ard.