On 15 June 2011 11:14, Alexander Sack <asac@linaro.org> wrote:
> Anyway, I really hope this is unlikely to happen for now
[I should start this email with a disclaimer: this isn't intended to be finger pointing, just an explanation of why we should expect and plan for QEMU breakage rather than hoping it is unlikely.]
History tells us that QEMU often breaks when the kernel gains functionality to use new bits of the hardware. The most recent example: a change that went into the omap3 kernel at some point between the 20110520 snapshot and the 20110526 final release (1105) broke QEMU (graphics output stopped working). [Technical detail: the kernel now probes for the presence of a monitor, and QEMU wasn't emulating the hardware on the I2C bus which returns EDID info, so the kernel thought there was no monitor present and didn't turn on the display. This missing feature in QEMU will be fixed in the 2011.06 release.]
Other examples we've seen:
 * kernel access to non-existent device registers triggering so many warnings from QEMU as to render it unusable [also present in 1105; worked around in qemu 2011.06]
 * the move to the omap-specific serial driver required model changes
 * probing for cp15 perf registers hit qemu bugs where they weren't implemented
 * bugs in the MMC controller model tickled by a u-boot MMC driver rewrite
 * newer kernels use the ARM1136r1 TLS registers but QEMU's 1136 model doesn't implement them
The underlying cause here is that QEMU's models are not tested in any formal way against a specification or against a test suite used for validating the hardware. The main test is "does it boot Linux?". So it's inevitable that new kernel features will be exercising essentially untested QEMU code, and breakage is quite likely.
CI will help to flag this kind of problem sooner (and I have a blueprint for this cycle to work with the validation folks to expand the range of QEMU automated testing and benchmarking). But if we want to guarantee that QEMU and the kernel work together, I think we really need to pretty much freeze the kernel two weeks before QEMU's release date, in order to have a fighting chance of catching and fixing problems. Alternatively the kernel team could refuse to merge QEMU-breaking changes, but that seems to me like putting the cart before the horse.
(Rolling back to the previous QEMU release is generally not a possible fix, because typically the bug has been in QEMU all along and is not a regression.)
-- PMM