This boot report illustrates the issue I mentioned on another thread last week, where some regressions fail to be tracked:
On 16/03/2020 13:24, kernelci.org bot wrote:
next/master boot: 275 boots: 23 failed, 233 passed with 7 offline, 12 untried/unknown (next-20200316)
Full Boot Summary: https://kernelci.org/boot/all/job/next/branch/master/kernel/next-20200316/
Full Build Summary: https://kernelci.org/build/next/branch/master/kernel/next-20200316/
Tree: next
Branch: master
Git Describe: next-20200316
Git Commit: 8548fd2f20ed19b0e8c0585b71fdfde1ae00ae3c
Git URL: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
Tested: 104 unique boards, 23 SoC families, 30 builds out of 329
Boot Regressions Detected:
arm:
[...]
2 regressions were detected for multi_v7_defconfig:
multi_v7_defconfig:
    gcc-8:
        imx6ul-pico-hobbit:
            lab-pengutronix: new failure (last pass: next-20191011)
        tegra124-nyan-big:
            lab-collabora: new failure (last pass: next-20200226)
[...]
Boot Failures Detected:
arm:
[...]
However, all these platforms failed to boot:
multi_v7_defconfig:
    gcc-8:
        bcm2836-rpi-2-b: 1 failed lab
        imx6ul-pico-hobbit: 1 failed lab
        rk3288-veyron-jaq: 1 failed lab
        sun4i-a10-cubieboard: 1 failed lab
        tegra124-nyan-big: 1 failed lab
There was no mention of conflicting results. In fact, rk3288-veyron-jaq and tegra124-nyan-big are only in Collabora's lab, so they can't have conflicts with other labs. We also know these platforms did boot fine at some point in the past. Picking an earlier job at random:
https://kernelci.org/boot/id/5e2167ead92d42035d74e98a/
As such, they should have been tracked as regressions. It's true that some previous boot test results are missing for multi_v7_defconfig on next/master, probably due to the Jenkins-Docker issue we had in production for a few weeks. Still, gaps in the history shouldn't be a good enough reason to miss a regression. I've noticed other similar and more obvious cases in the past, where platforms started failing but no regression was detected, and no bisection was triggered as a result.
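
To make the expected behaviour concrete, here is a rough sketch of regression detection that tolerates gaps in the history. This is not the actual KernelCI code: the function name, the (revision, status) data layout and the two intermediate "missing" entries in the example are assumptions made purely for illustration. Only the tegra124-nyan-big last pass (next-20200226) and the failing revision (next-20200316) come from the report above.

```python
from typing import List, Optional, Tuple

# Hypothetical layout: one (revision, status) pair per kernel revision,
# oldest first. status is "PASS", "FAIL", or None when no result exists.
History = List[Tuple[str, Optional[str]]]


def find_regression(history: History) -> Optional[Tuple[str, str]]:
    """Return (last_pass, first_fail) if the newest result is a failure
    and an earlier pass exists, otherwise None.

    Missing results (None) are skipped rather than treated as the end
    of the history, so a gap does not hide a regression.
    """
    if not history or history[-1][1] != "FAIL":
        return None

    first_fail = history[-1][0]
    # Walk backwards from the newest result towards the oldest.
    for revision, status in reversed(history[:-1]):
        if status == "PASS":
            return (revision, first_fail)
        if status == "FAIL":
            # The failure started earlier than the newest revision.
            first_fail = revision
        # status is None: no data for this revision, keep looking.
    return None


# Illustrative history: the two middle entries are made-up gaps.
example = [
    ("next-20200226", "PASS"),
    ("next-20200310", None),
    ("next-20200313", None),
    ("next-20200316", "FAIL"),
]
print(find_regression(example))
```

With this approach the pair returned for the example is the last known pass and the first known failure, which is also the range a bisection would need.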
This issue is going away now with the move to have everything as test results and no boot results any more. I'm keeping a close eye on test case regressions and so far they seem to be working in a reliable way.
Guillaume