This has been quite a week in LKFT, and I want to take a moment to reflect.
This month, Greg has been pushing release candidates on Mondays and Fridays (double his typical rate), causing a rather frantic pace for us going into the holidays and finishing our sprint.
Last week, we had builds taking 8+ hours due to multiple causes, all of which have been resolved going into this week thanks to the builds team and Ben. Without those build and infrastructure improvements, we would have had a really difficult time this week. Thank you for continuously improving builds.
On Monday, Greg pushed. Our hikeys were offline due to the firmware upgrade, but Maria and Dave were able to pull some hikeys from production to get us back online. We were able to post full results to Greg on Tuesday.
Then, we had several large build- and test-related pull requests outstanding, and we took the window of opportunity to merge them. This caused some chaos in the infrastructure, but we were able to troubleshoot and resolve it by Wednesday. (In the future, the ability to push changes through a staging environment before hitting production will alleviate this risk.)
On Thursday, we noticed some unusual test failures. Fathi and Naresh were able to coordinate and discover a new problem in the build that was causing kernel modules to be missing. This was solved the same day.
I'm getting to the best part, thank you for reading this far :)
On Thursday afternoon, we were seeing Greg's "pre-rc" pushes (he often pushes partial trees before actually announcing; we usually ignore them, but we're getting smarter). Late in his day, Milosz noticed that 4.14's pre-rc wasn't even booting, so I took a look, saw that it was broken for all arm64 boards, and sent a note to the mailing list. When Naresh came online, he was able to work with Sumit to bisect and determine the bad patch, posting the results back to the stable mailing list. Then he followed up by identifying the upstream patch that actually fixes the problem.
By the time I came back online this morning, 4.14 was fully diagnosed and reported upstream and to Greg, within a couple hours of his RC announcement (!).
This might sound like chaos, and it is a bit more chaotic than I think we all would prefer. However! I feel so privileged to be able to work with you all, and see all of this come together to the point that even though Greg is pushing broken trees on the Friday before Christmas, and we've had a lot of changes made just this week, everything is green going into the new year.
tl;dr: There have been 6 RCs (times 3 branches, times 4 boards) already this month, and we have been able to meet our 48h SLA on every one of them (assuming today's goes out on time), thanks to a team of talented people that I am so thankful to be able to work with.
Merry Christmas and Happy New Year,
Dan
Great work, everyone! Thanks for the update, Dan. It really puts all the hard work of the last year into perspective. I'm really proud of everyone for getting to this point and for being able to coordinate and address necessary changes so quickly, even in less-than-ideal circumstances.
On Dec 22, 2017 9:49 AM, "Dan Rue" dan.rue@linaro.org wrote:
On Fri, Dec 22, 2017 at 09:49:02AM -0600, Dan Rue wrote:
This has been quite a week in LKFT, and I want to take a moment to reflect.
This month, Greg has been pushing release candidates on Mondays and Fridays (double his typical rate), causing a rather frantic pace for us going into the holidays and finishing our sprint.
Last week, we had builds taking 8+ hours due to multiple causes, all of which have been resolved going into this week thanks to the builds team and Ben. Without those build and infrastructure improvements, we would have had a really difficult time this week. Thank you for continuously improving builds.
OK, this is bugging me, even though I think I'm allowed some storytelling license here. The build problems were two weeks ago! Hikey problems were last week. Point stands!