Summary
------------------------------------------------------------------------

kernel: 4.16.0-rc5
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git branch: master
git commit: 3032f8c504d2b15d58e4c96060a96b47e215573c
git describe: v4.16-rc5-46-g3032f8c504d2
Test details: https://qa-reports.linaro.org/lkft/linux-mainline-oe/build/v4.16-rc5-46-g303...

No regressions (compared to build v4.16-rc5-4-gfc6eabbbf8ef)

Boards, architectures and test suites:
-------------------------------------

dragonboard-410c
* boot - fail: 39

juno-r2 - arm64
* boot - pass: 20,
* kselftest - pass: 51, skip: 15, fail: 3
* libhugetlbfs - pass: 90, skip: 1,
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 61, skip: 2,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 22,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 10, skip: 4,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1002, skip: 148,
* ltp-timers-tests - pass: 12, skip: 1,

qemu_x86_64
* boot - pass: 20,
* kselftest - pass: 59, skip: 24,
* libhugetlbfs - pass: 90, skip: 1,
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 57, skip: 6,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 22,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 13, skip: 1,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1001, skip: 149,
* ltp-timers-tests - pass: 12, skip: 1,

x15 - arm
* boot - pass: 20,
* kselftest - pass: 43, skip: 18, fail: 5
* libhugetlbfs - pass: 87, skip: 1,
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 62, skip: 17, fail: 2
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs-tests - pass: 61, skip: 2,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-hugetlb-tests - pass: 20, skip: 2,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 13, skip: 1,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1053, skip: 97,
* ltp-timers-tests - pass: 12, skip: 1,

x86_64
* boot - pass: 18,
* kselftest - pass: 60, skip: 17, fail: 3
* libhugetlbfs - pass: 90, skip: 1,
* ltp-cap_bounds-tests - pass: 2,
* ltp-containers-tests - pass: 64, skip: 17,
* ltp-fcntl-locktests-tests - pass: 2,
* ltp-filecaps-tests - pass: 2,
* ltp-fs_bind-tests - pass: 2,
* ltp-fs_perms_simple-tests - pass: 19,
* ltp-fsx-tests - pass: 2,
* ltp-io-tests - pass: 3,
* ltp-ipc-tests - pass: 9,
* ltp-math-tests - pass: 11,
* ltp-nptl-tests - pass: 2,
* ltp-pty-tests - pass: 4,
* ltp-sched-tests - pass: 9, skip: 5,
* ltp-securebits-tests - pass: 4,
* ltp-syscalls-tests - pass: 1031, skip: 119,
* ltp-timers-tests - pass: 12, skip: 1,

On Thu, Mar 15, 2018 at 10:12 AM, Linaro QA qa-reports@linaro.org wrote:
Summary
kernel: 4.16.0-rc5 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git git branch: master git commit: 3032f8c504d2b15d58e4c96060a96b47e215573c git describe: v4.16-rc5-46-g3032f8c504d2 Test details: https://qa-reports.linaro.org/lkft/linux-mainline-oe/build/v4.16-rc5-46-g303...
No regressions (compared to build v4.16-rc5-4-gfc6eabbbf8ef)
Boards, architectures and test suites:
dragonboard-410c
- boot - fail: 39
We need to reconcile this failure with the subject line to allow better filtering of email.
While it is true that there are no regressions compared to a previous build from a few hours ago, a boot failure is a huge regression from the "baseline". We should be blaring horns and flashing lights to draw attention to it. :-) I know the QCLT is serious about tracking and fixing these as soon as possible.
So here are a few proposals to improve the report:
1. Convert these statistics to searchable keywords that I can set up email filters on.
e.g. dragonboard-410c * boot - fail: 39
might become,
dragonboard-410c: boot: fail
(Is 39 the number of times a boot was attempted?)
2. Change the subject on this email to reflect that regressions still exist but nothing changed from the previous build. If possible point to the last build where this failure was NOT present.
3. Add some easy to search keywords in the email instead of resorting to pattern matching. This removes the need for proposal 1.
I see from a previous report the following summary, so lkft did warn us but it got lost in the noise. Most of the time we'd only like to know if db410c fails. How can we achieve that? Perhaps adding a unique keyword e.g. DB410CBOOTFAIL somewhere in the email? Doesn't need to be in your nicely formatted summary. I can then safely ignore all "regression found" emails that don't have the keyword.
Regressions (compared to build v4.14.26)
------------------------------------------------------------------------

dragonboard-410c:
    boot:
        * dragonboard-410c
        * test src: not informed

hi6220-hikey - arm64:
    boot:
        * hi6220-hikey
        * test src: not informed
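
To make proposal 1 concrete, here is a minimal sketch of how the per-board statistics could be flattened into "board: suite: status" keywords that a mail filter can match on. It assumes the report keeps the "board header, then `* suite - pass: N, skip: N, fail: N`" layout shown in the summary above; the function name `report_keywords` and the rule "any fail count above zero means fail" are illustrative assumptions, not part of the LKFT or SQUAD tooling.

```python
import re

# Hypothetical helper, not part of SQUAD/LKFT: flattens report lines into
# "board: suite: status" keywords suitable for simple mail filters.
SUITE_RE = re.compile(r"^\*\s+(?P<suite>\S+)\s+-\s+(?P<counts>.+)$")
COUNT_RE = re.compile(r"(pass|skip|fail):\s*(\d+)")


def report_keywords(report_text):
    """Return one 'board: suite: status' string per suite line."""
    keywords = []
    board = None
    for raw in report_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = SUITE_RE.match(line)
        if match is None:
            # Not a suite line, so treat it as a board header and keep
            # only the board name (drop a trailing " - arch" if present).
            board = line.split(" - ")[0]
            continue
        if board is None:
            continue  # suite line before any board header; skip it
        counts = {k: int(v) for k, v in COUNT_RE.findall(match.group("counts"))}
        # Assumption: any non-zero fail count marks the suite as failed.
        status = "fail" if counts.get("fail", 0) > 0 else "pass"
        keywords.append(f"{board}: {match.group('suite')}: {status}")
    return keywords


sample = """dragonboard-410c
* boot - fail: 39

juno-r2 - arm64
* boot - pass: 20,
* kselftest - pass: 51, skip: 15, fail: 3
"""

for keyword in report_keywords(sample):
    print(keyword)
```

A filter could then match the literal string "dragonboard-410c: boot: fail" instead of pattern matching on the counts.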
Hi Amit -
On Thu, Mar 15, 2018 at 10:49:32AM +0530, Amit Kucheria wrote:
On Thu, Mar 15, 2018 at 10:12 AM, Linaro QA qa-reports@linaro.org wrote:
Summary
kernel: 4.16.0-rc5 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git git branch: master git commit: 3032f8c504d2b15d58e4c96060a96b47e215573c git describe: v4.16-rc5-46-g3032f8c504d2 Test details: https://qa-reports.linaro.org/lkft/linux-mainline-oe/build/v4.16-rc5-46-g303...
No regressions (compared to build v4.16-rc5-4-gfc6eabbbf8ef)
Boards, architectures and test suites:
dragonboard-410c
- boot - fail: 39
We need to reconcile this failure with the subject line to allow better filtering of email.
While it is true that there are no regressions compared to a previous build from a few hours ago, a boot failure is a huge regression from the "baseline". We should be blaring horns and flashing lights to draw attention to it. :-) I know the QCLT is serious about tracking and fixing these as soon as possible.
We agree. The current email template was designed for a very specific use case and is not optimal for your needs. We plan to have a meeting to discuss this at Connect. There's also a related issue: https://github.com/Linaro/squad/issues/242.
Second, if you look at the latest mainline results at https://qa-reports.linaro.org/lkft/linux-mainline-oe/build/v4.16-rc5-60-g0aa..., dragonboard fails to boot something like 10% of the time due to some transient issue between lava, the lava job template, and the board itself (see https://qa-reports.linaro.org/lkft/linux-mainline-oe/build/v4.16-rc5-60-g0aa... for an example lava run). It is hard to blare sirens for a boot failure, when boot failures are a regular and acceptable occurrence. Granted, failing 100% of the time should be and is a big issue.
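
As a rough illustration of the distinction drawn here (occasional transient boot failures versus a board that never boots), the sketch below classifies a board from its boot pass/fail counts. The function name, the category labels, and the 25% ceiling for "flaky" are all assumptions made for the example; none of them reflect an actual LKFT or qa-reports policy.

```python
def classify_boot(passed, failed, flaky_ceiling=0.25):
    """Classify boot health from attempt counts.

    flaky_ceiling is a hypothetical threshold: failure rates at or below
    it are treated as transient infrastructure noise.
    """
    total = passed + failed
    if total == 0:
        return "no-data"
    rate = failed / total
    if rate == 0:
        return "ok"
    if rate == 1.0:
        return "broken"   # every attempt failed, e.g. fail: 39 with no pass
    if rate <= flaky_ceiling:
        return "flaky"    # roughly the 10% transient case
    return "suspect"      # too many failures to dismiss as noise


print(classify_boot(0, 39))
print(classify_boot(18, 2))
```

Only the "broken" and "suspect" categories would then need to raise an alarm, while "flaky" results could be summarized without one.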
So here are a few proposals to improve the report:
- Convert these statistics to searchable keywords that I can set up
email filters on.
e.g. dragonboard-410c * boot - fail: 39
might become,
dragonboard-410c: boot: fail
(Is 39 the number of times a boot was attempted?)
Yes, 39 attempts. A regular run has about 20 separate lava runs per board, and qa-reports automatically re-submits some types of failed jobs.
- Change the subject on this email to reflect that regressions still
exist but nothing changed from the previous build. If possible point to the last build where this failure was NOT present.
We originally did just that, but it was removed from the LKFT template because other people did not like it. Really, the only way to solve this is with per-user email settings.
- Add some easy to search keywords in the email instead of resorting
to pattern matching. This removes the need for proposal 1.
I see from a previous report the following summary, so lkft did warn us but it got lost in the noise. Most of the time we'd only like to know if db410c fails. How can we achieve that? Perhaps adding a unique keyword e.g. DB410CBOOTFAIL somewhere in the email? Doesn't need to be in your nicely formatted summary. I can then safely ignore all "regression found" emails that don't have the keyword.
That's a good suggestion. It would be even better if you only received emails with db410c regressions.
We've failed when we successfully detect a problem, but nobody notices due to the signal:noise ratio. We have to fix that.
This particular boot problem impacted multiple arm64 boards, so we (the LKFT triage team) didn't consider it a Qualcomm landing team issue and did not inform you about our findings (our mistake). Meanwhile, you were working to find the root cause while we already knew it.
Next time, we should think to CC you on our investigations related to db410c, and you should also consider asking us about things you notice either in #linaro-lkft or lkft-triage@lists.linaro.org. A lot of this is just learning to work together. We try to stay on top of these results and usually have done a preliminary investigation within a day or two of any regressions.
Regressions (compared to build v4.14.26)

dragonboard-410c:
    boot:
        * dragonboard-410c
        * test src: not informed

hi6220-hikey - arm64:
    boot:
        * hi6220-hikey
        * test src: not informed
On Thu, Mar 15, 2018 at 9:36 PM, Dan Rue dan.rue@linaro.org wrote:
Hi Amit -
On Thu, Mar 15, 2018 at 10:49:32AM +0530, Amit Kucheria wrote:
On Thu, Mar 15, 2018 at 10:12 AM, Linaro QA qa-reports@linaro.org wrote:
Summary
kernel: 4.16.0-rc5 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git git branch: master git commit: 3032f8c504d2b15d58e4c96060a96b47e215573c git describe: v4.16-rc5-46-g3032f8c504d2 Test details: https://qa-reports.linaro.org/lkft/linux-mainline-oe/build/v4.16-rc5-46-g303...
No regressions (compared to build v4.16-rc5-4-gfc6eabbbf8ef)
Boards, architectures and test suites:
dragonboard-410c
- boot - fail: 39
We need to reconcile this failure with the subject line to allow better filtering of email.
While it is true that there are no regressions compared to a previous build from a few hours ago, a boot failure is a huge regression from the "baseline". We should be blaring horns and flashing lights to draw attention to it. :-) I know the QCLT is serious about tracking and fixing these as soon as possible.
We agree. The current email template was designed for a very specific use case and is not optimal for your needs. We plan to have a meeting to discuss this at Connect. There's also a related issue: https://github.com/Linaro/squad/issues/242.
Please let us know when and where the meeting is; if there is no conflict, some of us might attend.
Second, if you look at the latest mainline results at https://qa-reports.linaro.org/lkft/linux-mainline-oe/build/v4.16-rc5-60-g0aa..., dragonboard fails to boot something like 10% of the time due to some transient issue between lava, the lava job template, and the board itself (see https://qa-reports.linaro.org/lkft/linux-mainline-oe/build/v4.16-rc5-60-g0aa... for an example lava run). It is hard to blare sirens for a boot failure, when boot failures are a regular and acceptable occurrence. Granted, failing 100% of the time should be and is a big issue.
So it is not possible to distinguish between infrastructure failure and a genuine boot failure at this point, I take it? In any case, we want to hear about boot failures. Communication for test failures can be improved later.
So here are a few proposals to improve the report:
- Convert these statistics to searchable keywords that I can set up
email filters on.
e.g. dragonboard-410c * boot - fail: 39
might become,
dragonboard-410c: boot: fail
(Is 39 the number of times a boot was attempted?)
Yes, 39 attempts. A regular run has about 20 separate lava runs per board, and qa-reports automatically re-submits some types of failed jobs.
- Change the subject on this email to reflect that regressions still
exist but nothing changed from the previous build. If possible point to the last build where this failure was NOT present.
We originally did just that, but it was removed from the LKFT template because other people did not like it. Really, the only way to solve this is with per-user email settings.
Allowing people to subscribe to a subset of results would be nice indeed.
- Add some easy to search keywords in the email instead of resorting
to pattern matching. This removes the need for proposal 1.
I see from a previous report the following summary, so lkft did warn us but it got lost in the noise. Most of the time we'd only like to know if db410c fails. How can we achieve that? Perhaps adding a unique keyword e.g. DB410CBOOTFAIL somewhere in the email? Doesn't need to be in your nicely formatted summary. I can then safely ignore all "regression found" emails that don't have the keyword.
That's a good suggestion. It would be even better if you only received emails with db410c regressions.
Even if I received emails only for db410c, keywords would still be useful for separating real issues from the general noise.
We've failed when we successfully detect a problem, but nobody notices due to the signal:noise ratio. We have to fix that.
On the contrary, we're now detecting things that previously would have gone unnoticed for a week or a month. So I think this is already improving the status quo. Now if we could only make it zero-day, no pressure. ;-)
This particular boot problem impacted multiple arm64 boards, so we (the LKFT triage team) didn't consider it a Qualcomm landing team issue and did not inform you about our findings (our mistake). Meanwhile, you were working to find the root cause while we already knew it.
Next time, we should think to CC you on our investigations related to db410c, and you should also consider asking us about things you notice either in #linaro-lkft or lkft-triage@lists.linaro.org. A lot of this is just learning to work together. We try to stay on top of these results and usually have done a preliminary investigation within a day or two of any regressions.
Sounds good. Let's discuss more next week in HKG.
Regards, Amit
Hi Dan,
I wasn't able to catch up with you at Connect, apologies.
Did the LKFT team reach any decisions specifically related to reporting? Is there a summary somewhere that I can read?
Cheers, Amit
kernel-build-reports@lists.linaro.org