On Mon, Jan 22, 2018 at 07:52:57PM +0530, Naresh Kamboju wrote:
> On 20 January 2018 at 07:52, Linaro QA <qa-reports(a)linaro.org> wrote:
> > Summary
> > ------------------------------------------------------------------------
> >
> > kernel: 4.15.0-rc8
> > git repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> > git branch: master
> > git commit: 8dd903d2cf7b6dfe98be7c19f891882583c7266e
> > git describe: v4.15-rc8-225-g8dd903d2cf7b
> > Test details: https://qa-reports.linaro.org/lkft/linux-mainline-oe/build/v4.15-rc8-225-g8…
> >
> > Regressions (compared to build v4.15-rc8-120-gdda3e15231b3)
> > ------------------------------------------------------------------------
> >
> > x86_64:
> > kselftest:
> > * fsgsbase_64
>
> Re-submitted job
>
> fsgsbase_64 test failed 2 out of 10 times.
> Need to investigate the real reason for failure.
>
> Fail log:
> ----------
> [RUN] ARCH_SET_GS(0x0) and clear gs, then schedule to 0x1
> Before schedule, set selector to 0x1
> other thread: ARCH_SET_GS(0x1) -- sel is 0x0
> [FAIL] GS/BASE changed from 0x1/0x0 to 0x0/0x0
> <trim>
> [RUN] ARCH_SET_GS(0x1), then schedule to 0x200000000
> Before schedule, set selector to 0x2
> other thread: ARCH_SET_GS(0x200000000) -- sel is 0x0
> [FAIL] GS/BASE changed from 0x2/0x0 to 0x0/0x1
>
> https://lkft.validation.linaro.org/scheduler/job/100583#L2613
>
> - Naresh

I looked into this a bit and I think it's a bad test.

We've seen it fail all the way back in 4.14-rc6 (about as far as our
data goes):

https://lkft.validation.linaro.org/scheduler/job/20773#results_3354469

The real failure rate is much lower than the 20% (2 out of 10) seen
above. When I run it in a loop I can sometimes get several hundred
passing runs before it fails, but it always fails eventually. That low
rate explains why we see it so infrequently.
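
For anyone who wants to reproduce this, a loop along the following lines
should be enough. This is only a sketch, not the exact script used here:
the binary path ./fsgsbase_64 and the run limit are assumptions, and it
assumes the test exits non-zero whenever it prints [FAIL].

```python
import subprocess
import sys


def runs_until_failure(cmd, max_runs=1000):
    """Run cmd repeatedly.

    Returns the 1-based index of the first run that exits non-zero,
    or None if all max_runs runs exit with status 0.
    """
    for i in range(1, max_runs + 1):
        result = subprocess.run(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
        )
        if result.returncode != 0:
            # Show the output of the failing run only; passing runs stay quiet.
            sys.stdout.write(result.stdout.decode(errors="replace"))
            return i
    return None


if __name__ == "__main__":
    # Assumed default: the selftest binary sits in the current directory.
    cmd = sys.argv[1:] or ["./fsgsbase_64"]
    n = runs_until_failure(cmd)
    if n is None:
        print("no failure observed")
    else:
        print("first failure on run %d" % n)
```

Repeating this a few times and averaging the run counts gives a rough
idea of the per-run failure rate.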

The most recent commit to fsgsbase.c says:

  commit 23d98c204386a98d9ef9f9e744f41443ece4929f
  Author: Andy Lutomirski <luto(a)kernel.org>
  Date:   Tue Aug 1 07:11:36 2017 -0700

      selftests/x86/fsgsbase: Test selectors 1, 2, and 3

      Those are funny cases.  Make sure they work.

      (Something is screwy with signal handling if a selector is 1, 2, or 3.
      Anyone who wants to dive into that rabbit hole is welcome to do so.)

I think we may be seeing the "something screwy".

I vote to file the bug (naresh), report the issue upstream (drue), and
then add it to the skiplist.

Dan