On 9/9/20 11:36 PM, Geert Uytterhoeven wrote:
Hi Günter,
On Wed, Sep 9, 2020 at 8:24 PM Guenter Roeck <linux@roeck-us.net> wrote:
On 9/9/20 11:01 AM, Greg Kroah-Hartman wrote:
On Wed, Sep 09, 2020 at 09:47:05AM -0700, Guenter Roeck wrote:
On Tue, Sep 08, 2020 at 05:22:22PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.8.8 release. There are 186 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 10 Sep 2020 15:21:57 +0000. Anything received after that time might be too late.
Build results:
	total: 154 pass: 153 fail: 1
Failed builds:
	powerpc:allmodconfig
Qemu test results:
	total: 430 pass: 430 fail: 0
The powerpc problem is the same as before:
Inconsistent kallsyms data
Try make KALLSYMS_EXTRA_PASS=1 as a workaround
KALLSYMS_EXTRA_PASS=1 doesn't help. The problem is sporadic, elusive, and all but impossible to bisect. The same build passes on another system, for example, with a different load pattern. It may pass with -j30 and fail with -j40. The problem started at some point after v5.8, and got worse over time; by now it almost always happens. I'd be happy to debug if there is a means to do it, but I don't have an idea where to even start. I'd disable KALLSYMS in my test configurations, but the symbol is selected from various places and thus difficult to disable. So unless I stop building ppc:allmodconfig entirely we'll just have to live with the failure.
Ah, I was worried when I saw your dashboard orange for this kernel.
I guess the powerpc maintainers don't care? Sad :(
Not sure if the powerpc architecture is to blame. Bisect attempts end up all over the place, and don't typically include any powerpc changes. I have no idea how kallsyms is created, but my suspicion is that it is a generic problem and that powerpc just happens to hit it right now. I have added KALLSYMS_EXTRA_PASS=1 to several architecture builds over time, when they reported similar problems. Right now I set it for alpha, arm, and m68k. powerpc just happens to be the first architecture where it doesn't help.
This is a generic problem, cfr. scripts/link-vmlinux.sh:
# kallsyms support
# Generate section listing all symbols and add it into vmlinux
# It's a three step process:
# 1) Link .tmp_vmlinux1 so it has all symbols and sections,
#    but __kallsyms is empty.
#    Running kallsyms on that gives us .tmp_kallsyms1.o with
#    the right size
# 2) Link .tmp_vmlinux2 so it now has a __kallsyms section of
#    the right size, but due to the added section, some
#    addresses have shifted.
#    From here, we generate a correct .tmp_kallsyms2.o
# 3) That link may have expanded the kernel image enough that
#    more linker branch stubs / trampolines had to be added, which
#    introduces new names, which further expands kallsyms. Do another
#    pass if that is the case. In theory it's possible this results
#    in even more stubs, but unlikely.
#    KALLSYMS_EXTRA_PASS=1 may also used to debug or work around
#    other bugs.
Ah, that explains a lot.
Adding even more kallsyms_steps may help (or not, if you're really unlucky). Perhaps the number of passes should be handled automatically (i.e. run until it succeeds, with a sane (16?) upper limit to avoid endless builds, so it can still fail, in theory).
Turns out it needs four steps. I prepared a patch to try up to 8 steps. We'll see if it gets accepted.
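For illustration only, here is a toy Python model of the "run until it succeeds, with an upper limit" idea. It is not the patch and not how scripts/link-vmlinux.sh is written. The names link_until_stable and toy_relink, and all sizes in the toy model, are made up for the example; the only things taken from this thread are the fixed-point nature of the problem and the limit of 8 steps.

```python
from typing import Callable, Optional


def link_until_stable(relink: Callable[[int], int],
                      max_steps: int = 8) -> Optional[int]:
    """Repeat link passes until the symbol table size stops changing.

    relink(table_size) is a hypothetical stand-in for one link plus
    kallsyms pass: given the size of the table embedded in the image,
    it returns the size of the table generated from that image.
    Returns the number of passes used, or None if max_steps is reached
    without the size settling.
    """
    size = relink(0)  # pass 1: link with an empty symbol table
    for step in range(2, max_steps + 1):
        new_size = relink(size)
        if new_size == size:
            return step  # table no longer grows, the image is consistent
        size = new_size
    return None  # still changing after max_steps passes


def toy_relink(table_size: int) -> int:
    """Made-up model: a bigger image needs more stubs, i.e. more symbols."""
    base_symbols = 1000
    image_size = 50_000 + table_size
    stubs = image_size // 16_384  # assume one stub per 16 KiB of image
    return (base_symbols + stubs) * 16  # assume 16 bytes per table entry


print(link_until_stable(toy_relink))               # 3
print(link_until_stable(toy_relink, max_steps=2))  # None
```

In this toy model the table size settles on the third pass, so a limit of two passes fails while the default limit succeeds. The real build can of course need more passes, as the four steps seen on powerpc show.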
Thanks, Guenter