On Thu, Jun 07, 2018 at 05:54:56PM +0000, Max Asbock wrote:
From: Greg KH [greg@kroah.com]
Sent: Thursday, June 07, 2018 1:37 AM
To: Max Asbock
Cc: stable@vger.kernel.org; tytso@mit.edu; Chris McDermott
Subject: [External] Re: panic at boot time with kernel >= 4.9.98 - uninitialized system_wq in early interrupt
Ick :(
I'm guessing you also see these problems on 4.17? Can you test there to be sure of that?
We haven't had a chance to test 4.17 on the system where this happens. I suspect this won't be a problem on 4.17, as workqueue init has been split up and there is now a workqueue_init_early() in start_kernel():

	/*
	 * Allow workqueue creation and work item queueing/cancelling
	 * early.  Work item execution depends on kthreads and starts after
	 * workqueue_init().
	 */
	workqueue_init_early();
So far we have only seen this with 4.9.x. Also, this only happens when lots of memory is installed (10TB). I am guessing the large memory size changes the timing of the initialization steps and brings out the problem. When we get access to the system again, we can attempt to boot the latest mainline kernel to verify that workqueue_init_early() indeed fixes the issue there.
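To illustrate the ordering issue described above, here is a minimal user-space sketch. It is not kernel code and does not reproduce the kernel's workqueue implementation: every toy_ name (toy_wq_init_early(), toy_wq_init(), toy_queue_work(), toy_boot_demo()) is a hypothetical stand-in, and the -1 return is only a placeholder for the panic reported in this thread. It models the split quoted from start_kernel(): queueing is accepted once the early init has run, and execution is deferred until the later init.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_MAX_PENDING 8

typedef void (*toy_work_fn)(void);

/* Toy stand-in for a workqueue; NOT the kernel's struct workqueue_struct. */
struct toy_wq {
	toy_work_fn pending[TOY_MAX_PENDING];
	size_t npending;
	int online;			/* nonzero once "workers" exist */
};

static struct toy_wq toy_wq_storage;
static struct toy_wq *toy_system_wq;	/* NULL until toy_wq_init_early() */
static int toy_executed;		/* number of work items run so far */

/* Early step: the queue exists and accepts work, but nothing runs yet. */
static void toy_wq_init_early(void)
{
	toy_wq_storage.npending = 0;
	toy_wq_storage.online = 0;
	toy_system_wq = &toy_wq_storage;
}

/* Returns 0 if the item was queued or run, -1 if it could not be accepted. */
static int toy_queue_work(toy_work_fn fn)
{
	if (toy_system_wq == NULL)
		return -1;	/* placeholder for the reported early-boot panic */
	if (toy_system_wq->online) {
		fn();		/* workers are up: run right away */
		return 0;
	}
	if (toy_system_wq->npending == TOY_MAX_PENDING)
		return -1;	/* toy queue is full */
	toy_system_wq->pending[toy_system_wq->npending++] = fn;
	return 0;
}

/* Late step: "workers" come up and the deferred items are executed. */
static void toy_wq_init(void)
{
	size_t i;

	assert(toy_system_wq != NULL);	/* early init must have run first */
	toy_system_wq->online = 1;
	for (i = 0; i < toy_system_wq->npending; i++)
		toy_system_wq->pending[i]();
	toy_system_wq->npending = 0;
}

static void toy_work(void)
{
	toy_executed++;
}

/* Walks through the three phases and returns how many items ran in total. */
static int toy_boot_demo(void)
{
	toy_system_wq = NULL;
	toy_executed = 0;

	printf("before early init: rc=%d\n", toy_queue_work(toy_work));
	toy_wq_init_early();
	printf("after early init:  rc=%d, executed=%d\n",
	       toy_queue_work(toy_work), toy_executed);
	toy_wq_init();
	printf("after full init:   rc=%d, executed=%d\n",
	       toy_queue_work(toy_work), toy_executed);
	return toy_executed;
}
```

Calling toy_boot_demo() from a main() walks through the three phases: an "early interrupt" that queues work before any init is rejected, one that arrives after the early init is accepted but deferred, and one that arrives after the full init runs immediately.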
Ugh, I forgot about the workqueue rewrite.
How about 4.14, does that work for you? If so, just use that. You shouldn't be using the 4.9 kernel tree on x86-based hardware unless you are somehow forced to due to horrible closed source kernel drivers. You want and need the fixes and speedups that are in 4.14.y; it's a measurable difference.
thanks,
greg k-h