On Mon, Nov 06, 2017 at 09:15:14PM +0000, James Hogan wrote:
> From: Matt Redfearn <matt.redfearn(a)imgtec.com>
>
> commit 9e8c399a88f0b87e41a894911475ed2a8f8dff9e upstream.
>
> Commit 6f542ebeaee0 ("MIPS: Fix race on setting and getting
> cpu_online_mask") effectively reverted commit 8f46cca1e6c06 ("MIPS: SMP:
> Fix possibility of deadlock when bringing CPUs online") and thus has
> reinstated the possibility of deadlock.
>
> The commit was based on testing of kernel v4.4, where the CPU hotplug
> core code issued a BUG() if the starting CPU is not marked online when
> the boot CPU returns from __cpu_up. The commit fixes this race (in
> v4.4), but re-introduces the deadlock situation.
>
> As noted in the commit message, upstream differs in this area. Commit
> 8df3e07e7f21f ("cpu/hotplug: Let upcoming cpu bring itself fully up")
> adds a completion event in the CPU hotplug core code, making this race
> impossible. However, people were unhappy with relying on the core code
> to do the right thing.
>
> To address the issues both commits were trying to fix, add a second
> completion event in the MIPS smp hotplug path. It removes the
> possibility of a race, since the MIPS smp hotplug code now synchronises
> both the boot and secondary CPUs before they return to the hotplug core
> code. It also addresses the deadlock by ensuring that the secondary CPU
> is not marked online before it's counters are synchronised.
>
> This fix should also be backported to fix the race condition introduced
> by the backport of commit 8f46cca1e6c06 ("MIPS: SMP: Fix possibility of
> deadlock when bringing CPUs online"), through really that race only
> existed before commit 8df3e07e7f21f ("cpu/hotplug: Let upcoming cpu
> bring itself fully up").
>
> Signed-off-by: Matt Redfearn <matt.redfearn(a)imgtec.com>
> Fixes: 6f542ebeaee0 ("MIPS: Fix race on setting and getting cpu_online_mask")
> CC: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext(a)nokia.com>
> Cc: <stable(a)vger.kernel.org> # v4.1+: 8f46cca1e6c0: "MIPS: SMP: Fix possibility of deadlock when bringing CPUs online"
> Cc: <stable(a)vger.kernel.org> # v4.1+: a00eeede507c: "MIPS: SMP: Use a completion event to signal CPU up"
> Cc: <stable(a)vger.kernel.org> # v4.1+: 6f542ebeaee0: "MIPS: Fix race on setting and getting cpu_online_mask"
These did not apply to 3.18, so this patch overall did not apply there
either.
I don't know if you care about 3.18, but if so, can you provide
backports of these for that tree, and then resend this patch so I can
queue it up?
thanks,
greg k-h
On Mon, Nov 06, 2017 at 12:11:36PM +0200, Mika Westerberg wrote:
> Hi,
>
> Currently there is an issue in v4.13 where booting system with
> Thunderbolt device connected causes problems. Typically the device will
> not be tunneled properly and will not function as expected.
>
> The issue has been root caused to the way Linux initializes and handles
> ACPI GPEs (General Purpose Events). In summary we enable them too late
> and that results the PCI core finds upstream PCIe port of the
> Thunderbolt host controller too early (it is not fully configured by the
> BIOS SMI handler yet).
>
> This has been fixed in mainline starting from v4.14-rc1 by the following
> commits:
>
> ecc1165b8b74 ("ACPICA: Dispatch active GPEs at init time")
> 1312b7e0caca ("ACPICA: Make it possible to enable runtime GPEs earlier")
> eb7f43c4adb4 ("ACPI / scan: Enable GPEs before scanning the namespace")
>
> In order to get Thunderbolt working even when there is a device
> connected during boot, can the above 3 commits be included in stable for
> v4.13?
Now all queued up, thanks.
greg k-h