On Mon, Nov 06, 2017 at 09:15:14PM +0000, James Hogan wrote:
From: Matt Redfearn <matt.redfearn@imgtec.com>
commit 9e8c399a88f0b87e41a894911475ed2a8f8dff9e upstream.
Commit 6f542ebeaee0 ("MIPS: Fix race on setting and getting cpu_online_mask") effectively reverted commit 8f46cca1e6c06 ("MIPS: SMP: Fix possibility of deadlock when bringing CPUs online") and thus has reinstated the possibility of deadlock.
The commit was based on testing of kernel v4.4, where the CPU hotplug core code issued a BUG() if the starting CPU is not marked online when the boot CPU returns from __cpu_up. The commit fixes this race (in v4.4), but re-introduces the deadlock situation.
As noted in the commit message, upstream differs in this area. Commit 8df3e07e7f21f ("cpu/hotplug: Let upcoming cpu bring itself fully up") adds a completion event in the CPU hotplug core code, making this race impossible. However, people were unhappy with relying on the core code to do the right thing.
To address the issues both commits were trying to fix, add a second completion event in the MIPS smp hotplug path. It removes the possibility of a race, since the MIPS smp hotplug code now synchronises both the boot and secondary CPUs before they return to the hotplug core code. It also addresses the deadlock by ensuring that the secondary CPU is not marked online before its counters are synchronised.
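As a rough illustration of the ordering described above (not the patch itself), the handshake can be modelled in userspace with pthreads. Everything here is an assumption made for the sketch: the struct completion stand-in, the names cpu_starting, counters_synced and cpu_running, and model_cpu_up() are hypothetical, and a thread plays the part of the secondary CPU.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for a kernel completion (illustrative only). */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Hypothetical model state; names are assumptions, not kernel symbols. */
struct hotplug_model {
	struct completion cpu_starting;    /* secondary is ready to sync counters */
	struct completion counters_synced; /* stand-in for the counter sync step */
	struct completion cpu_running;     /* secondary has marked itself online */
	bool online;
};

/* Models the secondary CPU's startup path. */
static void *secondary_cpu(void *arg)
{
	struct hotplug_model *m = arg;

	complete(&m->cpu_starting);               /* first event */
	wait_for_completion(&m->counters_synced);
	assert(m->counters_synced.done);          /* synced before going online */
	m->online = true;
	complete(&m->cpu_running);                /* second event */
	return NULL;
}

/* Models the boot CPU's side; returns 0 if the secondary is online on return. */
int model_cpu_up(void)
{
	struct hotplug_model m;
	pthread_t secondary;
	bool online;

	init_completion(&m.cpu_starting);
	init_completion(&m.counters_synced);
	init_completion(&m.cpu_running);
	m.online = false;

	if (pthread_create(&secondary, NULL, secondary_cpu, &m) != 0)
		return -1;

	wait_for_completion(&m.cpu_starting);
	complete(&m.counters_synced);             /* boot CPU finishes the sync */
	wait_for_completion(&m.cpu_running);      /* do not return until online */
	online = m.online;

	pthread_join(secondary, NULL);
	return online ? 0 : -1;
}
```

In this model, model_cpu_up() cannot return before the secondary thread has set online, which mirrors the race fix, and online is only set after the counter sync stand-in has completed, which mirrors the deadlock fix.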
This fix should also be backported to fix the race condition introduced by the backport of commit 8f46cca1e6c06 ("MIPS: SMP: Fix possibility of deadlock when bringing CPUs online"), though really that race only existed before commit 8df3e07e7f21f ("cpu/hotplug: Let upcoming cpu bring itself fully up").
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Fixes: 6f542ebeaee0 ("MIPS: Fix race on setting and getting cpu_online_mask")
CC: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Cc: <stable@vger.kernel.org> # v4.1+: 8f46cca1e6c0: "MIPS: SMP: Fix possibility of deadlock when bringing CPUs online"
Cc: <stable@vger.kernel.org> # v4.1+: a00eeede507c: "MIPS: SMP: Use a completion event to signal CPU up"
Cc: <stable@vger.kernel.org> # v4.1+: 6f542ebeaee0: "MIPS: Fix race on setting and getting cpu_online_mask"
These did not apply to 3.18, so this patch overall did not apply there either.
I don't know if you care about 3.18, but if so, can you provide backports of these for that tree, and then resend this patch so I can queue it up?
thanks,
greg k-h