Adding stable and lkml.
Sorry for spamming others.
-Mukesh
On 7/23/2018 1:57 PM, Mukesh Ojha wrote:
Hi All,
I wanted to discuss a corner case that exists in the 4.9 kernel (4.9.x): hotplug of a CPU fails because of a failure in one of the callbacks that is called after "notify:online" (notify_online will already have created the sysfs nodes for the CPU being hotplugged).
During cleanup, notify_dead() does not get called, because step->skip_onerr (https://elixir.bootlin.com/linux/v4.9/ident/step) is set to true for "notify:prepare". As a result, the sysfs nodes of that CPU are not cleaned up, which can cause issues on the next hotplug attempt of that CPU.
Failure path: cpuhp_up_callbacks https://elixir.bootlin.com/linux/v4.9/ident/cpuhp_up_callbacks => cpuhp_invoke_callback https://elixir.bootlin.com/linux/v4.9/ident/cpuhp_invoke_callback => undo_cpu_up https://elixir.bootlin.com/linux/v4.9/ident/undo_cpu_up
    .name = "notify:prepare",
    .teardown.single = notify_dead,
    .skip_onerr = true,
(notify_dead: https://elixir.bootlin.com/linux/v4.9/ident/notify_dead)
I think a possible solution here could be to remove the line
-    .skip_onerr = true,
for "notify:prepare", so that the CPU_DEAD notification gets sent.
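To illustrate the rollback behaviour I mean, here is a minimal user-space sketch. It is not the kernel code: struct step, bring_up() and simulate_failed_hotplug() are hypothetical names that only model how a rollback loop like undo_cpu_up skips the teardown of steps with skip_onerr set. The three-step table and the assumption that only the "notify:prepare" teardown does the cleanup are simplifications.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's hotplug step. */
struct step {
	const char *name;
	int (*startup)(void);
	void (*teardown)(void);
	bool skip_onerr;	/* if true, teardown is skipped on rollback */
};

/* Counts how often the notify_dead-like cleanup ran. */
static int dead_calls;

static int startup_ok(void)    { return 0; }
static int startup_fail(void)  { return -1; }
static void teardown_dead(void) { dead_calls++; }
static void teardown_noop(void) { }

/*
 * Run the startup callbacks in order. On the first failure, walk back
 * over the steps that already completed and call their teardown,
 * except for steps marked skip_onerr. The failed step itself is not
 * torn down.
 */
static int bring_up(const struct step *steps, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int ret = steps[i].startup();

		if (ret) {
			while (i-- > 0) {
				if (!steps[i].skip_onerr)
					steps[i].teardown();
			}
			return ret;
		}
	}
	return 0;
}

/*
 * Simulate a hotplug attempt where a callback after "notify:online"
 * fails. Returns how many times the notify_dead-like cleanup ran.
 */
static int simulate_failed_hotplug(bool prepare_skip_onerr)
{
	const struct step steps[] = {
		{ "notify:prepare", startup_ok, teardown_dead, prepare_skip_onerr },
		{ "notify:online",  startup_ok, teardown_noop, true },
		{ "later callback", startup_fail, teardown_noop, false },
	};

	dead_calls = 0;
	bring_up(steps, sizeof(steps) / sizeof(steps[0]));
	return dead_calls;
}
```

In this model, simulate_failed_hotplug(true) returns 0 (the cleanup is skipped, as in the current behaviour), while simulate_failed_hotplug(false) returns 1 (the cleanup runs once, as it would with the flag removed).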
Please let me know if this could have any side effects; I don't see any.
Ref:
https://elixir.bootlin.com/linux/v4.9/source/kernel/cpu.c#L458
Cheers, Mukesh