Re: [v3 2/5] arm64: kvm: allow EL2 context to be reset on shutdown

10 Apr 2015

      Mark
Cc: Marc, Geoff
On 04/10/2015 12:02 AM, Mark Rutland wrote:
...
On Thu, Apr 09, 2015 at 05:53:33AM +0100, AKASHI Takahiro wrote:
...
Mark,
On 04/08/2015 10:05 PM, Mark Rutland wrote:
...
On Thu, Apr 02, 2015 at 06:40:13AM +0100, AKASHI Takahiro wrote:
...
The current kvm implementation keeps the EL2 vector table installed even
when the system is shut down. This prevents kexec from putting a system
with kvm back into EL2 when starting a new kernel.
This patch resolves this issue by calling a cpu tear-down function via
reboot notifier, kvm_reboot_notify(), which is invoked by
kernel_restart_prepare() in kernel_kexec().
While kvm has a generic hook, kvm_reboot(), we can't use it here because,
under the current implementation, a cpu teardown function will not be invoked
if no guest vm has been created by kvm_create_vm().
Please note that kvm_usage_count is zero in this case.
We'd better, in the future, implement cpu hotplug support and put the
arch-specific initialization into kvm_arch_hardware_enable/disable().
This way, we would be able to revert this patch.
Why can't we use kvm_arch_hardware_enable/disable() currently?
IIUC, kvm will call kvm_arch_hardware_enable() iff a new guest is being
created *and* cpus have not been initialized yet. kvm_usage_count==0
indicates this. Similarly, kvm will call kvm_arch_hardware_disable() whenever
a guest is being terminated (i.e. kvm_usage_count != 0).
Therefore if kvm_arch_hardware_enable/disable() also handle EL2 vector table
initialization, we don't have to have any particular operations, as my patch
does, for kexec case.
(a long-term solution)
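
The counting behaviour described above can be sketched as a small userspace model. This is only an illustration, not kernel code: every `model_*` name is hypothetical, and it assumes the generic code enables the hardware when the first guest is created and disables it when the last one goes away. Under that assumption, a reboot has nothing left to tear down once no guest is alive.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: names echo the thread but none of this is kernel code. */
static int model_usage_count;     /* stands in for kvm_usage_count */
static bool model_el2_installed;  /* stands in for "EL2 vector table installed" */

/* Hypothetical stand-ins for kvm_arch_hardware_enable/disable(). */
static void model_hardware_enable(void)  { model_el2_installed = true; }
static void model_hardware_disable(void) { model_el2_installed = false; }

/* Creating a guest enables the hardware only on the 0 -> 1 transition. */
static void model_create_vm(void)
{
	if (model_usage_count++ == 0)
		model_hardware_enable();
}

/* Destroying a guest disables the hardware only on the 1 -> 0 transition. */
static void model_destroy_vm(void)
{
	assert(model_usage_count > 0);
	if (--model_usage_count == 0)
		model_hardware_disable();
}

/* In this model a reboot needs a teardown only while the vectors are installed. */
static bool model_reboot_needs_teardown(void)
{
	return model_el2_installed;
}
```

Calling model_create_vm() twice and model_destroy_vm() twice leaves the model with nothing installed, which is the property that would make a separate kexec-only teardown unnecessary.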
Since arm64 doesn't implement kvm_arch_hardware_enable() (I don't know why),
I'm trying to fix the problem by adding a minimum tear-down function, kvm_cpu_reset,
and invoking it via a reboot hook.
(an interim fix)
What I don't understand is why we can't move the init and tear-down
functions into kvm_arch_hardware_enable/disable(). They seem to be for
precisely what you are implementing, with the only difference being the
time that they are called.
I don't know either. I just followed the discussion between Marc and Geoff,
and their conclusion. I guessed that the *refactoring* might be more complicated than
expected.
FYI, I gave the kvm_arch_hardware_enable() approach a quick try by moving
cpu_init_hyp_mode() out of init_hyp_mode() and into kvm_arch_hardware_enable(),
and it seems to work, at least in my environment:
    boot => start a kvm guest => kexec reboot => start a kvm guest
...
Either I'm missing something, or we can simply implement the existing
hooks. I assume I'm missing something.
Marc, Geoff, any comments?
...
...
...
...
+static struct notifier_block kvm_reboot_nb = {
+	.notifier_call		= kvm_reboot_notify,
+	.next			= NULL,
+	.priority		= 0, /* FIXME */

It would be helpful for the comment to explain why this is wrong, and
what needs fixing.
Thanks for reminding me of this.
*priority* enforces a calling order of registered hook functions.
If some hook returns NOTIFY_STOP_MASK, subsequent hooks won't be called.
(Nevertheless, reboot sequence will go ahead. See kernel_restart_prepare()/
notifier_call_chain().)
So we should make sure that kvm_reboot_notify() is called

after any hook functions which may depend on kvm, and

Which hooks depend on KVM?
I think I answered this question below:
...
...
But how can we guarantee this and determine a priority of kvm_reboot_notify()?
Looking into all the occurrences of register_reboot_notifier(),

=> nothing
=> virt/kvm/kvm_main.c (priority: 0)
=> drivers/cpufreq/s3c2416-cpufreq.c (priority: 0)
   drivers/cpufreq/s5pv210-cpufreq.c (priority: 0)

So a priority higher than zero might be safe and better, but exactly what?
Some hooks use "INT_MAX."
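
To make the ordering concern concrete, here is a minimal userspace model of a priority-ordered notifier chain. It is a sketch under stated assumptions, not the kernel implementation: all `model_*` and `MODEL_*` names are hypothetical, and the model assumes that higher-priority entries run first, that equal priorities keep registration order, and that a return value with the stop bit set ends the walk.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Userspace model of a priority-ordered notifier chain; not kernel code. */
#define MODEL_NOTIFY_OK        0x0001
#define MODEL_NOTIFY_STOP_MASK 0x8000

struct model_notifier {
	int (*call)(struct model_notifier *nb);
	struct model_notifier *next;
	int priority;
};

/* Insert so that higher priority runs first; ties keep registration order. */
static void model_register(struct model_notifier **head,
			   struct model_notifier *nb)
{
	assert(nb != NULL && nb->call != NULL);
	while (*head != NULL && (*head)->priority >= nb->priority)
		head = &(*head)->next;
	nb->next = *head;
	*head = nb;
}

/* Walk the chain, stopping once a callback sets the stop bit.
 * Returns the number of callbacks that actually ran. */
static int model_call_chain(struct model_notifier *head)
{
	int calls = 0;

	for (struct model_notifier *nb = head; nb != NULL; nb = nb->next) {
		calls++;
		if (nb->call(nb) & MODEL_NOTIFY_STOP_MASK)
			break;
	}
	return calls;
}

static int model_kvm_ran;  /* set when the teardown callback below runs */

/* Hypothetical stand-in for kvm_reboot_notify(). */
static int model_kvm_teardown(struct model_notifier *nb)
{
	(void)nb;
	model_kvm_ran = 1;
	return MODEL_NOTIFY_OK;
}

/* Hypothetical hook that asks the chain to stop after it has run. */
static int model_stopper(struct model_notifier *nb)
{
	(void)nb;
	return MODEL_NOTIFY_STOP_MASK | 0x0002;
}
```

In this model, if model_stopper is registered first and both hooks use priority 0, the teardown callback never runs. Giving the teardown hook a priority such as INT_MAX moves it ahead of the stopping hook, which is why a value above zero looks safer here.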
Thanks,
-Takahiro AKASHI
...
...

before any hook functions which kvm may depend on, and

Which other hooks does KVM depend on?
...

before any hook functions that may return NOTIFY_STOP_MASK

I think this would be solved by using kvm_arch_hardware_enable/disable.
As far as I can tell, the VMs would be destroyed earlier (and hence KVM
disabled) before we got to the final teardown.
Thanks,
Mark.
