From: Zhong, Yang yang.zhong@intel.com Sent: Wednesday, December 29, 2021 9:13 PM
We highly appreciate your review. This version addresses most of the comments from Sean. Most comments are adopted, except three that remain open and need more discussion:
- Move the entire xfd write emulation code to x86.c. Doing so requires introducing a new kvm_x86_ops callback to disable the msr write bitmap. According to Paolo's earlier comment, he prefers to handle it in vmx.c.
- Directly check msr_bitmap in update_exception_bitmap() (for trapping #NM) and vcpu_enter_guest() (for syncing guest xfd after vm-exit) instead of introducing an extra flag in the last patch. However, doing so requires another new kvm_x86_ops callback for checking msr_bitmap, since vcpu_enter_guest() is x86 common code. Having an extra flag sounds simpler here (at least for the initial AMX support). It does penalize a nested guest with one xfd sync per exit, but that is no worse than a normal guest which initializes xfd but doesn't run AMX applications at all. This could be improved afterwards.
Another option is to move the xfd sync into vmx_vcpu_run(). Disabling xfd interception is vmx-specific code, so it makes some sense to also handle the related side effect in vmx.c. If this can be agreed on, then there is no need for an extra flag, and just checking msr_bitmap is sufficient (and more accurate).
Paolo, what is your opinion?
- Disable the #NM trap for a nested guest. This version still chooses to always trap #NM (regardless of whether it is raised in L1 or L2) as long as xfd write interception is disabled. In practice, #NM is rare if the nested guest doesn't intend to run AMX applications, and always trapping is safer than dynamic trapping for the basic support in case of any oversight here.
Sean just pointed out some potential issues in the current logic for handling #NM raised in an L2 guest (which could happen just due to L1 interception).
It's being discussed here:
https://lore.kernel.org/all/YcyaN7V4wwGI7wDV@google.com/
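To make the trade-off in the second open item more concrete, here is a small standalone C model. It is not kernel code: struct model_vcpu, need_xfd_sync(), model_vm_exit() and count_syncs() are hypothetical names invented for illustration, and the model assumes that while a nested guest runs, the active msr bitmap may still intercept xfd writes even though the extra flag stays set. Under that assumption, the flag-based check syncs xfd on every exit, while the bitmap-based check syncs only when interception is really disabled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a vcpu; none of these fields are real KVM fields. */
struct model_vcpu {
    bool xfd_write_intercepted; /* stands in for the active msr bitmap state */
    bool xfd_passthrough_flag;  /* the "extra flag", set once and left set */
    uint64_t hw_xfd;            /* xfd value the guest may have written */
    uint64_t cached_xfd;        /* software copy kept by the hypervisor */
    unsigned int sync_count;    /* number of syncs performed so far */
};

enum sync_policy {
    SYNC_ON_FLAG,   /* option A: check the extra flag */
    SYNC_ON_BITMAP  /* option B: check the msr bitmap directly */
};

/* Decide whether the software copy of xfd must be refreshed after an exit. */
static bool need_xfd_sync(const struct model_vcpu *v, enum sync_policy p)
{
    if (p == SYNC_ON_FLAG)
        return v->xfd_passthrough_flag;
    return !v->xfd_write_intercepted;
}

/* Model one vm-exit: refresh the cached value if the policy asks for it. */
static void model_vm_exit(struct model_vcpu *v, enum sync_policy p)
{
    if (need_xfd_sync(v, p)) {
        v->cached_xfd = v->hw_xfd;
        v->sync_count++;
    }
}

/* Count how many syncs a policy performs over a number of exits. */
static unsigned int count_syncs(enum sync_policy p,
                                bool active_bitmap_intercepts,
                                unsigned int exits)
{
    struct model_vcpu v = {
        .xfd_write_intercepted = active_bitmap_intercepts,
        .xfd_passthrough_flag = true, /* assumed: set earlier for L1 */
    };

    for (unsigned int i = 0; i < exits; i++)
        model_vm_exit(&v, p);
    return v.sync_count;
}
```

In this model, count_syncs(SYNC_ON_FLAG, true, n) performs n syncs, whereas count_syncs(SYNC_ON_BITMAP, true, n) performs none. That is the accuracy difference mentioned above, at the cost of needing access to the bitmap from the place where the check is made.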
Thanks,
Kevin