On Fri, Jul 25, 2025 at 03:54:10PM -0700, Jiaqi Yan wrote:
On Sat, Jul 19, 2025 at 2:24 PM Jiaqi Yan <jiaqiyan@google.com> wrote:
On Sat, Jul 12, 2025 at 12:57 PM Oliver Upton <oliver.upton@linux.dev> wrote:
On Fri, Jul 11, 2025 at 04:59:11PM -0700, Jiaqi Yan wrote:
- Add some detail about FEAT_RAS where we may still exit to userspace for host-controlled memory, as we cannot differentiate between a stage-1 or stage-2 TTW SEA when taken on the descriptor PA
Ah, IIUC, you are saying that even if the FSC tells us the fault is on a TTW (esr_fsc_is_secc_ttw or esr_fsc_is_sea_ttw), it can be on either the guest stage-1's or the stage-2's descriptor PA, and we can't tell which is which.
However, if ESR_ELx_S1PTW is set, we can tell this is a sub-case of the stage-2 descriptor PA: the descriptors are used on behalf of a stage-1 PTW, but they are stage-2 memory.
Is my current understanding right?
Yep, that's exactly what I'm getting at. As you note, stage-2 aborts during a stage-1 walk are sufficiently described, but not much else.
Got it, thanks!
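To make the distinction above concrete, here is a minimal userspace sketch in C of the classification being discussed. It is not code from the patch: the `sketch_` names are hypothetical, and the bit positions (FSC in bits [5:0], S1PTW at bit 7, TTW SEA codes 0x13 through 0x17) are my reading of the ESR_ELx ISS encoding and should be checked against arch/arm64/include/asm/esr.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ISS layout for aborts; hypothetical names, not kernel macros. */
#define SKETCH_ESR_FSC_MASK   0x3FULL      /* fault status code, bits [5:0] */
#define SKETCH_ESR_S1PTW      (1ULL << 7)  /* stage-2 fault on a stage-1 walk */
#define SKETCH_FSC_SEA_TTW_LO 0x13ULL      /* SEA on TTW, level -1 */
#define SKETCH_FSC_SEA_TTW_HI 0x17ULL      /* SEA on TTW, level 3 */

enum sketch_sea_origin {
	SKETCH_SEA_NOT_TTW,       /* abort was not taken on a table walk */
	SKETCH_SEA_TTW_STAGE2,    /* S1PTW set: known stage-2 descriptor access */
	SKETCH_SEA_TTW_AMBIGUOUS, /* TTW code alone: stage-1 or stage-2 */
};

/* Classify a synchronous external abort from its syndrome value. */
static enum sketch_sea_origin sketch_classify_sea(uint64_t esr)
{
	uint64_t fsc = esr & SKETCH_ESR_FSC_MASK;

	if (fsc < SKETCH_FSC_SEA_TTW_LO || fsc > SKETCH_FSC_SEA_TTW_HI)
		return SKETCH_SEA_NOT_TTW;
	if (esr & SKETCH_ESR_S1PTW)
		return SKETCH_SEA_TTW_STAGE2;
	return SKETCH_SEA_TTW_AMBIGUOUS;
}

/* Small usage example covering the three outcomes. */
static void sketch_classify_selftest(void)
{
	/* 0x10: SEA on a memory access, not on a table walk. */
	assert(sketch_classify_sea(0x10) == SKETCH_SEA_NOT_TTW);
	/* 0x15: SEA on a level 1 walk, S1PTW clear. */
	assert(sketch_classify_sea(0x15) == SKETCH_SEA_TTW_AMBIGUOUS);
	/* Same FSC with S1PTW set. */
	assert(sketch_classify_sea(0x15 | SKETCH_ESR_S1PTW) ==
	       SKETCH_SEA_TTW_STAGE2);
}
```

Only the S1PTW case is unambiguous, which is why the ambiguous case may still end up exiting to userspace.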
+/*
+ * Returns true if the SEA should be handled locally within KVM if the abort is
+ * caused by a kernel memory allocation (e.g. stage-2 table memory).
+ */
+static bool host_owns_sea(struct kvm_vcpu *vcpu, u64 esr)
+{
/*
* Without FEAT_RAS HCR_EL2.TEA is RES0, meaning any external abort
* taken from a guest EL to EL2 is due to a host-imposed access (e.g.
* stage-2 PTW).
*/
if (!cpus_have_final_cap(ARM64_HAS_RAS_EXTN))
return true;
/* KVM owns the VNCR when the vCPU isn't in a nested context. */
if (is_hyp_ctxt(vcpu) && (esr & ESR_ELx_VNCR))
return true;
/*
* Determining if an external abort during a table walk happened at
* stage-2 is only possible when S1PTW is set. Otherwise, since KVM
* sets HCR_EL2.TEA, SEAs due to a stage-1 walk (i.e. accessing the PA
* of the stage-1 descriptor) can reach here and are reported with a
* TTW ESR value.
*/
return esr_fsc_is_sea_ttw(esr) && (esr & ESR_ELx_S1PTW);
Should we also include esr_fsc_is_secc_ttw? Something like (esr_fsc_is_sea_ttw(esr) || esr_fsc_is_secc_ttw(esr)) && (esr & ESR_ELx_S1PTW)
Parity / ECC errors are not permitted if FEAT_RAS is implemented (which is tested for up front).
Ah, thanks for pointing this out.
+}
int kvm_handle_guest_sea(struct kvm_vcpu *vcpu) {
u64 esr = kvm_vcpu_get_esr(vcpu);
struct kvm_run *run = vcpu->run;
struct kvm *kvm = vcpu->kvm;
u64 esr_mask = ESR_ELx_EC_MASK |
ESR_ELx_FnV |
ESR_ELx_EA |
ESR_ELx_CM |
ESR_ELx_WNR |
ESR_ELx_FSC;
Did you exclude ESR_ELx_IL on purpose (and if so, why)?
Unintended :)
Will add into my patch.
BTW, if my previous statement about TTW SEA is correct, then I also understand why we need to explicitly exclude ESR_ELx_S1PTW.
Right, we shouldn't be exposing genuine stage-2 external aborts to userspace.
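For illustration, a minimal userspace sketch in C of the masking discussed above, with ESR_ELx_IL added to the mask and S1PTW left out so it never reaches userspace. The `sk_` names are hypothetical, and the bit positions are my reading of the ESR_ELx layout rather than values copied from the patch, so treat them as assumptions to verify against arch/arm64/include/asm/esr.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ESR_ELx field positions; hypothetical names, not kernel macros. */
#define SK_ESR_EC_MASK (0x3FULL << 26) /* exception class, bits [31:26] */
#define SK_ESR_IL      (1ULL << 25)    /* instruction length */
#define SK_ESR_FNV     (1ULL << 10)    /* FAR not valid */
#define SK_ESR_EA      (1ULL << 9)     /* external abort type */
#define SK_ESR_CM      (1ULL << 8)     /* cache maintenance */
#define SK_ESR_S1PTW   (1ULL << 7)     /* deliberately NOT in the mask */
#define SK_ESR_WNR     (1ULL << 6)     /* write, not read */
#define SK_ESR_FSC     0x3FULL         /* fault status code, bits [5:0] */

/* Keep only the syndrome fields that are meant to be visible to userspace. */
static uint64_t sk_sanitize_esr(uint64_t esr)
{
	const uint64_t mask = SK_ESR_EC_MASK | SK_ESR_IL | SK_ESR_FNV |
			      SK_ESR_EA | SK_ESR_CM | SK_ESR_WNR | SK_ESR_FSC;

	return esr & mask;
}

/* Usage example: a data abort syndrome with S1PTW set loses only that bit. */
static void sk_sanitize_selftest(void)
{
	uint64_t raw = (0x24ULL << 26) | SK_ESR_IL | SK_ESR_S1PTW | 0x10;
	uint64_t out = sk_sanitize_esr(raw);

	assert(!(out & SK_ESR_S1PTW));
	assert(out == (raw & ~SK_ESR_S1PTW));
}
```

An allow-list mask like this fails closed: any syndrome bit not named explicitly is dropped rather than leaked.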
u64 ipa;
/*
 * Give APEI the opportunity to claim the abort before handling it
 * within KVM. apei_claim_sea() expects to be called with IRQs
@@ -1824,7 +1864,32 @@ int kvm_handle_guest_sea(struct kvm_vcpu *vcpu)
if (apei_claim_sea(NULL) == 0)
I assume KVM should still call lockdep_assert_irqs_enabled(), right? That is, a WARN_ON_ONCE is still useful just in case?
Ah, this is diffed against my VNCR prefix which has this context. Yes, I want to preserve the lockdep assertion.
Thanks for sharing the patch! Should I wait for you to send it and queue it to kvmarm/next, then rebase my v3 onto it? Or should I insert it into my v3 patch series with you as the commit author and with your Signed-off-by?
Friendly ping for this question: my v3 is ready, but I want to confirm the best option here.
Recently we found that even the newer ARM64 platforms used by our org have to rely on KVM to handle SEA more gracefully (lacking support from APEI), so we would really like to work with upstream to lock down the proposed approach/UAPI asap.
Posted the VNCR fix which I plan on taking in 6.17. Feel free to rebase your work on top of kvmarm-6.17 or -rc1 when it comes out.
https://lore.kernel.org/kvmarm/20250729182342.3281742-1-oliver.upton@linux.d...
Thanks, Oliver