On Thu, Dec 08, 2022, Ricardo Koller wrote:
On Thu, Dec 08, 2022 at 12:37:23AM +0000, Oliver Upton wrote:
On Thu, Dec 08, 2022 at 12:24:20AM +0000, Sean Christopherson wrote:
Even still, that's just a kludge to make ucalls work. We have other MMIO devices (GIC distributor, for example) that work by chance since nothing conflicts with the constant GPAs we've selected in the tests.
I'd rather we go down the route of having an address allocator for both the VA and PA spaces to provide carveouts at runtime.
Aren't those two separate issues? The PA, a.k.a. memslots space, can be solved by allocating a dedicated memslot, i.e. doesn't need a carveout. At worst, collisions will yield very explicit asserts, which IMO is better than whatever might go wrong with a carveout.
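For the memslot approach, a rough sketch in the selftests style might look like the following; the GPA, slot number, and helper name are made up for illustration, and the point is simply that a second region registered at the same GPA trips the selftests' existing overlap assert instead of failing silently.

#include "kvm_util.h"

/*
 * Hypothetical: claim a dedicated memslot for the ucall data so that any
 * later memslot registered at the same GPA hits the explicit overlap
 * assert in vm_userspace_mem_region_add().  The GPA and slot values are
 * arbitrary picks for this example.
 */
#define UCALL_PRIVATE_GPA	0xc0000000ul
#define UCALL_PRIVATE_SLOT	10

static void ucall_reserve_slot(struct kvm_vm *vm)
{
	vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS,
				    UCALL_PRIVATE_GPA, UCALL_PRIVATE_SLOT,
				    1 /* npages */, 0 /* flags */);
}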
Perhaps the use of the term 'carveout' wasn't right here.
What I'm suggesting is we cannot rely on KVM memslots alone to act as an allocator for the PA space. KVM can provide devices to the guest that aren't represented as memslots. If we're trying to fix PA allocations anyway, why not make it generic enough to suit the needs of things beyond ucalls?
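Purely as an illustration of what a generic PA-space allocator could look like (none of these names exist in the selftests; this is a toy sketch), the idea would be to pre-reserve the regions KVM itself exposes to the guest, e.g. the GIC distributor frame, and then hand out the remaining ranges on demand:

#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative only: a toy PA-space allocator that tracks reserved
 * ranges (memslots *and* KVM-provided devices such as the GIC
 * distributor) and hands out free guest-physical ranges on request.
 */
struct pa_range {
	uint64_t base;
	uint64_t size;
};

struct pa_allocator {
	struct pa_range reserved[64];
	int nr_reserved;
	uint64_t limit;		/* top of usable guest PA space */
};

static bool pa_overlaps(struct pa_range *r, uint64_t base, uint64_t size)
{
	return base < r->base + r->size && r->base < base + size;
}

/* Mark a range as off-limits, e.g. the GIC distributor frame. */
static void pa_reserve(struct pa_allocator *a, uint64_t base, uint64_t size)
{
	a->reserved[a->nr_reserved++] = (struct pa_range){ base, size };
}

/* Find a free, size-aligned range below the PA limit. */
static uint64_t pa_alloc(struct pa_allocator *a, uint64_t size)
{
	uint64_t base;
	int i;

	for (base = 0; base + size <= a->limit; base += size) {
		for (i = 0; i < a->nr_reserved; i++)
			if (pa_overlaps(&a->reserved[i], base, size))
				break;
		if (i == a->nr_reserved) {
			pa_reserve(a, base, size);
			return base;
		}
	}
	return -1ull;	/* out of PA space */
}

Memslot registrations and device frames would both funnel through pa_reserve(), so collisions become visible at allocation time instead of showing up as mysterious behavior at run time.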
One extra bit of information: on arm, IO is any access to an address (within bounds) that is not backed by a memslot. Not the same as x86, where MMIO accesses are writes to read-only memslots. No idea what other arches do.
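For reference, the read-only memslot mechanism mentioned here for x86 is the KVM_MEM_READONLY flag: guest reads hit the backing memory, while guest writes exit to userspace as MMIO. A bare-bones sketch from the VMM side (the GPA, size, and slot number are made up for illustration, and error handling is omitted):

#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/*
 * Illustrative sketch: back a page with a read-only memslot so that
 * guest reads hit RAM but guest writes exit to userspace as
 * KVM_EXIT_MMIO.
 */
static void map_readonly_slot(int vm_fd, __u64 gpa)
{
	void *backing = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
			     MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	struct kvm_userspace_memory_region region = {
		.slot = 1,
		.flags = KVM_MEM_READONLY,
		.guest_phys_addr = gpa,
		.memory_size = 4096,
		.userspace_addr = (__u64)backing,
	};

	ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}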
I don't think the arm side of that is correct; doesn't this code turn a write abort on a RO memslot into an io_mem_abort()? Specifically, the "(write_fault && !writable)" check will match, and assuming none of the edge cases in the if-statement fire, KVM will send the write down io_mem_abort():
gfn = fault_ipa >> PAGE_SHIFT;
memslot = gfn_to_memslot(vcpu->kvm, gfn);
hva = gfn_to_hva_memslot_prot(memslot, gfn, &writable);
write_fault = kvm_is_write_fault(vcpu);
if (kvm_is_error_hva(hva) || (write_fault && !writable)) {
	/*
	 * The guest has put either its instructions or its page-tables
	 * somewhere it shouldn't have. Userspace won't be able to do
	 * anything about this (there's no syndrome for a start), so
	 * re-inject the abort back into the guest.
	 */
	if (is_iabt) {
		ret = -ENOEXEC;
		goto out;
	}

	if (kvm_vcpu_abt_iss1tw(vcpu)) {
		kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
		ret = 1;
		goto out_unlock;
	}

	/*
	 * Check for a cache maintenance operation. Since we
	 * ended-up here, we know it is outside of any memory
	 * slot. But we can't find out if that is for a device,
	 * or if the guest is just being stupid. The only thing
	 * we know for sure is that this range cannot be cached.
	 *
	 * So let's assume that the guest is just being
	 * cautious, and skip the instruction.
	 */
	if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) {
		kvm_incr_pc(vcpu);
		ret = 1;
		goto out_unlock;
	}

	/*
	 * The IPA is reported as [MAX:12], so we need to
	 * complement it with the bottom 12 bits from the
	 * faulting VA. This is always 12 bits, irrespective
	 * of the page size.
	 */
	fault_ipa |= kvm_vcpu_get_hfar(vcpu) & ((1 << 12) - 1);
	ret = io_mem_abort(vcpu, fault_ipa);
	goto out_unlock;
}
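And once io_mem_abort() finds no in-kernel device at that IPA, the write comes back out of KVM_RUN as a plain MMIO exit, which is what the ucall machinery would key off of. A minimal sketch of the userspace side (the handler name is hypothetical):

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Hypothetical handler provided elsewhere by the test/VMM. */
static void handle_guest_mmio_write(__u64 gpa, const void *data, __u32 len)
{
	/* Decode the ucall or device write here. */
}

static void vcpu_run_once(int vcpu_fd, struct kvm_run *run)
{
	ioctl(vcpu_fd, KVM_RUN, 0);

	if (run->exit_reason == KVM_EXIT_MMIO && run->mmio.is_write) {
		/* run->mmio.phys_addr is the faulting GPA, e.g. the ucall page. */
		handle_guest_mmio_write(run->mmio.phys_addr,
					run->mmio.data, run->mmio.len);
	}
}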