On Mon, Aug 22, 2022 at 10:42:15PM +0100, Marc Zyngier wrote:
Hi Gavin,
On Mon, 22 Aug 2022 02:58:20 +0100, Gavin Shan <gshan@redhat.com> wrote:
Hi Marc,
On 8/19/22 6:00 PM, Marc Zyngier wrote:
On Fri, 19 Aug 2022 01:55:57 +0100, Gavin Shan <gshan@redhat.com> wrote:
The ring-based dirty memory tracking has been available and enabled on x86 for a while. The feature is beneficial when the number of dirty pages is small in a checkpointing system or live migration scenario. More details can be found in commit fb04a1eddb1a ("KVM: X86: Implement ring-based dirty memory tracking").
This enables the ring-based dirty memory tracking on ARM64. It's notable that no extra reserved ring entries are needed on ARM64 because the huge pages are always split into base pages when page dirty tracking is enabled.
Can you please elaborate on this? Adding a per-CPU ring of course results in extra memory allocation, so there must be a subtle x86-specific detail that I'm not aware of...
Sure. I guess it's helpful to explain how it works in the next revision. Something like below:
This enables the ring-based dirty memory tracking on ARM64. The feature is enabled by CONFIG_HAVE_KVM_DIRTY_RING, detected and enabled by KVM_CAP_DIRTY_LOG_RING. A ring buffer is created for every vcpu, and each entry is described by 'struct kvm_dirty_gfn'. The host pushes an entry to the ring buffer when a page becomes dirty, and userspace pulls the entries. A vcpu exit is forced when the ring buffer becomes full. The ring buffers on all vcpus can be reset by the ioctl command KVM_RESET_DIRTY_RINGS.
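To make the push/pull/reset cycle described above concrete, here is a minimal single-threaded sketch of one per-vcpu ring. It is an illustration only, not the kernel implementation: struct dirty_ring and the helpers ring_push(), ring_harvest() and ring_reset() are hypothetical names, the entry layout and flag values are copied from the KVM UAPI so the sketch builds without kernel headers, and the memory ordering and soft limit that the real code relies on are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same values as KVM_DIRTY_GFN_F_DIRTY / KVM_DIRTY_GFN_F_RESET in the UAPI. */
#define DIRTY_GFN_F_DIRTY (1u << 0)
#define DIRTY_GFN_F_RESET (1u << 1)

/* Same layout as 'struct kvm_dirty_gfn'. */
struct dirty_gfn {
	uint32_t flags;
	uint32_t slot;   /* memslot the dirty page belongs to */
	uint64_t offset; /* page offset within that memslot */
};

/* Hypothetical model of one vcpu's ring; not the kernel's structure. */
struct dirty_ring {
	struct dirty_gfn *gfns;
	uint32_t size;        /* number of entries, must be a power of two */
	uint32_t dirty_index; /* next entry the host will fill */
	uint32_t reset_index; /* oldest entry not yet recycled */
};

/*
 * Host side: record a dirty page.
 * Returns -1 if there is no free entry, 1 if the ring is full after this
 * push (the point where the vcpu would be forced to exit), 0 otherwise.
 */
static int ring_push(struct dirty_ring *r, uint32_t slot, uint64_t offset)
{
	struct dirty_gfn *e;

	assert(r->size && (r->size & (r->size - 1)) == 0);
	if (r->dirty_index - r->reset_index == r->size)
		return -1;

	e = &r->gfns[r->dirty_index & (r->size - 1)];
	e->slot = slot;
	e->offset = offset;
	e->flags = DIRTY_GFN_F_DIRTY;
	r->dirty_index++;

	return r->dirty_index - r->reset_index == r->size;
}

/*
 * Userspace side: pull one dirty entry at *fetch_index and mark it as
 * harvested. Returns 1 if an entry was copied to *out, 0 if none is ready.
 */
static int ring_harvest(struct dirty_ring *r, uint32_t *fetch_index,
			struct dirty_gfn *out)
{
	struct dirty_gfn *e = &r->gfns[*fetch_index & (r->size - 1)];

	if (!(e->flags & DIRTY_GFN_F_DIRTY))
		return 0;

	*out = *e;
	e->flags = DIRTY_GFN_F_RESET;
	(*fetch_index)++;
	return 1;
}

/*
 * Models the effect of KVM_RESET_DIRTY_RINGS on one ring: recycle every
 * harvested entry, in order. Returns the number of entries recycled.
 */
static uint32_t ring_reset(struct dirty_ring *r)
{
	uint32_t count = 0;

	while (r->reset_index != r->dirty_index) {
		struct dirty_gfn *e = &r->gfns[r->reset_index & (r->size - 1)];

		if (!(e->flags & DIRTY_GFN_F_RESET))
			break;
		e->flags = 0;
		r->reset_index++;
		count++;
	}
	return count;
}
```

In this model, a caller would push until ring_push() reports a full ring, harvest entries through its own fetch index, and then call ring_reset() to hand the harvested entries back to the producer. Because each vcpu owns its ring, the producer side needs no locking against other vcpus.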
Yes, I think so. Adding a per-CPU ring results in extra memory allocation. However, it avoids synchronization among vcpus when several of them dirty pages at the same time. More discussion can be found in [1].
Oh, I totally buy the relaxation of the synchronisation (though I doubt this will have any visible effect until we have something like Oliver's patches to allow parallel faulting).
Heh, yeah I need to get that out the door. I'll also note that Gavin's changes are still relevant without that series, as we do write unprotect in parallel at PTE granularity after commit f783ef1c0e82 ("KVM: arm64: Add fast path to handle permission relaxation during dirty logging").
-- Thanks, Oliver