Hi Drew,
On 8/22/22 4:29 PM, Gavin Shan wrote:
On 8/19/22 3:28 PM, Andrew Jones wrote:
On Fri, Aug 19, 2022 at 08:55:59AM +0800, Gavin Shan wrote:
It's assumed that 1024 host pages, instead of guest pages, are dirtied in each iteration in guest_code(). The current implementation misses the case of mismatched page sizes in host and guest. For example, ARM64 can have a 64KB page size in the guest but a 4KB page size in the host. In that case, (TEST_PAGES_PER_LOOP / 16) host pages, instead of TEST_PAGES_PER_LOOP, are dirtied in every iteration.
Fix the issue by touching all sub-pages when we have mismatched page sizes in host and guest.
I'll let the dirty-log test authors decide what's best to do for this test, but I'd think we should let the guest continue dirtying its pages without knowledge of the host pages. Then, adjust the host test code to assert all sub-pages, other than the ones it expects the guest to have written, remain untouched.
I don't think the description in the change log is correct. The current implementation already handles mismatched page sizes in vm_dirty_log_verify(), where 'step', fetched from vm_num_host_pages(mode, 1), is used for that purpose. Please ignore this patch for now, as explained below.
The issue I have is that 'dirty_log_test' hangs when I have a 4KB host page size and a 64KB guest page size. It seems the vcpu doesn't exit due to either the full ring buffer state or a kick-off. I will investigate further to figure out the root cause.
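To make the 'step' idea concrete, here is a minimal standalone sketch. It is not the selftest code: host_pages_per_guest_page() and pages_checked() are hypothetical names, and the round-up behaviour is my assumption about what vm_num_host_pages(mode, 1) yields.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the value of vm_num_host_pages(mode, 1):
 * the number of host pages covering one guest page, rounded up
 * (the rounding is an assumption, not taken from this thread).
 */
static uint64_t host_pages_per_guest_page(uint64_t guest_page_size,
					  uint64_t host_page_size)
{
	assert(guest_page_size > 0 && host_page_size > 0);
	return (guest_page_size + host_page_size - 1) / host_page_size;
}

/*
 * Hypothetical helper: count the bitmap positions visited when walking
 * host_num_pages host pages in strides of 'step'.
 */
static uint64_t pages_checked(uint64_t host_num_pages, uint64_t step)
{
	uint64_t page, checked = 0;

	for (page = 0; page < host_num_pages; page += step)
		checked++;

	return checked;
}
```

With a 64KB guest page and a 4KB host page the step comes out as 16, so only every 16th host page is examined. With the sizes swapped, the step is 1.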
[...]
Please ignore this PATCH[3/5]. I think this should be fixed by selecting the correct dirty ring count, and the fix will be folded into PATCH[5/5] in the next revision.
In dirty_log_test, we have 1GB of memory for the guest to write to and dirty. When the page sizes on host and guest are mismatched, which is either 4kb-host-64kb-guest or 64kb-host-4kb-guest apart from the 16kb case, 16384 host pages are dirtied in each iteration. The default dirty ring count is 65536, so the vcpu never exits due to the full-dirty-ring-buffer state. As a result, the guest code keeps running and the dirty log is never collected by the main thread.
#define TEST_DIRTY_RING_COUNT 65536
dirty_pages_per_iteration = (0x40000000 / 0x10000) = 0x4000 = 16384
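The arithmetic above can be checked with a small standalone sketch. The helper names are hypothetical, and the "ring can fill" condition is a simplification of the reasoning in this mail, not the exact exit condition used by KVM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value quoted from the test source in this mail. */
#define TEST_DIRTY_RING_COUNT	65536

/*
 * Hypothetical helper: number of pages dirtied by one pass over the
 * test memory, given the page size used for the calculation.
 */
static uint64_t dirty_pages_per_iteration(uint64_t mem_size,
					  uint64_t page_size)
{
	assert(page_size > 0);
	return mem_size / page_size;
}

/*
 * Hypothetical helper: true if one pass can produce enough dirty pages
 * to fill a ring of 'ring_count' entries. Simplified assumption: the
 * ring counts as full only when every entry is used.
 */
static bool ring_can_fill(uint64_t mem_size, uint64_t page_size,
			  uint64_t ring_count)
{
	return dirty_pages_per_iteration(mem_size, page_size) >= ring_count;
}
```

For 1GB of memory and a 64KB page size this gives 16384 pages, which is below TEST_DIRTY_RING_COUNT, so under this simplified model the ring never fills. With a 4KB page size the same region yields 262144 pages and the ring can fill.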
Thanks, Gavin