On Thu, Oct 03, 2024, Ackerley Tng wrote:
Ackerley Tng ackerleytng@google.com writes:
Elliot Berman quic_eberman@quicinc.com writes:
From x86 CoCo perspective, I think it also makes sense to not zero the folio when changing faultiblity from private to shared:
- If guest is sharing some data with host, you've wiped the data and guest has to copy again.
- Or, if SEV/TDX enforces that page is zero'd between transitions, Linux has duplicated the work that trusted entity has already done.
Fuad and I can help add some details for the conversion. Hopefully we can figure out some of the plan at plumbers this week.
Zeroing the page prevents leaking host data (see function docstring for kvm_gmem_prepare_folio() introduced in [1]), so we definitely don't want to introduce a kernel data leak bug here.
Actually it seems like filemap_grab_folio() already gets a zeroed page.
filemap_grab_folio() eventually calls __alloc_pages_noprof() -> get_page_from_freelist() -> prep_new_page() -> post_alloc_hook()
and post_alloc_hook() calls kernel_init_pages(), which zeroes the page, depending on kernel config.
Paolo, was calling clear_highpage() in kvm_gmem_prepare_folio() zeroing an already empty page returned from filemap_grab_folio()?
Yes and no. CONFIG_INIT_ON_ALLOC_DEFAULT_ON and init_on_alloc are very much hardening features, not functional behavior that other code _needs_ to be aware of. E.g. enabling init-on-alloc comes with a measurable performance cost.
Ignoring hardening, the guest_memfd mapping specifically sets the gfp_mask to GFP_HIGHUSER, i.e. doesn't set __GFP_ZERO.
That said, I wouldn't be opposed to skipping the clear_highpage() call when want_init_on_alloc() is true.
Also, the intended behavior (or at least, what intended) of kvm_gmem_prepare_folio() was it would do clear_highpage() if and only if a trusted entity does NOT zero the page. Factoring that in is a bit harder, as it probably requires another arch hook (or providing an out-param from kvm_arch_gmem_prepare()). I.e. the want_init_on_alloc() case isn't the only time KVM could shave cycles by not redundantly zeroing memory.