On Fri, Aug 16, 2013 at 10:48:39PM +0530, Anup Patel wrote:
On 16 August 2013 22:44, Christoffer Dall <christoffer.dall@linaro.org> wrote:
On Fri, Aug 16, 2013 at 10:32:30AM +0530, Anup Patel wrote:
On Thu, Aug 15, 2013 at 10:23 PM, Christoffer Dall <christoffer.dall@linaro.org> wrote:
On Thu, Aug 15, 2013 at 04:37:27PM +0100, Marc Zyngier wrote:
On 2013-08-15 16:13, Anup Patel wrote:
Hi Marc,
On Thu, Aug 15, 2013 at 8:17 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
> On 2013-08-15 14:31, Anup Patel wrote:
>>
>> On Thu, Aug 15, 2013 at 2:01 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>
>>> On 2013-08-15 07:26, Anup Patel wrote:
>>>>
>>>> Hi Marc,
>>>>
>>>> On Thu, Aug 15, 2013 at 10:22 AM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>>
>>>>> Hi Anup,
>>>>>
>>>>> On 2013-08-14 15:22, Anup Patel wrote:
>>>>>>
>>>>>> On Wed, Aug 14, 2013 at 5:34 PM, Marc Zyngier <marc.zyngier@arm.com> wrote:
>>>>>>>
>>>>>>> Hi Pranav,
>>>>>>>
>>>>>>> On 2013-08-14 12:47, Pranavkumar Sawargaonkar wrote:
>>>>>>>>
>>>>>>>> Systems with a large external L3-cache (a few MBs) might have
>>>>>>>> dirty content belonging to the guest page in the L3-cache. To
>>>>>>>> tackle this, we need to flush such dirty content from the
>>>>>>>> d-cache so that the guest will see correct contents of a guest
>>>>>>>> page when the guest MMU is disabled.
>>>>>>>>
>>>>>>>> The patch fixes coherent_icache_guest_page() for external
>>>>>>>> L3-cache.
>>>>>>>>
>>>>>>>> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
>>>>>>>> Signed-off-by: Anup Patel <anup.patel@linaro.org>
>>>>>>>> ---
>>>>>>>>  arch/arm64/include/asm/kvm_mmu.h | 2 ++
>>>>>>>>  1 file changed, 2 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>>>>>>>> index efe609c..5129038 100644
>>>>>>>> --- a/arch/arm64/include/asm/kvm_mmu.h
>>>>>>>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>>>>>>>> @@ -123,6 +123,8 @@ static inline void coherent_icache_guest_page(struct kvm *kvm, gfn_t gfn)
>>>>>>>>  	if (!icache_is_aliasing()) {	/* PIPT */
>>>>>>>>  		unsigned long hva = gfn_to_hva(kvm, gfn);
>>>>>>>>  		flush_icache_range(hva, hva + PAGE_SIZE);
>>>>>>>> +		/* Flush d-cache for systems with external caches. */
>>>>>>>> +		__flush_dcache_area((void *) hva, PAGE_SIZE);
>>>>>>>>  	} else if (!icache_is_aivivt()) {	/* non ASID-tagged VIVT */
>>>>>>>>  		/* any kind of VIPT cache */
>>>>>>>>  		__flush_icache_all();
>>>>>>>
>>>>>>> [adding Will to the discussion as we talked about this in the past]
>>>>>>>
>>>>>>> That's definitely an issue, but I'm not sure the fix is to hit
>>>>>>> the data cache on each page mapping. It looks overkill.
>>>>>>>
>>>>>>> Wouldn't it be enough to let userspace do the cache cleaning?
>>>>>>> kvmtools knows which bits of the guest memory have been touched,
>>>>>>> and can do a "DC DVAC" on this region.
>>>>>>
>>>>>> It seems a bit unnatural to have cache cleaning in user-space. I
>>>>>> am sure other architectures don't have such cache cleaning in
>>>>>> user-space for KVM.
>>>>>>
>>>>>>> The alternative is to do it in the kernel before running any vcpu
>>>>>>> - but that's not very nice either (we have to clean the whole of
>>>>>>> the guest memory, which makes a full dcache clean more appealing).
>>>>>>
>>>>>> Actually, cleaning the full d-cache by set/way upon first run of a
>>>>>> VCPU was our second option, but the current approach seemed very
>>>>>> simple, hence we went for this.
>>>>>>
>>>>>> If more people vote for a full d-cache clean upon first run of a
>>>>>> VCPU, then we should revise this patch.
>>>>>
>>>>> Can you please give the attached patch a spin on your HW? I've
>>>>> boot-tested it on a model, but of course I can't really verify that
>>>>> it fixes your issue.
>>>>>
>>>>> As far as I can see, it should fix it without any additional
>>>>> flushing.
>>>>>
>>>>> Please let me know how it goes.
>>>>
>>>> HCR_EL2.DC=1 means all memory accesses with the Stage-1 MMU off are
>>>> treated as "Normal Non-shareable, Inner Write-Back Write-Allocate,
>>>> Outer Write-Back Write-Allocate memory".
>>>>
>>>> HCR_EL2.DC=0 means all memory accesses with the Stage-1 MMU off are
>>>> treated as "Strongly-ordered device memory".
>>>>
>>>> Now if the Guest/VM accesses hardware MMIO devices directly (such as
>>>> the VGIC CPU interface) with the MMU off, then MMIO devices will be
>>>> accessed as normal memory when HCR_EL2.DC=1.
>>>
>>> I don't think so. Stage-2 still applies, and should force MMIO to be
>>> accessed as device memory.
>>>
>>>> HCR_EL2.DC=1 makes sense only if we have all software-emulated
>>>> devices for the Guest/VM, which is not true for KVM ARM or KVM ARM64
>>>> because we use the VGIC.
>>>>
>>>> IMHO, this patch enforces an incorrect memory attribute for the
>>>> Guest/VM when the Stage-1 MMU is off.
>>>
>>> See above. My understanding is that HCR.DC controls the default
>>> output of Stage-1, and Stage-2 overrides still apply.
>>
>> You had mentioned that the PAGE_S2_DEVICE attribute was redundant and
>> wanted the guest to decide the memory attribute. In other words, you
>> did not want to enforce any memory attribute in Stage-2.
>>
>> Please refer to this patch:
>> https://patchwork.kernel.org/patch/2543201/
>
> This patch has never been merged. If you carry on following the
> discussion, you will certainly notice it was dropped for a very good
> reason:
> https://lists.cs.columbia.edu/pipermail/kvmarm/2013-May/005827.html
>
> So Stage-2 memory attributes are used, they are not going away, and
> they are essential to the patch I sent this morning.
>
> M.
> --
> Fast, cheap, reliable. Pick two.
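For reference, the "full d-cache clean upon first run of VCPU" alternative discussed above might look roughly like the sketch below. This is only an illustration: kvm_vcpu_first_run_init() and the has_run_once flag mirror the 2013-era arch/arm/kvm/arm.c structure, but the flush placement is an assumption, not code from any posted patch.

/*
 * Illustrative sketch only: clean+invalidate the whole D-cache by
 * set/way once, before the first VCPU ever runs.
 */
static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
{
	if (likely(vcpu->arch.has_run_once))
		return 0;

	vcpu->arch.has_run_once = true;

	/*
	 * flush_cache_all() performs D-cache clean+invalidate by
	 * set/way on arm64 of this era. Note that set/way operations
	 * are not architecturally guaranteed to affect an external
	 * system-level cache, which is the very cache at issue here.
	 */
	flush_cache_all();

	return 0;
}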
HCR_EL2.DC=1 will break the Guest/VM bootloader/firmware if the Guest/VM is given a DMA-capable device in pass-through mode. The reason is that bootloader/firmware typically does not enable the MMU, and so it will program a pass-through DMA-capable device without doing any cache flushes on guest RAM (because the MMU is not enabled).
A good example of such a device would be a SATA AHCI controller given to a Guest/VM as direct access (using the System MMU), with the Guest bootloader/firmware accessing this SATA AHCI controller to load kernel images from a SATA disk. In this situation, we would have to hack the Guest bootloader/firmware AHCI driver to explicitly flush writes to Guest RAM (because HCR_EL2.DC=1).
OK. So *that* looks like a valid argument against HCR_EL2.DC==1. Out of curiosity: is that a made-up example or something you actually have?
Back to square one: Can you please benchmark the various cache cleaning options (global at startup time, per-page on S2 translation fault, and user-space)?
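For the user-space option, a minimal sketch of what kvmtool could do is below. It assumes the kernel has set SCTLR_EL1.UCI (and UCT), so that EL0 may read CTR_EL0 and issue DC CVAC (clean to the point of coherency); clean_dcache_to_poc() is an illustrative name, not existing kvmtool code.

#include <stdint.h>
#include <stddef.h>

static void clean_dcache_to_poc(void *addr, size_t size)
{
	uint64_t ctr;
	uintptr_t p, end = (uintptr_t)addr + size;
	size_t line;

	/* CTR_EL0.DminLine = log2(words) of the smallest D-cache line */
	asm volatile("mrs %0, ctr_el0" : "=r" (ctr));
	line = 4UL << ((ctr >> 16) & 0xf);

	/* Clean each line of the region to the point of coherency */
	for (p = (uintptr_t)addr & ~(line - 1); p < end; p += line)
		asm volatile("dc cvac, %0" : : "r" (p) : "memory");

	asm volatile("dsb sy" : : : "memory");
}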
Eh, why is this a more valid argument than the vgic? The device passthrough Stage-2 mappings would still have the Stage-2 memory attributes to configure that memory region as device memory. Why is it relevant if the device is DMA-capable in this context?
Most ARM bootloaders/firmware run with the MMU off; hence, they will not do explicit cache flushes when programming a DMA-capable device. Now, if HCR_EL2.DC=1, then Guest RAM updates done by the bootloader/firmware will not be visible to the DMA-capable device.
Read my question again: the bootloaders running with the MMU off in a VM only disable the MMU for Stage-1 translations. Stage-2 translations are still enforced using the Stage-2 page tables, and their attributes for all mappings to devices will still enforce strongly-ordered device-type memory.
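To make that concrete, here is a paraphrase of the 2013-era arm64 headers (exact flag spellings may differ): Stage-2 MMIO mappings carry their own Device memory attribute, and when both stages translate, the resulting memory type is the more restrictive of the two, so Device wins even when HCR_EL2.DC=1 makes Stage-1-off accesses default to Normal Write-Back.

/* Paraphrased sketch of the 2013-era arm64 definitions: */
#define MT_S2_DEVICE_nGnRE	1	/* Stage-2 MemAttr[3:0] encoding */
#define PAGE_S2_DEVICE \
	__pgprot(PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_DEVICE_nGnRE) | PTE_UXN)
/*
 * Any Stage-2 mapping installed for MMIO uses PAGE_S2_DEVICE, so the
 * Device-nGnRE attribute is enforced regardless of the guest's
 * Stage-1 state or the HCR_EL2.DC setting.
 */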
Please read my reply again. Also, try reading up on the SATA AHCI spec.
Can you please explain how the specs relate to my question?
The only case I can see is if the guest puts data in a DMA region that it expects the device to read. Is this the case?
Just so we're clear, this is quite different from programming the device through the device memory.
-Christoffer