From: Quentin Perret <qperret@google.com>
commit 43c1ff8b75011bc3e3e923adf31ba815864a2494 upstream.
Memory regions marked as "no-map" in the host device-tree routinely include TrustZone carve-outs and DMA pools. Although donating such pages to the hypervisor may not breach confidentiality, it could be used to corrupt its state in uncontrollable ways. To prevent this, let's block host-initiated memory transitions targeting "no-map" pages altogether in nVHE protected mode as there should be no valid reason to do this in current operation.
Thankfully, the pKVM EL2 hypervisor has a full copy of the host's list of memblock regions, so we can easily check for the presence of the MEMBLOCK_NOMAP flag on a region containing pages being donated from the host.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-8-will@kernel.org
[ bp: clean ]
Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
---
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 07f9dc9848ef..0f6c053686c7 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -195,7 +195,7 @@ struct kvm_mem_range {
 	u64 end;
 };
 
-static bool find_mem_range(phys_addr_t addr, struct kvm_mem_range *range)
+static struct memblock_region *find_mem_range(phys_addr_t addr, struct kvm_mem_range *range)
 {
 	int cur, left = 0, right = hyp_memblock_nr;
 	struct memblock_region *reg;
@@ -218,18 +218,28 @@ static bool find_mem_range(phys_addr_t addr, struct kvm_mem_range *range)
 		} else {
 			range->start = reg->base;
 			range->end = end;
-			return true;
+			return reg;
 		}
 	}
 
-	return false;
+	return NULL;
 }
 
 bool addr_is_memory(phys_addr_t phys)
 {
 	struct kvm_mem_range range;
 
-	return find_mem_range(phys, &range);
+	return !!find_mem_range(phys, &range);
+}
+
+static bool addr_is_allowed_memory(phys_addr_t phys)
+{
+	struct memblock_region *reg;
+	struct kvm_mem_range range;
+
+	reg = find_mem_range(phys, &range);
+
+	return reg && !(reg->flags & MEMBLOCK_NOMAP);
 }
 
 static bool is_in_mem_range(u64 addr, struct kvm_mem_range *range)
@@ -348,7 +358,7 @@ static bool host_stage2_force_pte_cb(u64 addr, u64 end, enum kvm_pgtable_prot prot)
 
 static int host_stage2_idmap(u64 addr)
 {
 	struct kvm_mem_range range;
-	bool is_memory = find_mem_range(addr, &range);
+	bool is_memory = !!find_mem_range(addr, &range);
 	enum kvm_pgtable_prot prot;
 	int ret;
 
@@ -425,7 +435,7 @@ static int __check_page_state_visitor(u64 addr, u64 end, u32 level,
 	struct check_walk_data *d = arg;
 	kvm_pte_t pte = *ptep;
 
-	if (kvm_pte_valid(pte) && !addr_is_memory(kvm_pte_to_phys(pte)))
+	if (kvm_pte_valid(pte) && !addr_is_allowed_memory(kvm_pte_to_phys(pte)))
 		return -EINVAL;
 
 	return d->get_page_state(pte) == d->desired ? 0 : -EPERM;
From: Will Deacon <will@kernel.org>
commit 09cce60bddd6461a93a5bf434265a47827d1bc6f upstream.
Since host stage-2 mappings are created lazily, we cannot rely solely on the pte in order to recover the target physical address when checking a host-initiated memory transition as this permits donation of unmapped regions corresponding to MMIO or "no-map" memory.
Instead of inspecting the pte, move the addr_is_allowed_memory() check into the host callback function where it is passed the physical address directly from the walker.
Cc: Quentin Perret <qperret@google.com>
Fixes: e82edcc75c4e ("KVM: arm64: Implement do_share() helper for sharing memory")
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230518095844.1178-1-will@kernel.org
[ bp: s/ctx->addr/addr in __check_page_state_visitor due to missing commit
  "KVM: arm64: Combine visitor arguments into a context structure" in stable. ]
Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
---
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 0f6c053686c7..0faa330a41ed 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -424,7 +424,7 @@ struct pkvm_mem_share {
 
 struct check_walk_data {
 	enum pkvm_page_state	desired;
-	enum pkvm_page_state	(*get_page_state)(kvm_pte_t pte);
+	enum pkvm_page_state	(*get_page_state)(kvm_pte_t pte, u64 addr);
 };
 
 static int __check_page_state_visitor(u64 addr, u64 end, u32 level,
@@ -435,10 +435,7 @@ static int __check_page_state_visitor(u64 addr, u64 end, u32 level,
 	struct check_walk_data *d = arg;
 	kvm_pte_t pte = *ptep;
 
-	if (kvm_pte_valid(pte) && !addr_is_allowed_memory(kvm_pte_to_phys(pte)))
-		return -EINVAL;
-
-	return d->get_page_state(pte) == d->desired ? 0 : -EPERM;
+	return d->get_page_state(pte, addr) == d->desired ? 0 : -EPERM;
 }
 
 static int check_page_state_range(struct kvm_pgtable *pgt, u64 addr, u64 size,
@@ -453,8 +450,11 @@ static int check_page_state_range(struct kvm_pgtable *pgt, u64 addr, u64 size,
 	return kvm_pgtable_walk(pgt, addr, size, &walker);
 }
 
-static enum pkvm_page_state host_get_page_state(kvm_pte_t pte)
+static enum pkvm_page_state host_get_page_state(kvm_pte_t pte, u64 addr)
 {
+	if (!addr_is_allowed_memory(addr))
+		return PKVM_NOPAGE;
+
 	if (!kvm_pte_valid(pte) && pte)
 		return PKVM_NOPAGE;
 
@@ -521,7 +521,7 @@ static int host_initiate_unshare(u64 *completer_addr,
 	return __host_set_page_state_range(addr, size, PKVM_PAGE_OWNED);
 }
 
-static enum pkvm_page_state hyp_get_page_state(kvm_pte_t pte)
+static enum pkvm_page_state hyp_get_page_state(kvm_pte_t pte, u64 addr)
 {
 	if (!kvm_pte_valid(pte))
 		return PKVM_NOPAGE;
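The effect of moving the check from the visitor into the host callback can be illustrated with a stripped-down model. Everything here is a simplified assumption for demonstration: the address window, the bit-0 "valid" encoding, and the reduced state enum are not the kernel's actual definitions, and the real `host_get_page_state()` decodes software bits from the pte rather than defaulting to "owned".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum pkvm_page_state {
	PKVM_PAGE_OWNED,
	PKVM_NOPAGE,
};

/* Stand-in allowed-memory predicate: one hypothetical RAM window. */
static bool addr_is_allowed_memory(uint64_t addr)
{
	return addr >= 0x40000000ULL && addr < 0x80000000ULL;
}

static bool kvm_pte_valid(uint64_t pte)
{
	return pte & 1;	/* model: bit 0 as the valid bit */
}

/* Old behaviour: state derived from the pte alone. Because host stage-2
 * mappings are created lazily, a zero (never-mapped) pte looks
 * owned-by-host even for MMIO or "no-map" addresses. */
static enum pkvm_page_state old_host_get_page_state(uint64_t pte)
{
	if (!kvm_pte_valid(pte) && pte)
		return PKVM_NOPAGE;
	return PKVM_PAGE_OWNED;	/* simplified: real code decodes SW bits */
}

/* New behaviour: the walker passes the address down, so disallowed
 * regions are rejected whether or not a pte exists for them yet. */
static enum pkvm_page_state new_host_get_page_state(uint64_t pte, uint64_t addr)
{
	if (!addr_is_allowed_memory(addr))
		return PKVM_NOPAGE;
	return old_host_get_page_state(pte);
}
```

The interesting case is a disallowed address whose pte is still zero: the pte-only check has nothing to reject, while the address-aware check refuses it outright.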
On Wed, 20 Sep 2023 20:27:29 +0100, Suraj Jitindar Singh <surajjs@amazon.com> wrote:
From: Will Deacon <will@kernel.org>
commit 09cce60bddd6461a93a5bf434265a47827d1bc6f upstream.
Since host stage-2 mappings are created lazily, we cannot rely solely on the pte in order to recover the target physical address when checking a host-initiated memory transition as this permits donation of unmapped regions corresponding to MMIO or "no-map" memory.
Instead of inspecting the pte, move the addr_is_allowed_memory() check into the host callback function where it is passed the physical address directly from the walker.
Cc: Quentin Perret <qperret@google.com>
Fixes: e82edcc75c4e ("KVM: arm64: Implement do_share() helper for sharing memory")
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230518095844.1178-1-will@kernel.org
[ bp: s/ctx->addr/addr in __check_page_state_visitor due to missing commit
  "KVM: arm64: Combine visitor arguments into a context structure" in stable. ]
Same question.
Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
Again, I find this backport pretty pointless. What is the rationale for it?
M.
On Thu, 2023-09-21 at 08:15 +0100, Marc Zyngier wrote:
On Wed, 20 Sep 2023 20:27:29 +0100, Suraj Jitindar Singh <surajjs@amazon.com> wrote:
From: Will Deacon <will@kernel.org>
commit 09cce60bddd6461a93a5bf434265a47827d1bc6f upstream.
Since host stage-2 mappings are created lazily, we cannot rely solely on the pte in order to recover the target physical address when checking a host-initiated memory transition as this permits donation of unmapped regions corresponding to MMIO or "no-map" memory.
Instead of inspecting the pte, move the addr_is_allowed_memory() check into the host callback function where it is passed the physical address directly from the walker.
Cc: Quentin Perret <qperret@google.com>
Fixes: e82edcc75c4e ("KVM: arm64: Implement do_share() helper for sharing memory")
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230518095844.1178-1-will@kernel.org
[ bp: s/ctx->addr/addr in __check_page_state_visitor due to missing commit
  "KVM: arm64: Combine visitor arguments into a context structure" in stable. ]
Same question.
Noting what changes were made to the patch from the upstream mainline version when it was applied to the stable tree.
Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
Again, I find this backport pretty pointless. What is the rationale for it?
The 2 patches were backported to address CVE-2023-21264. This one addresses the CVE.
Thanks
M.
On Wed, 20 Sep 2023 20:27:28 +0100, Suraj Jitindar Singh <surajjs@amazon.com> wrote:
From: Quentin Perret <qperret@google.com>
commit 43c1ff8b75011bc3e3e923adf31ba815864a2494 upstream.
Memory regions marked as "no-map" in the host device-tree routinely include TrustZone carve-outs and DMA pools. Although donating such pages to the hypervisor may not breach confidentiality, it could be used to corrupt its state in uncontrollable ways. To prevent this, let's block host-initiated memory transitions targeting "no-map" pages altogether in nVHE protected mode as there should be no valid reason to do this in current operation.
Thankfully, the pKVM EL2 hypervisor has a full copy of the host's list of memblock regions, so we can easily check for the presence of the MEMBLOCK_NOMAP flag on a region containing pages being donated from the host.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-8-will@kernel.org
[ bp: clean ]
What is this?
Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
What is the rationale for backporting this? It wasn't tagged as Cc: to stable for a reason: pKVM isn't functional upstream, and won't be for the next couple of cycles *at least*.
So as it stands, I'm against such a backport.
Thanks,
M.
On Thu, 2023-09-21 at 08:13 +0100, Marc Zyngier wrote:
On Wed, 20 Sep 2023 20:27:28 +0100, Suraj Jitindar Singh <surajjs@amazon.com> wrote:
From: Quentin Perret <qperret@google.com>
commit 43c1ff8b75011bc3e3e923adf31ba815864a2494 upstream.
Memory regions marked as "no-map" in the host device-tree routinely include TrustZone carve-outs and DMA pools. Although donating such pages to the hypervisor may not breach confidentiality, it could be used to corrupt its state in uncontrollable ways. To prevent this, let's block host-initiated memory transitions targeting "no-map" pages altogether in nVHE protected mode as there should be no valid reason to do this in current operation.
Thankfully, the pKVM EL2 hypervisor has a full copy of the host's list of memblock regions, so we can easily check for the presence of the MEMBLOCK_NOMAP flag on a region containing pages being donated from the host.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-8-will@kernel.org
[ bp: clean ]
What is this?
Noting any details about the backport. In this case it was a clean backport.
Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
What is the rationale for backporting this? It wasn't tagged as Cc: to stable for a reason: pKVM isn't functional upstream, and won't be for the next couple of cycles *at least*.
So as it stands, I'm against such a backport.
The 2 patches were backported to address CVE-2023-21264. This one provides context for the following patch.
I wasn't aware that it's non functional. Does this mean that the code won't be compiled or just that it can't actually be run currently from the upstream codebase?
I guess I'm trying to understand if the conditions of the CVE are a real concern even if it isn't technically functional.
Thanks
Thanks,
M.
On Thu, Sep 21, 2023 at 10:22:54PM +0000, Jitindar Singh, Suraj wrote:
On Thu, 2023-09-21 at 08:13 +0100, Marc Zyngier wrote:
On Wed, 20 Sep 2023 20:27:28 +0100, Suraj Jitindar Singh <surajjs@amazon.com> wrote:
From: Quentin Perret <qperret@google.com>
commit 43c1ff8b75011bc3e3e923adf31ba815864a2494 upstream.
Memory regions marked as "no-map" in the host device-tree routinely include TrustZone carve-outs and DMA pools. Although donating such pages to the hypervisor may not breach confidentiality, it could be used to corrupt its state in uncontrollable ways. To prevent this, let's block host-initiated memory transitions targeting "no-map" pages altogether in nVHE protected mode as there should be no valid reason to do this in current operation.
Thankfully, the pKVM EL2 hypervisor has a full copy of the host's list of memblock regions, so we can easily check for the presence of the MEMBLOCK_NOMAP flag on a region containing pages being donated from the host.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-8-will@kernel.org
[ bp: clean ]
What is this?
Noting any details about the backport. In this case it was a clean backport.
Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
What is the rationale for backporting this? It wasn't tagged as Cc: to stable for a reason: pKVM isn't functional upstream, and won't be for the next couple of cycles *at least*.
So as it stands, I'm against such a backport.
The 2 patches were backported to address CVE-2023-21264. This one provides context for the following patch.
I wasn't aware that it's non functional. Does this mean that the code won't be compiled or just that it can't actually be run currently from the upstream codebase?
I guess I'm trying to understand if the conditions of the CVE are a real concern even if it isn't technically functional.
Why do you think the CVE is actually even valid? Who filed it and why?
Remember, CVEs almost never mean anything for the kernel, they are not able to be given out by the kernel security team, and they just don't make any sense for us.
I'll go drop these patches from the stable queues for now, and wait for you all to agree what is happening here.
thanks,
greg k-h
On Thu, 21 Sep 2023 23:22:54 +0100, "Jitindar Singh, Suraj" <surajjs@amazon.com> wrote:
On Thu, 2023-09-21 at 08:13 +0100, Marc Zyngier wrote:
On Wed, 20 Sep 2023 20:27:28 +0100, Suraj Jitindar Singh <surajjs@amazon.com> wrote:
From: Quentin Perret <qperret@google.com>
commit 43c1ff8b75011bc3e3e923adf31ba815864a2494 upstream.
Memory regions marked as "no-map" in the host device-tree routinely include TrustZone carve-outs and DMA pools. Although donating such pages to the hypervisor may not breach confidentiality, it could be used to corrupt its state in uncontrollable ways. To prevent this, let's block host-initiated memory transitions targeting "no-map" pages altogether in nVHE protected mode as there should be no valid reason to do this in current operation.
Thankfully, the pKVM EL2 hypervisor has a full copy of the host's list of memblock regions, so we can easily check for the presence of the MEMBLOCK_NOMAP flag on a region containing pages being donated from the host.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221110190259.26861-8-will@kernel.org
[ bp: clean ]
What is this?
Noting any details about the backport. In this case it was a clean backport.
I don't think this has anything to do here. If you want to add a note indicating what was changed in the patch, make it *extremely* visible in the commit message, and not hidden as some obscure form of metadata.
Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
What is the rationale for backporting this? It wasn't tagged as Cc: to stable for a reason: pKVM isn't functional upstream, and won't be for the next couple of cycles *at least*.
So as it stands, I'm against such a backport.
The 2 patches were backported to address CVE-2023-21264. This one provides context for the following patch.
I care about CVEs as much as I care about holes in my socks (i.e. very little). If there is a concern, it should be brought up on the list as a discussion, and not as a consequence of some script kiddie automatically generating CVEs.
I wasn't aware that it's non functional. Does this mean that the code won't be compiled or just that it can't actually be run currently from the upstream codebase?
This code is inactive unless you pass the correct option on the command line, and as it is brings zero benefit over standard KVM. The only place this matters is in the Android kernel, as it has full support for pKVM, and has the fix already. We carry it upstream as a courtesy to the pKVM developers, but that's about it.
I guess I'm trying to understand if the conditions of the CVE are a real concern even if it isn't technically functional.
This CVE is a waste of precious bytes, and I have no interest in seeing this backported.
Thanks,
M.