From: Borislav Petkov bp@alien8.de
[...]
On Mon, Jul 08, 2024 at 06:39:45PM +0000, Dexuan Cui wrote:
When a TDX guest runs on Hyper-V, the hv_netvsc driver's netvsc_init_buf() allocates buffers using vzalloc(), and needs to share the buffers with the host OS by calling set_memory_decrypted(), which is not working for vmalloc() yet. Add the support by handling the pages one by one.
"Add support..." and the patch is cc:stable?
I meant to use "Cc: stable@vger.kernel.org # 6.6+". Sorry for missing the "# 6.6+".
This looks like it is fixing something and considering how you're rushing this, I'd let this cook for a whole round and queue it after 6.11-rc1. So that it gets tested properly.
x86/tdx: Fix set_memory_decrypted() for vmalloc() buffers
When a TD mode Linux TDX VM runs on Hyper-V, the Linux hv_netvsc driver needs to share a vmalloc()'d buffer with the host OS: see netvsc_init_buf() -> vmbus_establish_gpadl() -> ... -> __vmbus_establish_gpadl() -> set_memory_decrypted().
Currently set_memory_decrypted() doesn't work for a vmalloc()'d buffer because tdx_enc_status_changed() uses __pa(vaddr), i.e., it assumes that the 'vaddr' can't be from vmalloc(), and consequently hv_netvsc fails to load.
Fix this by handling the pages one by one.
hv_netvsc is the first user of vmalloc() + set_memory_decrypted(), which is why nobody noticed this until now.
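To illustrate the failure mode described above, here is a small userspace sketch (not kernel code). It models a buffer that is virtually contiguous but backed by scattered physical frames, as vmalloc() memory can be, and compares a fixed-offset translation (the kind of arithmetic __pa() relies on for the direct map) with a per-page lookup (the role slow_virt_to_phys() plays in the patch). All toy_* names, the frame numbers and the base address are made up for illustration; the linear model is even generous in that it gets the first page right.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (1UL << TOY_PAGE_SHIFT)
#define TOY_NPAGES     4
#define TOY_VBASE      0x100000UL /* hypothetical start of the virtual buffer */

/*
 * Toy page table: virtual page index -> physical frame number.
 * The frames are deliberately scattered (hypothetical values).
 */
static const uint64_t toy_pfn[TOY_NPAGES] = { 0x80, 0x13, 0x95, 0x42 };

/* Per-page lookup: translate one virtual address through the toy table. */
static uint64_t toy_walk(uint64_t vaddr)
{
	uint64_t idx = (vaddr - TOY_VBASE) >> TOY_PAGE_SHIFT;

	assert(vaddr >= TOY_VBASE && idx < TOY_NPAGES);
	return (toy_pfn[idx] << TOY_PAGE_SHIFT) | (vaddr & (TOY_PAGE_SIZE - 1));
}

/*
 * Fixed-offset translation: assumes the whole buffer is physically
 * contiguous, starting at the frame that backs the first page.
 */
static uint64_t toy_linear(uint64_t vaddr)
{
	return (vaddr - TOY_VBASE) + (toy_pfn[0] << TOY_PAGE_SHIFT);
}

/* Count the pages for which the fixed-offset translation is wrong. */
static int toy_count_linear_mismatches(void)
{
	int bad = 0;
	size_t i;

	for (i = 0; i < TOY_NPAGES; i++) {
		uint64_t va = TOY_VBASE + i * TOY_PAGE_SIZE;

		if (toy_linear(va) != toy_walk(va))
			bad++;
	}
	return bad;
}
```

In this toy setup, every page after the first translates to the wrong physical address under the fixed-offset model, which is why a single physical [start, end) range cannot describe a vmalloc()'d buffer.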
v6.6 is a longterm kernel, which is used by some distros, so I hope this patch can be in v6.6.y and newer, so it won't be carried out of tree.
I think the patch (without Kirill's kexec fix) has been well tested, e.g., it has been in Ubuntu's linux-azure kernel for about 2 years. Kirill's kexec fix works in my testing and it looks safe to me.
I hope this can be in 6.11-rc1 if you see no high risks. It's also fine to me if you decide to queue the patch after 6.11-rc1.
Co-developed-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com
https://lwn.net/ml/linux-kernel/20230412151937.pxfyralfichwzyv6@box/
Signed-off-by: Dexuan Cui decui@microsoft.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=e1b8a...
Reviewed-by: Michael Kelley mikelley@microsoft.com
https://lwn.net/ml/linux-kernel/BYAPR21MB16885F59B6F5594F31AE957AD79A9@BYAPR...
Reviewed-by: Kuppuswamy Sathyanarayanan
https://lwn.net/ml/linux-kernel/d20baf1e-a736-667f-2082-0c0539013f2b@linux.i...
Reviewed-by: Rick Edgecombe rick.p.edgecombe@intel.com
https://lwn.net/ml/linux-kernel/e8b1b0b5f32115c0ef8f1aeb0b805c4d9a953b31.cam...
Reviewed-by: Dave Hansen dave.hansen@linux.intel.com
https://lwn.net/ml/linux-kernel/4732ef96-9a47-3513-4494-48e4684d65cd@intel.c...
Acked-by: Kai Huang kai.huang@intel.com
https://lwn.net/ml/linux-kernel/6b6e7f943b7e28fa6ae6c77e1002ac61c41c1ee2.cam...
When were you able to collect all those tags on a newly submitted patch?
This is not really a newly submitted patch :-) Please refer to the links above.
v9 was posted here (Jun 2023): https://lwn.net/ml/linux-kernel/20230621191317.4129-3-decui@microsoft.com/
v10 was posted here (Aug 2023): https://lwn.net/ml/linux-kernel/20230811214826.9609-3-decui%40microsoft.com/
The last submission was May 2024: https://lwn.net/ml/linux-kernel/20240521021238.1803-1-decui@microsoft.com/ (Sorry, I should have made it clear that this is actually v11)
Do you even know what the meaning of those tags is or you just slap them willy-nilly, just for fun?
The original patch was submitted in Nov 2022: https://lwn.net/ml/linux-kernel/20221121195151.21812-4-decui@microsoft.com/
I added Kirill's Co-developed-by in v4 (Apr 2023) https://lwn.net/ml/linux-kernel/20230412151937.pxfyralfichwzyv6@box/ and added Kirill's Signed-off-by in v5, and added other people's Reviewed-by and Acked-by over time. There are only minor changes since v4, so I think it's appropriate to keep all the tags in the final commit.
Cc: stable@vger.kernel.org
Why?
Fixes: what?
Please refer to my reply above.
This is not to fix a buggy commit. The described scenario never worked before, so I suppose a "Fixes:" tag is not needed.
From reading this, it seems to me you need to brush up on https://kernel.org/doc/html/latest/process/submitting-patches.html
Thanks for the link! I read it and did learn something.
while waiting.
Thx.
-- Regards/Gruss, Boris.
I hope I have provided a satisfactory reply above.
How do you like the v12 below? It's also attached. If this looks good to you, I can post it today or tomorrow.
Thanks, Dexuan
From 132f656fdbf3b4f00752140aac10f3674b598b5a Mon Sep 17 00:00:00 2001
From: Dexuan Cui decui@microsoft.com
Date: Mon, 20 May 2024 19:12:38 -0700
Subject: [PATCH v12] x86/tdx: Fix set_memory_decrypted() for vmalloc() buffers
When a TD mode Linux TDX VM runs on Hyper-V, the Linux hv_netvsc driver needs to share a vmalloc()'d buffer with the host OS: see netvsc_init_buf() -> vmbus_establish_gpadl() -> ... -> __vmbus_establish_gpadl() -> set_memory_decrypted().
Currently set_memory_decrypted() doesn't work for a vmalloc()'d buffer because tdx_enc_status_changed() uses __pa(vaddr), i.e., it assumes that the 'vaddr' can't be from vmalloc(), and consequently hv_netvsc fails to load.
Fix this by handling the pages one by one.
hv_netvsc is the first user of vmalloc() + set_memory_decrypted(), which is why nobody noticed this until now.
Co-developed-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com
Signed-off-by: Dexuan Cui decui@microsoft.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Reviewed-by: Michael Kelley mikelley@microsoft.com
Reviewed-by: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com
Reviewed-by: Rick Edgecombe rick.p.edgecombe@intel.com
Reviewed-by: Dave Hansen dave.hansen@linux.intel.com
Acked-by: Kai Huang kai.huang@intel.com
Cc: stable@vger.kernel.org # 6.6+
---
 arch/x86/coco/tdx/tdx.c | 43 ++++++++++++++++++++++++++++++++++-------
 1 file changed, 36 insertions(+), 7 deletions(-)
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 078e2bac25531..8f471260924f7 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -8,6 +8,7 @@
 #include <linux/export.h>
 #include <linux/io.h>
 #include <linux/kexec.h>
+#include <linux/mm.h>
 #include <asm/coco.h>
 #include <asm/tdx.h>
 #include <asm/vmx.h>
@@ -782,6 +783,19 @@ static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
 	return false;
 }
 
+static bool tdx_enc_status_changed_phys(phys_addr_t start, phys_addr_t end,
+					bool enc)
+{
+	if (!tdx_map_gpa(start, end, enc))
+		return false;
+
+	/* shared->private conversion requires memory to be accepted before use */
+	if (enc)
+		return tdx_accept_memory(start, end);
+
+	return true;
+}
+
 /*
  * Inform the VMM of the guest's intent for this physical page: shared with
  * the VMM or private to the guest. The VMM is expected to change its mapping
@@ -789,15 +803,30 @@ static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc)
  */
 static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc)
 {
-	phys_addr_t start = __pa(vaddr);
-	phys_addr_t end = __pa(vaddr + numpages * PAGE_SIZE);
+	unsigned long start = vaddr;
+	unsigned long end = start + numpages * PAGE_SIZE;
+	unsigned long step = end - start;
+	unsigned long addr;
+
+	/* Step through page-by-page for vmalloc() mappings */
+	if (is_vmalloc_addr((void *)vaddr))
+		step = PAGE_SIZE;
+
+	for (addr = start; addr < end; addr += step) {
+		phys_addr_t start_pa;
+		phys_addr_t end_pa;
+
+		/* The check fails on vmalloc() mappings */
+		if (virt_addr_valid(addr))
+			start_pa = __pa(addr);
+		else
+			start_pa = slow_virt_to_phys((void *)addr);
 
-	if (!tdx_map_gpa(start, end, enc))
-		return false;
+		end_pa = start_pa + step;
 
-	/* shared->private conversion requires memory to be accepted before use */
-	if (enc)
-		return tdx_accept_memory(start, end);
+		if (!tdx_enc_status_changed_phys(start_pa, end_pa, enc))
+			return false;
+	}
 
 	return true;
 }
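
As a rough illustration of the step selection in the loop above, the following userspace sketch (not kernel code) counts how many physical-range conversions the loop would issue. The sketch_* names and the is_vmalloc flag, which stands in for is_vmalloc_addr(), are hypothetical; the page size is assumed to be 4 KiB.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size */

/*
 * Mirror the loop bounds of tdx_enc_status_changed() in the patch:
 * one step covering the whole range by default, one page per step
 * when the address is a vmalloc() mapping. Returns the number of
 * iterations, i.e. the number of physical-range conversions.
 */
static unsigned long sketch_count_conversions(unsigned long vaddr,
					      int numpages, bool is_vmalloc)
{
	unsigned long start = vaddr;
	unsigned long end = start + (unsigned long)numpages * SKETCH_PAGE_SIZE;
	unsigned long step = end - start;
	unsigned long addr;
	unsigned long calls = 0;

	if (is_vmalloc)
		step = SKETCH_PAGE_SIZE;

	/* With numpages == 0, start == end and the body never runs. */
	for (addr = start; addr < end; addr += step)
		calls++;

	return calls;
}
```

For an 8-page buffer this yields a single conversion in the non-vmalloc case and 8 conversions in the vmalloc() case, so the existing behaviour for physically contiguous buffers is left as one call.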