From: Kirill A. Shutemov kirill.shutemov@linux.intel.com
commit 373e715e31bf4e0f129befe87613a278fac228d3 upstream.
All normal kernel memory is "TDX private memory". This includes everything from kernel stacks to kernel text. Handling exceptions on arbitrary accesses to kernel memory is essentially impossible because they can happen in horribly nasty places like kernel entry/exit. But, TDX hardware can theoretically _deliver_ a virtualization exception (#VE) on any access to private memory.
But, it's not as bad as it sounds. TDX can be configured to never deliver these exceptions on private memory with a "TD attribute" called ATTR_SEPT_VE_DISABLE. The guest has no way to *set* this attribute, but it can check it.
Ensure ATTR_SEPT_VE_DISABLE is set in early boot. panic() if it is unset. There is no sane way for Linux to run with this attribute clear so a panic() is appropriate.
There's small window during boot before the check where kernel has an early #VE handler. But the handler is only for port I/O and will also panic() as soon as it sees any other #VE, such as a one generated by a private memory access.
[ dhansen: Rewrite changelog and rebase on new tdx_parse_tdinfo(). Add Kirill's tested-by because I made changes since he wrote this. ]
Fixes: 9a22bf6debbf ("x86/traps: Add #VE support for TDX guest") Reported-by: ruogui.ygr@alibaba-inc.com Signed-off-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Tested-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20221028141220.29217-3-kirill.shutemov%40linux.i... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/coco/tdx/tdx.c | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-)
--- a/arch/x86/coco/tdx/tdx.c +++ b/arch/x86/coco/tdx/tdx.c @@ -34,6 +34,8 @@ #define VE_GET_PORT_NUM(e) ((e) >> 16) #define VE_IS_IO_STRING(e) ((e) & BIT(4))
+#define ATTR_SEPT_VE_DISABLE BIT(28) + /* * Wrapper for standard use of __tdx_hypercall with no output aside from * return code. @@ -102,6 +104,7 @@ static void tdx_parse_tdinfo(u64 *cc_mas { struct tdx_module_output out; unsigned int gpa_width; + u64 td_attr;
/* * TDINFO TDX module call is used to get the TD execution environment @@ -109,19 +112,27 @@ static void tdx_parse_tdinfo(u64 *cc_mas * information, etc. More details about the ABI can be found in TDX * Guest-Host-Communication Interface (GHCI), section 2.4.2 TDCALL * [TDG.VP.INFO]. - * - * The GPA width that comes out of this call is critical. TDX guests - * can not meaningfully run without it. */ tdx_module_call(TDX_GET_INFO, 0, 0, 0, 0, &out);
- gpa_width = out.rcx & GENMASK(5, 0); - /* * The highest bit of a guest physical address is the "sharing" bit. * Set it for shared pages and clear it for private pages. + * + * The GPA width that comes out of this call is critical. TDX guests + * can not meaningfully run without it. */ + gpa_width = out.rcx & GENMASK(5, 0); *cc_mask = BIT_ULL(gpa_width - 1); + + /* + * The kernel can not handle #VE's when accessing normal kernel + * memory. Ensure that no #VE will be delivered for accesses to + * TD-private memory. Only VMM-shared memory (MMIO) will #VE. + */ + td_attr = out.rdx; + if (!(td_attr & ATTR_SEPT_VE_DISABLE)) + panic("TD misconfiguration: SEPT_VE_DISABLE attibute must be set.\n"); }
/*