From: Ard Biesheuvel ardb@kernel.org
This is the final batch of changes to bring linux-6.1.y in sync with 6.6 and later in terms of compatibility with tightened boot security requirements imposed by MicroSoft, compliance with which is a prerequisite for them to be willing to resume signing distro shim images with the MS 3rd party secure boot certificate.
Without this, distros can only boot on off-the-shelf x86 PCs after disabling secure boot explicitly.
Most of these changes appeared in v6.8 and have been backported to v6.6 already.
Ard Biesheuvel (20): x86/efi: Drop EFI stub .bss from .data section x86/efi: Disregard setup header of loaded image x86/efistub: Reinstate soft limit for initrd loading x86/efi: Drop alignment flags from PE section headers x86/boot: Remove the 'bugger off' message x86/boot: Omit compression buffer from PE/COFF image memory footprint x86/boot: Drop redundant code setting the root device x86/boot: Drop references to startup_64 x86/boot: Grab kernel_info offset from zoffset header directly x86/boot: Set EFI handover offset directly in header asm x86/boot: Define setup size in linker script x86/boot: Derive file size from _edata symbol x86/boot: Construct PE/COFF .text section from assembler x86/boot: Drop PE/COFF .reloc section x86/boot: Split off PE/COFF .data section x86/boot: Increase section and file alignment to 4k/512 x86/efistub: Use 1:1 file:memory mapping for PE/COFF .compat section x86/sme: Move early SME kernel encryption handling into .head.text x86/sev: Move early startup code into .head.text section x86/efistub: Remap kernel text read-only before dropping NX attribute
Hou Wenlong (2): x86/head/64: Add missing __head annotation to startup_64_load_idt() x86/head/64: Move the __head definition to <asm/init.h>
Pasha Tatashin (1): x86/mm: Remove P*D_PAGE_MASK and P*D_PAGE_SIZE macros
arch/x86/boot/Makefile | 2 +- arch/x86/boot/compressed/Makefile | 2 +- arch/x86/boot/compressed/misc.c | 1 + arch/x86/boot/compressed/sev.c | 3 + arch/x86/boot/compressed/vmlinux.lds.S | 6 +- arch/x86/boot/header.S | 211 ++++++--------- arch/x86/boot/setup.ld | 14 +- arch/x86/boot/tools/build.c | 273 +------------------- arch/x86/include/asm/boot.h | 1 + arch/x86/include/asm/init.h | 2 + arch/x86/include/asm/mem_encrypt.h | 8 +- arch/x86/include/asm/page_types.h | 12 +- arch/x86/include/asm/sev.h | 10 +- arch/x86/kernel/amd_gart_64.c | 2 +- arch/x86/kernel/head64.c | 7 +- arch/x86/kernel/sev-shared.c | 23 +- arch/x86/kernel/sev.c | 11 +- arch/x86/mm/mem_encrypt_boot.S | 4 +- arch/x86/mm/mem_encrypt_identity.c | 58 ++--- arch/x86/mm/pat/set_memory.c | 6 +- arch/x86/mm/pti.c | 2 +- drivers/firmware/efi/libstub/Makefile | 7 - drivers/firmware/efi/libstub/x86-stub.c | 58 ++--- 23 files changed, 194 insertions(+), 529 deletions(-)
From: Ard Biesheuvel ardb@kernel.org
[ Commit 5f51c5d0e905608ba7be126737f7c84a793ae1aa upstream ]
Now that the EFI stub always zero inits its BSS section upon entry, there is no longer a need to place the BSS symbols carried by the stub into the .data section.
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230912090051.4014114-18-ardb@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/compressed/vmlinux.lds.S | 1 - drivers/firmware/efi/libstub/Makefile | 7 ------- 2 files changed, 8 deletions(-)
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S index 112b2375d021..32892e81bf61 100644 --- a/arch/x86/boot/compressed/vmlinux.lds.S +++ b/arch/x86/boot/compressed/vmlinux.lds.S @@ -46,7 +46,6 @@ SECTIONS _data = . ; *(.data) *(.data.*) - *(.bss.efistub) _edata = . ; } . = ALIGN(L1_CACHE_BYTES); diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile index 473ef18421db..748781c25787 100644 --- a/drivers/firmware/efi/libstub/Makefile +++ b/drivers/firmware/efi/libstub/Makefile @@ -102,13 +102,6 @@ lib-y := $(patsubst %.o,%.stub.o,$(lib-y)) # https://bugs.llvm.org/show_bug.cgi?id=46480 STUBCOPY_FLAGS-y += --remove-section=.note.gnu.property
-# -# For x86, bootloaders like systemd-boot or grub-efi do not zero-initialize the -# .bss section, so the .bss section of the EFI stub needs to be included in the -# .data section of the compressed kernel to ensure initialization. Rename the -# .bss section here so it's easy to pick out in the linker script. -# -STUBCOPY_FLAGS-$(CONFIG_X86) += --rename-section .bss=.bss.efistub,load,alloc STUBCOPY_RELOC-$(CONFIG_X86_32) := R_386_32 STUBCOPY_RELOC-$(CONFIG_X86_64) := R_X86_64_64
From: Ard Biesheuvel ardb@kernel.org
[ Commit 7e50262229faad0c7b8c54477cd1c883f31cc4a7 upstream ]
The native EFI entrypoint does not take a struct boot_params from the loader, but instead, it constructs one from scratch, using the setup header data placed at the start of the image.
This setup header is placed in a way that permits legacy loaders to manipulate the contents (i.e., to pass the kernel command line or the address and size of an initial ramdisk), but EFI boot does not use it in that way - it only copies the contents that were placed there at build time, but EFI loaders will not (and should not) manipulate the setup header to configure the boot. (Commit 63bf28ceb3ebbe76 "efi: x86: Wipe setup_data on pure EFI boot" deals with some of the fallout of using setup_data in a way that breaks EFI boot.)
Given that none of the non-zero values that are copied from the setup header into the EFI stub's struct boot_params are relevant to the boot now that the EFI stub no longer enters via the legacy decompressor, the copy can be omitted altogether.
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230912090051.4014114-19-ardb@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- drivers/firmware/efi/libstub/x86-stub.c | 46 +++----------------- 1 file changed, 6 insertions(+), 40 deletions(-)
diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index dc50dda40239..c592ecd40dab 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -426,9 +426,8 @@ void __noreturn efi_stub_entry(efi_handle_t handle, efi_status_t __efiapi efi_pe_entry(efi_handle_t handle, efi_system_table_t *sys_table_arg) { - struct boot_params *boot_params; - struct setup_header *hdr; - void *image_base; + static struct boot_params boot_params __page_aligned_bss; + struct setup_header *hdr = &boot_params.hdr; efi_guid_t proto = LOADED_IMAGE_PROTOCOL_GUID; int options_size = 0; efi_status_t status; @@ -449,30 +448,9 @@ efi_status_t __efiapi efi_pe_entry(efi_handle_t handle, efi_exit(handle, status); }
- image_base = efi_table_attr(image, image_base); - - status = efi_allocate_pages(sizeof(struct boot_params), - (unsigned long *)&boot_params, ULONG_MAX); - if (status != EFI_SUCCESS) { - efi_err("Failed to allocate lowmem for boot params\n"); - efi_exit(handle, status); - } - - memset(boot_params, 0x0, sizeof(struct boot_params)); - - hdr = &boot_params->hdr; - - /* Copy the setup header from the second sector to boot_params */ - memcpy(&hdr->jump, image_base + 512, - sizeof(struct setup_header) - offsetof(struct setup_header, jump)); - - /* - * Fill out some of the header fields ourselves because the - * EFI firmware loader doesn't load the first sector. - */ + /* Assign the setup_header fields that the kernel actually cares about */ hdr->root_flags = 1; hdr->vid_mode = 0xffff; - hdr->boot_flag = 0xAA55;
hdr->type_of_loader = 0x21;
@@ -481,25 +459,13 @@ efi_status_t __efiapi efi_pe_entry(efi_handle_t handle, if (!cmdline_ptr) goto fail;
- efi_set_u64_split((unsigned long)cmdline_ptr, - &hdr->cmd_line_ptr, &boot_params->ext_cmd_line_ptr); - - hdr->ramdisk_image = 0; - hdr->ramdisk_size = 0; - - /* - * Disregard any setup data that was provided by the bootloader: - * setup_data could be pointing anywhere, and we have no way of - * authenticating or validating the payload. - */ - hdr->setup_data = 0; + efi_set_u64_split((unsigned long)cmdline_ptr, &hdr->cmd_line_ptr, + &boot_params.ext_cmd_line_ptr);
- efi_stub_entry(handle, sys_table_arg, boot_params); + efi_stub_entry(handle, sys_table_arg, &boot_params); /* not reached */
fail: - efi_free(sizeof(struct boot_params), (unsigned long)boot_params); - efi_exit(handle, status); }
From: Ard Biesheuvel ardb@kernel.org
[ Commit decd347c2a75d32984beb8807d470b763a53b542 upstream ]
Commit
8117961d98fb2 ("x86/efi: Disregard setup header of loaded image")
dropped the memcopy of the image's setup header into the boot_params struct provided to the core kernel, on the basis that EFI boot does not need it and should rely only on a single protocol to interface with the boot chain. It is also a prerequisite for being able to increase the section alignment to 4k, which is needed to enable memory protections when running in the boot services.
So only the setup_header fields that matter to the core kernel are populated explicitly, and everything else is ignored. One thing was overlooked, though: the initrd_addr_max field in the setup_header is not used by the core kernel, but it is used by the EFI stub itself when it loads the initrd, where its default value of INT_MAX is used as the soft limit for memory allocation.
This means that, in the old situation, the initrd was virtually always loaded in the lower 2G of memory, but now, due to initrd_addr_max being 0x0, the initrd may end up anywhere in memory. This should not be an issue principle, as most systems can deal with this fine. However, it does appear to tickle some problems in older UEFI implementations, where the memory ends up being corrupted, resulting in errors when unpacking the initramfs.
So set the initrd_addr_max field to INT_MAX like it was before.
Fixes: 8117961d98fb2 ("x86/efi: Disregard setup header of loaded image") Reported-by: Radek Podgorny radek@podgorny.cz Closes: https://lore.kernel.org/all/a99a831a-8ad5-4cb0-bff9-be637311f771@podgorny.cz Signed-off-by: Ard Biesheuvel ardb@kernel.org --- drivers/firmware/efi/libstub/x86-stub.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index c592ecd40dab..1f5edcb6339a 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -453,6 +453,7 @@ efi_status_t __efiapi efi_pe_entry(efi_handle_t handle, hdr->vid_mode = 0xffff;
hdr->type_of_loader = 0x21; + hdr->initrd_addr_max = INT_MAX;
/* Convert unicode cmdline to ascii */ cmdline_ptr = efi_convert_cmdline(image, &options_size);
From: Ard Biesheuvel ardb@kernel.org
[ Commit bfab35f552ab3dd6d017165bf9de1d1d20f198cc upstream ]
The section header flags for alignment are documented in the PE/COFF spec as being applicable to PE object files only, not to PE executables such as the Linux bzImage, so let's drop them from the PE header.
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230912090051.4014114-20-ardb@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/header.S | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S index d31982509654..38b611eb1a3c 100644 --- a/arch/x86/boot/header.S +++ b/arch/x86/boot/header.S @@ -208,8 +208,7 @@ section_table: .word 0 # NumberOfLineNumbers .long IMAGE_SCN_CNT_CODE | \ IMAGE_SCN_MEM_READ | \ - IMAGE_SCN_MEM_EXECUTE | \ - IMAGE_SCN_ALIGN_16BYTES # Characteristics + IMAGE_SCN_MEM_EXECUTE # Characteristics
# # The EFI application loader requires a relocation section @@ -229,8 +228,7 @@ section_table: .word 0 # NumberOfLineNumbers .long IMAGE_SCN_CNT_INITIALIZED_DATA | \ IMAGE_SCN_MEM_READ | \ - IMAGE_SCN_MEM_DISCARDABLE | \ - IMAGE_SCN_ALIGN_1BYTES # Characteristics + IMAGE_SCN_MEM_DISCARDABLE # Characteristics
#ifdef CONFIG_EFI_MIXED # @@ -248,8 +246,7 @@ section_table: .word 0 # NumberOfLineNumbers .long IMAGE_SCN_CNT_INITIALIZED_DATA | \ IMAGE_SCN_MEM_READ | \ - IMAGE_SCN_MEM_DISCARDABLE | \ - IMAGE_SCN_ALIGN_1BYTES # Characteristics + IMAGE_SCN_MEM_DISCARDABLE # Characteristics #endif
# @@ -270,8 +267,7 @@ section_table: .word 0 # NumberOfLineNumbers .long IMAGE_SCN_CNT_CODE | \ IMAGE_SCN_MEM_READ | \ - IMAGE_SCN_MEM_EXECUTE | \ - IMAGE_SCN_ALIGN_16BYTES # Characteristics + IMAGE_SCN_MEM_EXECUTE # Characteristics
.set section_count, (. - section_table) / 40 #endif /* CONFIG_EFI_STUB */
From: Ard Biesheuvel ardb@kernel.org
[ Commit 768171d7ebbce005210e1cf8456f043304805c15 upstream ]
Ancient (pre-2003) x86 kernels could boot from a floppy disk straight from the BIOS, using a small real mode boot stub at the start of the image where the BIOS would expect the boot record (or boot block) to appear.
Due to its limitations (kernel size < 1 MiB, no support for IDE, USB or El Torito floppy emulation), this support was dropped, and a Linux aware bootloader is now always required to boot the kernel from a legacy BIOS.
To smoothen this transition, the boot stub was not removed entirely, but replaced with one that just prints an error message telling the user to install a bootloader.
As it is unlikely that anyone doing direct floppy boot with such an ancient kernel is going to upgrade to v6.5+ and expect that this boot method still works, printing this message is kind of pointless, and so it should be possible to remove the logic that emits it.
Let's free up this space so it can be used to expand the PE header in a subsequent patch.
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Acked-by: H. Peter Anvin (Intel) hpa@zytor.com Link: https://lore.kernel.org/r/20230912090051.4014114-21-ardb@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/header.S | 49 -------------------- arch/x86/boot/setup.ld | 7 +-- 2 files changed, 4 insertions(+), 52 deletions(-)
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S index 38b611eb1a3c..b8d241e57b49 100644 --- a/arch/x86/boot/header.S +++ b/arch/x86/boot/header.S @@ -38,63 +38,14 @@ SYSSEG = 0x1000 /* historical load address >> 4 */
.code16 .section ".bstext", "ax" - - .global bootsect_start -bootsect_start: #ifdef CONFIG_EFI_STUB # "MZ", MS-DOS header .word MZ_MAGIC -#endif - - # Normalize the start address - ljmp $BOOTSEG, $start2 - -start2: - movw %cs, %ax - movw %ax, %ds - movw %ax, %es - movw %ax, %ss - xorw %sp, %sp - sti - cld - - movw $bugger_off_msg, %si - -msg_loop: - lodsb - andb %al, %al - jz bs_die - movb $0xe, %ah - movw $7, %bx - int $0x10 - jmp msg_loop - -bs_die: - # Allow the user to press a key, then reboot - xorw %ax, %ax - int $0x16 - int $0x19 - - # int 0x19 should never return. In case it does anyway, - # invoke the BIOS reset code... - ljmp $0xf000,$0xfff0 - -#ifdef CONFIG_EFI_STUB .org 0x3c # # Offset to the PE header. # .long pe_header -#endif /* CONFIG_EFI_STUB */ - - .section ".bsdata", "a" -bugger_off_msg: - .ascii "Use a boot loader.\r\n" - .ascii "\n" - .ascii "Remove disk and press any key to reboot...\r\n" - .byte 0 - -#ifdef CONFIG_EFI_STUB pe_header: .long PE_MAGIC
diff --git a/arch/x86/boot/setup.ld b/arch/x86/boot/setup.ld index 49546c247ae2..b11c45b9e51e 100644 --- a/arch/x86/boot/setup.ld +++ b/arch/x86/boot/setup.ld @@ -10,10 +10,11 @@ ENTRY(_start) SECTIONS { . = 0; - .bstext : { *(.bstext) } - .bsdata : { *(.bsdata) } + .bstext : { + *(.bstext) + . = 495; + } =0xffffffff
- . = 495; .header : { *(.header) } .entrytext : { *(.entrytext) } .inittext : { *(.inittext) }
From: Ard Biesheuvel ardb@kernel.org
[ Commit 8eace5b3555606e684739bef5bcdfcfe68235257 upstream ]
Now that the EFI stub decompresses the kernel and hands over to the decompressed image directly, there is no longer a need to provide a decompression buffer as part of the .BSS allocation of the PE/COFF image. It also means the PE/COFF image can be loaded anywhere in memory, and setting the preferred image base is unnecessary. So drop the handling of this from the header and from the build tool.
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230912090051.4014114-22-ardb@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/header.S | 6 +-- arch/x86/boot/tools/build.c | 50 +++----------------- 2 files changed, 8 insertions(+), 48 deletions(-)
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S index b8d241e57b49..98dd4c36ccca 100644 --- a/arch/x86/boot/header.S +++ b/arch/x86/boot/header.S @@ -89,12 +89,10 @@ optional_header: #endif
extra_header_fields: - # PE specification requires ImageBase to be 64k aligned - .set image_base, (LOAD_PHYSICAL_ADDR + 0xffff) & ~0xffff #ifdef CONFIG_X86_32 - .long image_base # ImageBase + .long 0 # ImageBase #else - .quad image_base # ImageBase + .quad 0 # ImageBase #endif .long 0x20 # SectionAlignment .long 0x20 # FileAlignment diff --git a/arch/x86/boot/tools/build.c b/arch/x86/boot/tools/build.c index bd247692b701..0354c223e354 100644 --- a/arch/x86/boot/tools/build.c +++ b/arch/x86/boot/tools/build.c @@ -65,7 +65,6 @@ static unsigned long efi_pe_entry; static unsigned long efi32_pe_entry; static unsigned long kernel_info; static unsigned long startup_64; -static unsigned long _ehead; static unsigned long _end;
/*----------------------------------------------------------------------*/ @@ -229,27 +228,14 @@ static void update_pecoff_setup_and_reloc(unsigned int size) #endif }
-static void update_pecoff_text(unsigned int text_start, unsigned int file_sz, - unsigned int init_sz) +static void update_pecoff_text(unsigned int text_start, unsigned int file_sz) { unsigned int pe_header; unsigned int text_sz = file_sz - text_start; - unsigned int bss_sz = init_sz - file_sz; + unsigned int bss_sz = _end - text_sz;
pe_header = get_unaligned_le32(&buf[0x3c]);
- /* - * The PE/COFF loader may load the image at an address which is - * misaligned with respect to the kernel_alignment field in the setup - * header. - * - * In order to avoid relocating the kernel to correct the misalignment, - * add slack to allow the buffer to be aligned within the declared size - * of the image. - */ - bss_sz += CONFIG_PHYSICAL_ALIGN; - init_sz += CONFIG_PHYSICAL_ALIGN; - /* * Size of code: Subtract the size of the first sector (512 bytes) * which includes the header. @@ -257,7 +243,7 @@ static void update_pecoff_text(unsigned int text_start, unsigned int file_sz, put_unaligned_le32(file_sz - 512 + bss_sz, &buf[pe_header + 0x1c]);
/* Size of image */ - put_unaligned_le32(init_sz, &buf[pe_header + 0x50]); + put_unaligned_le32(file_sz + bss_sz, &buf[pe_header + 0x50]);
/* * Address of entry point for PE/COFF executable @@ -308,8 +294,7 @@ static void efi_stub_entry_update(void)
static inline void update_pecoff_setup_and_reloc(unsigned int size) {} static inline void update_pecoff_text(unsigned int text_start, - unsigned int file_sz, - unsigned int init_sz) {} + unsigned int file_sz) {} static inline void efi_stub_defaults(void) {} static inline void efi_stub_entry_update(void) {}
@@ -360,7 +345,6 @@ static void parse_zoffset(char *fname) PARSE_ZOFS(p, efi32_pe_entry); PARSE_ZOFS(p, kernel_info); PARSE_ZOFS(p, startup_64); - PARSE_ZOFS(p, _ehead); PARSE_ZOFS(p, _end);
p = strchr(p, '\n'); @@ -371,7 +355,7 @@ static void parse_zoffset(char *fname)
int main(int argc, char ** argv) { - unsigned int i, sz, setup_sectors, init_sz; + unsigned int i, sz, setup_sectors; int c; u32 sys_size; struct stat sb; @@ -442,31 +426,9 @@ int main(int argc, char ** argv) buf[0x1f1] = setup_sectors-1; put_unaligned_le32(sys_size, &buf[0x1f4]);
- init_sz = get_unaligned_le32(&buf[0x260]); -#ifdef CONFIG_EFI_STUB - /* - * The decompression buffer will start at ImageBase. When relocating - * the compressed kernel to its end, we must ensure that the head - * section does not get overwritten. The head section occupies - * [i, i + _ehead), and the destination is [init_sz - _end, init_sz). - * - * At present these should never overlap, because 'i' is at most 32k - * because of SETUP_SECT_MAX, '_ehead' is less than 1k, and the - * calculation of INIT_SIZE in boot/header.S ensures that - * 'init_sz - _end' is at least 64k. - * - * For future-proofing, increase init_sz if necessary. - */ - - if (init_sz - _end < i + _ehead) { - init_sz = (i + _ehead + _end + 4095) & ~4095; - put_unaligned_le32(init_sz, &buf[0x260]); - } -#endif - update_pecoff_text(setup_sectors * 512, i + (sys_size * 16), init_sz); + update_pecoff_text(setup_sectors * 512, i + (sys_size * 16));
efi_stub_entry_update(); - /* Update kernel_info offset. */ put_unaligned_le32(kernel_info, &buf[0x268]);
From: Ard Biesheuvel ardb@kernel.org
[ Commit 7448e8e5d15a3c4df649bf6d6d460f78396f7e1e upstream ]
The root device defaults to 0,0 and is no longer configurable at build time [0], so there is no need for the build tool to ever write to this field.
[0] 079f85e624189292 ("x86, build: Do not set the root_dev field in bzImage")
This change has no impact on the resulting bzImage binary.
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230912090051.4014114-23-ardb@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/header.S | 2 +- arch/x86/boot/tools/build.c | 7 ------- 2 files changed, 1 insertion(+), 8 deletions(-)
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S index 98dd4c36ccca..f63bf3ec6869 100644 --- a/arch/x86/boot/header.S +++ b/arch/x86/boot/header.S @@ -235,7 +235,7 @@ root_flags: .word ROOT_RDONLY syssize: .long 0 /* Filled in by build.c */ ram_size: .word 0 /* Obsolete */ vid_mode: .word SVGA_MODE -root_dev: .word 0 /* Filled in by build.c */ +root_dev: .word 0 /* Default to major/minor 0/0 */ boot_flag: .word 0xAA55
# offset 512, entry point diff --git a/arch/x86/boot/tools/build.c b/arch/x86/boot/tools/build.c index 0354c223e354..efa4e9c7d713 100644 --- a/arch/x86/boot/tools/build.c +++ b/arch/x86/boot/tools/build.c @@ -40,10 +40,6 @@ typedef unsigned char u8; typedef unsigned short u16; typedef unsigned int u32;
-#define DEFAULT_MAJOR_ROOT 0 -#define DEFAULT_MINOR_ROOT 0 -#define DEFAULT_ROOT_DEV (DEFAULT_MAJOR_ROOT << 8 | DEFAULT_MINOR_ROOT) - /* Minimal number of setup sectors */ #define SETUP_SECT_MIN 5 #define SETUP_SECT_MAX 64 @@ -399,9 +395,6 @@ int main(int argc, char ** argv)
update_pecoff_setup_and_reloc(i);
- /* Set the default root device */ - put_unaligned_le16(DEFAULT_ROOT_DEV, &buf[508]); - /* Open and stat the kernel file */ fd = open(argv[2], O_RDONLY); if (fd < 0)
From: Ard Biesheuvel ardb@kernel.org
[ Commit b618d31f112bea3d2daea19190d63e567f32a4db upstream ]
The x86 boot image generation tool assign a default value to startup_64 and subsequently parses the actual value from zoffset.h but it never actually uses the value anywhere. So remove this code.
This change has no impact on the resulting bzImage binary.
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230912090051.4014114-25-ardb@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/Makefile | 2 +- arch/x86/boot/tools/build.c | 3 --- 2 files changed, 1 insertion(+), 4 deletions(-)
diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile index 9e38ffaadb5d..10ea28469788 100644 --- a/arch/x86/boot/Makefile +++ b/arch/x86/boot/Makefile @@ -91,7 +91,7 @@ $(obj)/vmlinux.bin: $(obj)/compressed/vmlinux FORCE
SETUP_OBJS = $(addprefix $(obj)/,$(setup-y))
-sed-zoffset := -e 's/^([0-9a-fA-F]*) [a-zA-Z] (startup_32|startup_64|efi32_stub_entry|efi64_stub_entry|efi_pe_entry|efi32_pe_entry|input_data|kernel_info|_end|_ehead|_text|z_.*)$$/#define ZO_\2 0x\1/p' +sed-zoffset := -e 's/^([0-9a-fA-F]*) [a-zA-Z] (startup_32|efi32_stub_entry|efi64_stub_entry|efi_pe_entry|efi32_pe_entry|input_data|kernel_info|_end|_ehead|_text|z_.*)$$/#define ZO_\2 0x\1/p'
quiet_cmd_zoffset = ZOFFSET $@ cmd_zoffset = $(NM) $< | sed -n $(sed-zoffset) > $@ diff --git a/arch/x86/boot/tools/build.c b/arch/x86/boot/tools/build.c index efa4e9c7d713..10b0207a6b18 100644 --- a/arch/x86/boot/tools/build.c +++ b/arch/x86/boot/tools/build.c @@ -60,7 +60,6 @@ static unsigned long efi64_stub_entry; static unsigned long efi_pe_entry; static unsigned long efi32_pe_entry; static unsigned long kernel_info; -static unsigned long startup_64; static unsigned long _end;
/*----------------------------------------------------------------------*/ @@ -264,7 +263,6 @@ static void efi_stub_defaults(void) efi_pe_entry = 0x10; #else efi_pe_entry = 0x210; - startup_64 = 0x200; #endif }
@@ -340,7 +338,6 @@ static void parse_zoffset(char *fname) PARSE_ZOFS(p, efi_pe_entry); PARSE_ZOFS(p, efi32_pe_entry); PARSE_ZOFS(p, kernel_info); - PARSE_ZOFS(p, startup_64); PARSE_ZOFS(p, _end);
p = strchr(p, '\n');
From: Ard Biesheuvel ardb@kernel.org
[ Commit 2e765c02dcbfc2a8a4527c621a84b9502f6b9bd2 upstream ]
Instead of parsing zoffset.h and poking the kernel_info offset value into the header from the build tool, just grab the value directly in the asm file that describes this header.
This change has no impact on the resulting bzImage binary.
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230915171623.655440-11-ardb@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/header.S | 2 +- arch/x86/boot/tools/build.c | 4 ---- 2 files changed, 1 insertion(+), 5 deletions(-)
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S index f63bf3ec6869..becf39d8115c 100644 --- a/arch/x86/boot/header.S +++ b/arch/x86/boot/header.S @@ -525,7 +525,7 @@ pref_address: .quad LOAD_PHYSICAL_ADDR # preferred load addr
init_size: .long INIT_SIZE # kernel initialization size handover_offset: .long 0 # Filled in by build.c -kernel_info_offset: .long 0 # Filled in by build.c +kernel_info_offset: .long ZO_kernel_info
# End of setup header #####################################################
diff --git a/arch/x86/boot/tools/build.c b/arch/x86/boot/tools/build.c index 10b0207a6b18..14ef13fe7ab0 100644 --- a/arch/x86/boot/tools/build.c +++ b/arch/x86/boot/tools/build.c @@ -59,7 +59,6 @@ static unsigned long efi32_stub_entry; static unsigned long efi64_stub_entry; static unsigned long efi_pe_entry; static unsigned long efi32_pe_entry; -static unsigned long kernel_info; static unsigned long _end;
/*----------------------------------------------------------------------*/ @@ -337,7 +336,6 @@ static void parse_zoffset(char *fname) PARSE_ZOFS(p, efi64_stub_entry); PARSE_ZOFS(p, efi_pe_entry); PARSE_ZOFS(p, efi32_pe_entry); - PARSE_ZOFS(p, kernel_info); PARSE_ZOFS(p, _end);
p = strchr(p, '\n'); @@ -419,8 +417,6 @@ int main(int argc, char ** argv) update_pecoff_text(setup_sectors * 512, i + (sys_size * 16));
efi_stub_entry_update(); - /* Update kernel_info offset. */ - put_unaligned_le32(kernel_info, &buf[0x268]);
crc = partial_crc32(buf, i, crc); if (fwrite(buf, 1, i, dest) != i)
From: Ard Biesheuvel ardb@kernel.org
[ Commit eac956345f99dda3d68f4ae6cf7b494105e54780 upstream ]
The offsets of the EFI handover entrypoints are available to the assembler when constructing the header, so there is no need to set them from the build tool afterwards.
This change has no impact on the resulting bzImage binary.
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230915171623.655440-12-ardb@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/header.S | 18 ++++++++++++++- arch/x86/boot/tools/build.c | 24 -------------------- 2 files changed, 17 insertions(+), 25 deletions(-)
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S index becf39d8115c..34ab46b891e3 100644 --- a/arch/x86/boot/header.S +++ b/arch/x86/boot/header.S @@ -523,8 +523,24 @@ pref_address: .quad LOAD_PHYSICAL_ADDR # preferred load addr # define INIT_SIZE VO_INIT_SIZE #endif
+ .macro __handover_offset +#ifndef CONFIG_EFI_HANDOVER_PROTOCOL + .long 0 +#elif !defined(CONFIG_X86_64) + .long ZO_efi32_stub_entry +#else + /* Yes, this is really how we defined it :( */ + .long ZO_efi64_stub_entry - 0x200 +#ifdef CONFIG_EFI_MIXED + .if ZO_efi32_stub_entry != ZO_efi64_stub_entry - 0x200 + .error "32-bit and 64-bit EFI entry points do not match" + .endif +#endif +#endif + .endm + init_size: .long INIT_SIZE # kernel initialization size -handover_offset: .long 0 # Filled in by build.c +handover_offset: __handover_offset kernel_info_offset: .long ZO_kernel_info
# End of setup header ##################################################### diff --git a/arch/x86/boot/tools/build.c b/arch/x86/boot/tools/build.c index 14ef13fe7ab0..069497543164 100644 --- a/arch/x86/boot/tools/build.c +++ b/arch/x86/boot/tools/build.c @@ -55,8 +55,6 @@ u8 buf[SETUP_SECT_MAX*512]; #define PECOFF_COMPAT_RESERVE 0x0 #endif
-static unsigned long efi32_stub_entry; -static unsigned long efi64_stub_entry; static unsigned long efi_pe_entry; static unsigned long efi32_pe_entry; static unsigned long _end; @@ -265,31 +263,12 @@ static void efi_stub_defaults(void) #endif }
-static void efi_stub_entry_update(void) -{ - unsigned long addr = efi32_stub_entry; - -#ifdef CONFIG_EFI_HANDOVER_PROTOCOL -#ifdef CONFIG_X86_64 - /* Yes, this is really how we defined it :( */ - addr = efi64_stub_entry - 0x200; -#endif - -#ifdef CONFIG_EFI_MIXED - if (efi32_stub_entry != addr) - die("32-bit and 64-bit EFI entry points do not match\n"); -#endif -#endif - put_unaligned_le32(addr, &buf[0x264]); -} - #else
static inline void update_pecoff_setup_and_reloc(unsigned int size) {} static inline void update_pecoff_text(unsigned int text_start, unsigned int file_sz) {} static inline void efi_stub_defaults(void) {} -static inline void efi_stub_entry_update(void) {}
static inline int reserve_pecoff_reloc_section(int c) { @@ -332,8 +311,6 @@ static void parse_zoffset(char *fname) p = (char *)buf;
while (p && *p) { - PARSE_ZOFS(p, efi32_stub_entry); - PARSE_ZOFS(p, efi64_stub_entry); PARSE_ZOFS(p, efi_pe_entry); PARSE_ZOFS(p, efi32_pe_entry); PARSE_ZOFS(p, _end); @@ -416,7 +393,6 @@ int main(int argc, char ** argv)
update_pecoff_text(setup_sectors * 512, i + (sys_size * 16));
- efi_stub_entry_update();
crc = partial_crc32(buf, i, crc); if (fwrite(buf, 1, i, dest) != i)
From: Ard Biesheuvel ardb@kernel.org
[ Commit 093ab258e3fb1d1d3afdfd4a69403d44ce90e360 upstream ]
The setup block contains the real mode startup code that is used when booting from a legacy BIOS, along with the boot_params/setup_data that is used by legacy x86 bootloaders to pass the command line and initial ramdisk parameters, among other things.
The setup block also contains the PE/COFF header of the entire combined image, which includes the compressed kernel image, the decompressor and the EFI stub.
This PE header describes the layout of the executable image in memory, and currently, the fact that the setup block precedes it makes it rather fiddly to get the right values into the right place in the final image.
Let's make things a bit easier by defining the setup_size in the linker script so it can be referenced from the asm code directly, rather than having to rely on the build tool to calculate it. For the time being, add 64 bytes of fixed padding for the .reloc and .compat sections - this will be removed in a subsequent patch after the PE/COFF header has been reorganized.
This change has no impact on the resulting bzImage binary when configured with CONFIG_EFI_MIXED=y.
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230915171623.655440-13-ardb@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/header.S | 2 +- arch/x86/boot/setup.ld | 4 ++++ arch/x86/boot/tools/build.c | 6 ------ 3 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S index 34ab46b891e3..6dddf469ca60 100644 --- a/arch/x86/boot/header.S +++ b/arch/x86/boot/header.S @@ -230,7 +230,7 @@ sentinel: .byte 0xff, 0xff /* Used to detect broken loaders */
.globl hdr hdr: -setup_sects: .byte 0 /* Filled in by build.c */ + .byte setup_sects - 1 root_flags: .word ROOT_RDONLY syssize: .long 0 /* Filled in by build.c */ ram_size: .word 0 /* Obsolete */ diff --git a/arch/x86/boot/setup.ld b/arch/x86/boot/setup.ld index b11c45b9e51e..9bd5c1ada599 100644 --- a/arch/x86/boot/setup.ld +++ b/arch/x86/boot/setup.ld @@ -39,6 +39,10 @@ SECTIONS .signature : { setup_sig = .; LONG(0x5a5aaa55) + + /* Reserve some extra space for the reloc and compat sections */ + setup_size = ALIGN(ABSOLUTE(.) + 64, 512); + setup_sects = ABSOLUTE(setup_size / 512); }
diff --git a/arch/x86/boot/tools/build.c b/arch/x86/boot/tools/build.c index 069497543164..745d64b6d930 100644 --- a/arch/x86/boot/tools/build.c +++ b/arch/x86/boot/tools/build.c @@ -48,12 +48,7 @@ typedef unsigned int u32; u8 buf[SETUP_SECT_MAX*512];
#define PECOFF_RELOC_RESERVE 0x20 - -#ifdef CONFIG_EFI_MIXED #define PECOFF_COMPAT_RESERVE 0x20 -#else -#define PECOFF_COMPAT_RESERVE 0x0 -#endif
static unsigned long efi_pe_entry; static unsigned long efi32_pe_entry; @@ -388,7 +383,6 @@ int main(int argc, char ** argv) #endif
/* Patch the setup code with the appropriate size parameters */ - buf[0x1f1] = setup_sectors-1; put_unaligned_le32(sys_size, &buf[0x1f4]);
update_pecoff_text(setup_sectors * 512, i + (sys_size * 16));
From: Ard Biesheuvel ardb@kernel.org
[ Commit aeb92067f6ae994b541d7f9752fe54ed3d108bcc upstream ]
Tweak the linker script so that the value of _edata represents the decompressor binary's file size rounded up to the appropriate alignment. This removes the need to calculate it in the build tool, and will make it easier to refer to the file size from the header directly in subsequent changes to the PE header layout.
While adding _edata to the sed regex that parses the compressed vmlinux's symbol list, tweak the regex a bit for conciseness.
This change has no impact on the resulting bzImage binary when configured with CONFIG_EFI_STUB=y.
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230915171623.655440-14-ardb@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/Makefile | 2 +- arch/x86/boot/compressed/vmlinux.lds.S | 3 ++ arch/x86/boot/header.S | 2 +- arch/x86/boot/tools/build.c | 30 +++++--------------- 4 files changed, 12 insertions(+), 25 deletions(-)
diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile index 10ea28469788..b92e00836f69 100644 --- a/arch/x86/boot/Makefile +++ b/arch/x86/boot/Makefile @@ -91,7 +91,7 @@ $(obj)/vmlinux.bin: $(obj)/compressed/vmlinux FORCE
SETUP_OBJS = $(addprefix $(obj)/,$(setup-y))
-sed-zoffset := -e 's/^([0-9a-fA-F]*) [a-zA-Z] (startup_32|efi32_stub_entry|efi64_stub_entry|efi_pe_entry|efi32_pe_entry|input_data|kernel_info|_end|_ehead|_text|z_.*)$$/#define ZO_\2 0x\1/p' +sed-zoffset := -e 's/^([0-9a-fA-F]*) [a-zA-Z] (startup_32|efi.._stub_entry|efi(32)?_pe_entry|input_data|kernel_info|_end|_ehead|_text|_edata|z_.*)$$/#define ZO_\2 0x\1/p'
quiet_cmd_zoffset = ZOFFSET $@ cmd_zoffset = $(NM) $< | sed -n $(sed-zoffset) > $@ diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S index 32892e81bf61..aa7e9848cd8f 100644 --- a/arch/x86/boot/compressed/vmlinux.lds.S +++ b/arch/x86/boot/compressed/vmlinux.lds.S @@ -46,6 +46,9 @@ SECTIONS _data = . ; *(.data) *(.data.*) + + /* Add 4 bytes of extra space for a CRC-32 checksum */ + . = ALIGN(. + 4, 0x20); _edata = . ; } . = ALIGN(L1_CACHE_BYTES); diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S index 6dddf469ca60..b43b55130855 100644 --- a/arch/x86/boot/header.S +++ b/arch/x86/boot/header.S @@ -232,7 +232,7 @@ sentinel: .byte 0xff, 0xff /* Used to detect broken loaders */ hdr: .byte setup_sects - 1 root_flags: .word ROOT_RDONLY -syssize: .long 0 /* Filled in by build.c */ +syssize: .long ZO__edata / 16 ram_size: .word 0 /* Obsolete */ vid_mode: .word SVGA_MODE root_dev: .word 0 /* Default to major/minor 0/0 */ diff --git a/arch/x86/boot/tools/build.c b/arch/x86/boot/tools/build.c index 745d64b6d930..e792c6c5a634 100644 --- a/arch/x86/boot/tools/build.c +++ b/arch/x86/boot/tools/build.c @@ -52,6 +52,7 @@ u8 buf[SETUP_SECT_MAX*512];
static unsigned long efi_pe_entry; static unsigned long efi32_pe_entry; +static unsigned long _edata; static unsigned long _end;
/*----------------------------------------------------------------------*/ @@ -308,6 +309,7 @@ static void parse_zoffset(char *fname) while (p && *p) { PARSE_ZOFS(p, efi_pe_entry); PARSE_ZOFS(p, efi32_pe_entry); + PARSE_ZOFS(p, _edata); PARSE_ZOFS(p, _end);
p = strchr(p, '\n'); @@ -320,7 +322,6 @@ int main(int argc, char ** argv) { unsigned int i, sz, setup_sectors; int c; - u32 sys_size; struct stat sb; FILE *file, *dest; int fd; @@ -368,24 +369,14 @@ int main(int argc, char ** argv) die("Unable to open `%s': %m", argv[2]); if (fstat(fd, &sb)) die("Unable to stat `%s': %m", argv[2]); - sz = sb.st_size; + if (_edata != sb.st_size) + die("Unexpected file size `%s': %u != %u", argv[2], _edata, + sb.st_size); + sz = _edata - 4; kernel = mmap(NULL, sz, PROT_READ, MAP_SHARED, fd, 0); if (kernel == MAP_FAILED) die("Unable to mmap '%s': %m", argv[2]); - /* Number of 16-byte paragraphs, including space for a 4-byte CRC */ - sys_size = (sz + 15 + 4) / 16; -#ifdef CONFIG_EFI_STUB - /* - * COFF requires minimum 32-byte alignment of sections, and - * adding a signature is problematic without that alignment. - */ - sys_size = (sys_size + 1) & ~1; -#endif - - /* Patch the setup code with the appropriate size parameters */ - put_unaligned_le32(sys_size, &buf[0x1f4]); - - update_pecoff_text(setup_sectors * 512, i + (sys_size * 16)); + update_pecoff_text(setup_sectors * 512, i + _edata);
crc = partial_crc32(buf, i, crc); @@ -397,13 +388,6 @@ int main(int argc, char ** argv) if (fwrite(kernel, 1, sz, dest) != sz) die("Writing kernel failed");
- /* Add padding leaving 4 bytes for the checksum */ - while (sz++ < (sys_size*16) - 4) { - crc = partial_crc32_one('\0', crc); - if (fwrite("\0", 1, 1, dest) != 1) - die("Writing padding failed"); - } - /* Write the CRC */ put_unaligned_le32(crc, buf); if (fwrite(buf, 1, 4, dest) != 4)
From: Ard Biesheuvel ardb@kernel.org
[ Commit efa089e63b56bdc5eca754b995cb039dd7a5457e upstream ]
Now that the size of the setup block is visible to the assembler, it is possible to populate the PE/COFF header fields from the asm code directly, instead of poking the values into the binary using the build tool. This will make it easier to reorganize the section layout without having to tweak the build tool in lockstep.
This change has no impact on the resulting bzImage binary.
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230915171623.655440-15-ardb@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/header.S | 22 +++------ arch/x86/boot/tools/build.c | 47 -------------------- 2 files changed, 7 insertions(+), 62 deletions(-)
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S index b43b55130855..f8f609fb8709 100644 --- a/arch/x86/boot/header.S +++ b/arch/x86/boot/header.S @@ -74,14 +74,12 @@ optional_header: .byte 0x02 # MajorLinkerVersion .byte 0x14 # MinorLinkerVersion
- # Filled in by build.c - .long 0 # SizeOfCode + .long setup_size + ZO__end - 0x200 # SizeOfCode
.long 0 # SizeOfInitializedData .long 0 # SizeOfUninitializedData
- # Filled in by build.c - .long 0x0000 # AddressOfEntryPoint + .long setup_size + ZO_efi_pe_entry # AddressOfEntryPoint
.long 0x0200 # BaseOfCode #ifdef CONFIG_X86_32 @@ -104,10 +102,7 @@ extra_header_fields: .word 0 # MinorSubsystemVersion .long 0 # Win32VersionValue
- # - # The size of the bzImage is written in tools/build.c - # - .long 0 # SizeOfImage + .long setup_size + ZO__end # SizeOfImage
.long 0x200 # SizeOfHeaders .long 0 # CheckSum @@ -198,18 +193,15 @@ section_table: IMAGE_SCN_MEM_DISCARDABLE # Characteristics #endif
- # - # The offset & size fields are filled in by build.c. - # .ascii ".text" .byte 0 .byte 0 .byte 0 - .long 0 - .long 0x0 # startup_{32,64} - .long 0 # Size of initialized data + .long ZO__end + .long setup_size + .long ZO__edata # Size of initialized data # on disk - .long 0x0 # startup_{32,64} + .long setup_size .long 0 # PointerToRelocations .long 0 # PointerToLineNumbers .word 0 # NumberOfRelocations diff --git a/arch/x86/boot/tools/build.c b/arch/x86/boot/tools/build.c index e792c6c5a634..9712f27e32c1 100644 --- a/arch/x86/boot/tools/build.c +++ b/arch/x86/boot/tools/build.c @@ -50,10 +50,8 @@ u8 buf[SETUP_SECT_MAX*512]; #define PECOFF_RELOC_RESERVE 0x20 #define PECOFF_COMPAT_RESERVE 0x20
-static unsigned long efi_pe_entry; static unsigned long efi32_pe_entry; static unsigned long _edata; -static unsigned long _end;
/*----------------------------------------------------------------------*/
@@ -216,32 +214,6 @@ static void update_pecoff_setup_and_reloc(unsigned int size) #endif }
-static void update_pecoff_text(unsigned int text_start, unsigned int file_sz) -{ - unsigned int pe_header; - unsigned int text_sz = file_sz - text_start; - unsigned int bss_sz = _end - text_sz; - - pe_header = get_unaligned_le32(&buf[0x3c]); - - /* - * Size of code: Subtract the size of the first sector (512 bytes) - * which includes the header. - */ - put_unaligned_le32(file_sz - 512 + bss_sz, &buf[pe_header + 0x1c]); - - /* Size of image */ - put_unaligned_le32(file_sz + bss_sz, &buf[pe_header + 0x50]); - - /* - * Address of entry point for PE/COFF executable - */ - put_unaligned_le32(text_start + efi_pe_entry, &buf[pe_header + 0x28]); - - update_pecoff_section_header_fields(".text", text_start, text_sz + bss_sz, - text_sz, text_start); -} - static int reserve_pecoff_reloc_section(int c) { /* Reserve 0x20 bytes for .reloc section */ @@ -249,22 +221,9 @@ static int reserve_pecoff_reloc_section(int c) return PECOFF_RELOC_RESERVE; }
-static void efi_stub_defaults(void) -{ - /* Defaults for old kernel */ -#ifdef CONFIG_X86_32 - efi_pe_entry = 0x10; -#else - efi_pe_entry = 0x210; -#endif -} - #else
static inline void update_pecoff_setup_and_reloc(unsigned int size) {} -static inline void update_pecoff_text(unsigned int text_start, - unsigned int file_sz) {} -static inline void efi_stub_defaults(void) {}
static inline int reserve_pecoff_reloc_section(int c) { @@ -307,10 +266,8 @@ static void parse_zoffset(char *fname) p = (char *)buf;
while (p && *p) { - PARSE_ZOFS(p, efi_pe_entry); PARSE_ZOFS(p, efi32_pe_entry); PARSE_ZOFS(p, _edata); - PARSE_ZOFS(p, _end);
p = strchr(p, '\n'); while (p && (*p == '\r' || *p == '\n')) @@ -328,8 +285,6 @@ int main(int argc, char ** argv) void *kernel; u32 crc = 0xffffffffUL;
- efi_stub_defaults(); - if (argc != 5) usage(); parse_zoffset(argv[3]); @@ -376,8 +331,6 @@ int main(int argc, char ** argv) kernel = mmap(NULL, sz, PROT_READ, MAP_SHARED, fd, 0); if (kernel == MAP_FAILED) die("Unable to mmap '%s': %m", argv[2]); - update_pecoff_text(setup_sectors * 512, i + _edata); -
crc = partial_crc32(buf, i, crc); if (fwrite(buf, 1, i, dest) != i)
From: Ard Biesheuvel ardb@kernel.org
[ Commit fa5750521e0a4efbc1af05223da9c4bbd6c21c83 upstream ]
Ancient buggy EFI loaders may have required a .reloc section to be present at some point in time, but this has not been true for a long time so the .reloc section can just be dropped.
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230915171623.655440-16-ardb@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/header.S | 20 ------------ arch/x86/boot/setup.ld | 4 +-- arch/x86/boot/tools/build.c | 34 +++----------------- 3 files changed, 7 insertions(+), 51 deletions(-)
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S index f8f609fb8709..a01e55ce506f 100644 --- a/arch/x86/boot/header.S +++ b/arch/x86/boot/header.S @@ -154,26 +154,6 @@ section_table: IMAGE_SCN_MEM_READ | \ IMAGE_SCN_MEM_EXECUTE # Characteristics
- # - # The EFI application loader requires a relocation section - # because EFI applications must be relocatable. The .reloc - # offset & size fields are filled in by build.c. - # - .ascii ".reloc" - .byte 0 - .byte 0 - .long 0 - .long 0 - .long 0 # SizeOfRawData - .long 0 # PointerToRawData - .long 0 # PointerToRelocations - .long 0 # PointerToLineNumbers - .word 0 # NumberOfRelocations - .word 0 # NumberOfLineNumbers - .long IMAGE_SCN_CNT_INITIALIZED_DATA | \ - IMAGE_SCN_MEM_READ | \ - IMAGE_SCN_MEM_DISCARDABLE # Characteristics - #ifdef CONFIG_EFI_MIXED # # The offset & size fields are filled in by build.c. diff --git a/arch/x86/boot/setup.ld b/arch/x86/boot/setup.ld index 9bd5c1ada599..6d389499565c 100644 --- a/arch/x86/boot/setup.ld +++ b/arch/x86/boot/setup.ld @@ -40,8 +40,8 @@ SECTIONS setup_sig = .; LONG(0x5a5aaa55)
- /* Reserve some extra space for the reloc and compat sections */ - setup_size = ALIGN(ABSOLUTE(.) + 64, 512); + /* Reserve some extra space for the compat section */ + setup_size = ALIGN(ABSOLUTE(.) + 32, 512); setup_sects = ABSOLUTE(setup_size / 512); }
diff --git a/arch/x86/boot/tools/build.c b/arch/x86/boot/tools/build.c index 9712f27e32c1..faccff9743a3 100644 --- a/arch/x86/boot/tools/build.c +++ b/arch/x86/boot/tools/build.c @@ -47,7 +47,6 @@ typedef unsigned int u32; /* This must be large enough to hold the entire setup */ u8 buf[SETUP_SECT_MAX*512];
-#define PECOFF_RELOC_RESERVE 0x20 #define PECOFF_COMPAT_RESERVE 0x20
static unsigned long efi32_pe_entry; @@ -180,24 +179,13 @@ static void update_pecoff_section_header(char *section_name, u32 offset, u32 siz update_pecoff_section_header_fields(section_name, offset, size, size, offset); }
-static void update_pecoff_setup_and_reloc(unsigned int size) +static void update_pecoff_setup(unsigned int size) { u32 setup_offset = 0x200; - u32 reloc_offset = size - PECOFF_RELOC_RESERVE - PECOFF_COMPAT_RESERVE; -#ifdef CONFIG_EFI_MIXED - u32 compat_offset = reloc_offset + PECOFF_RELOC_RESERVE; -#endif - u32 setup_size = reloc_offset - setup_offset; + u32 compat_offset = size - PECOFF_COMPAT_RESERVE; + u32 setup_size = compat_offset - setup_offset;
update_pecoff_section_header(".setup", setup_offset, setup_size); - update_pecoff_section_header(".reloc", reloc_offset, PECOFF_RELOC_RESERVE); - - /* - * Modify .reloc section contents with a single entry. The - * relocation is applied to offset 10 of the relocation section. - */ - put_unaligned_le32(reloc_offset + 10, &buf[reloc_offset]); - put_unaligned_le32(10, &buf[reloc_offset + 4]);
#ifdef CONFIG_EFI_MIXED update_pecoff_section_header(".compat", compat_offset, PECOFF_COMPAT_RESERVE); @@ -214,21 +202,10 @@ static void update_pecoff_setup_and_reloc(unsigned int size) #endif }
-static int reserve_pecoff_reloc_section(int c) -{ - /* Reserve 0x20 bytes for .reloc section */ - memset(buf+c, 0, PECOFF_RELOC_RESERVE); - return PECOFF_RELOC_RESERVE; -} - #else
-static inline void update_pecoff_setup_and_reloc(unsigned int size) {} +static inline void update_pecoff_setup(unsigned int size) {}
-static inline int reserve_pecoff_reloc_section(int c) -{ - return 0; -} #endif /* CONFIG_EFI_STUB */
static int reserve_pecoff_compat_section(int c) @@ -307,7 +284,6 @@ int main(int argc, char ** argv) fclose(file);
c += reserve_pecoff_compat_section(c); - c += reserve_pecoff_reloc_section(c);
/* Pad unused space with zeros */ setup_sectors = (c + 511) / 512; @@ -316,7 +292,7 @@ int main(int argc, char ** argv) i = setup_sectors*512; memset(buf+c, 0, i-c);
- update_pecoff_setup_and_reloc(i); + update_pecoff_setup(i);
/* Open and stat the kernel file */ fd = open(argv[2], O_RDONLY);
From: Ard Biesheuvel ardb@kernel.org
[ Commit 34951f3c28bdf6481d949a20413b2ce7693687b2 upstream ]
Describe the code and data of the decompressor binary using separate .text and .data PE/COFF sections, so that we will be able to map them using restricted permissions once we increase the section and file alignment sufficiently. This avoids the need for memory mappings that are writable and executable at the same time, which is something that is best avoided for security reasons.
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230915171623.655440-17-ardb@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/Makefile | 2 +- arch/x86/boot/header.S | 19 +++++++++++++++---- 2 files changed, 16 insertions(+), 5 deletions(-)
diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile index b92e00836f69..3c1b3520361c 100644 --- a/arch/x86/boot/Makefile +++ b/arch/x86/boot/Makefile @@ -91,7 +91,7 @@ $(obj)/vmlinux.bin: $(obj)/compressed/vmlinux FORCE
SETUP_OBJS = $(addprefix $(obj)/,$(setup-y))
-sed-zoffset := -e 's/^([0-9a-fA-F]*) [a-zA-Z] (startup_32|efi.._stub_entry|efi(32)?_pe_entry|input_data|kernel_info|_end|_ehead|_text|_edata|z_.*)$$/#define ZO_\2 0x\1/p' +sed-zoffset := -e 's/^([0-9a-fA-F]*) [a-zA-Z] (startup_32|efi.._stub_entry|efi(32)?_pe_entry|input_data|kernel_info|_end|_ehead|_text|_e?data|z_.*)$$/#define ZO_\2 0x\1/p'
quiet_cmd_zoffset = ZOFFSET $@ cmd_zoffset = $(NM) $< | sed -n $(sed-zoffset) > $@ diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S index a01e55ce506f..178252cdccf5 100644 --- a/arch/x86/boot/header.S +++ b/arch/x86/boot/header.S @@ -74,9 +74,9 @@ optional_header: .byte 0x02 # MajorLinkerVersion .byte 0x14 # MinorLinkerVersion
- .long setup_size + ZO__end - 0x200 # SizeOfCode + .long ZO__data # SizeOfCode
- .long 0 # SizeOfInitializedData + .long ZO__end - ZO__data # SizeOfInitializedData .long 0 # SizeOfUninitializedData
.long setup_size + ZO_efi_pe_entry # AddressOfEntryPoint @@ -177,9 +177,9 @@ section_table: .byte 0 .byte 0 .byte 0 - .long ZO__end + .long ZO__data .long setup_size - .long ZO__edata # Size of initialized data + .long ZO__data # Size of initialized data # on disk .long setup_size .long 0 # PointerToRelocations @@ -190,6 +190,17 @@ section_table: IMAGE_SCN_MEM_READ | \ IMAGE_SCN_MEM_EXECUTE # Characteristics
+ .ascii ".data\0\0\0" + .long ZO__end - ZO__data # VirtualSize + .long setup_size + ZO__data # VirtualAddress + .long ZO__edata - ZO__data # SizeOfRawData + .long setup_size + ZO__data # PointerToRawData + + .long 0, 0, 0 + .long IMAGE_SCN_CNT_INITIALIZED_DATA | \ + IMAGE_SCN_MEM_READ | \ + IMAGE_SCN_MEM_WRITE # Characteristics + .set section_count, (. - section_table) / 40 #endif /* CONFIG_EFI_STUB */
From: Ard Biesheuvel ardb@kernel.org
[ Commit 3e3eabe26dc88692d34cf76ca0e0dd331481cc15 upstream ]
Align x86 with other EFI architectures, and increase the section alignment to the EFI page size (4k), so that firmware is able to honour the section permission attributes and map code read-only and data non-executable.
There are a number of requirements that have to be taken into account: - the sign tools get cranky when there are gaps between sections in the file view of the image - the virtual offset of each section must be aligned to the image's section alignment - the file offset *and size* of each section must be aligned to the image's file alignment - the image size must be aligned to the section alignment - each section's virtual offset must be greater than or equal to the size of the headers.
In order to meet all these requirements, while avoiding the need for lots of padding to accommodate the .compat section, the latter is placed at an arbitrary offset towards the end of the image, but aligned to the minimum file alignment (512 bytes). The space before the .text section is therefore distributed between the PE header, the .setup section and the .compat section, leaving no gaps in the file coverage, making the signing tools happy.
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230915171623.655440-18-ardb@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/compressed/vmlinux.lds.S | 4 +- arch/x86/boot/header.S | 75 +++++++++------- arch/x86/boot/setup.ld | 7 +- arch/x86/boot/tools/build.c | 90 +------------------- 4 files changed, 51 insertions(+), 125 deletions(-)
diff --git a/arch/x86/boot/compressed/vmlinux.lds.S b/arch/x86/boot/compressed/vmlinux.lds.S index aa7e9848cd8f..bcf0e4e4c98e 100644 --- a/arch/x86/boot/compressed/vmlinux.lds.S +++ b/arch/x86/boot/compressed/vmlinux.lds.S @@ -42,13 +42,13 @@ SECTIONS *(.rodata.*) _erodata = . ; } - .data : { + .data : ALIGN(0x1000) { _data = . ; *(.data) *(.data.*)
/* Add 4 bytes of extra space for a CRC-32 checksum */ - . = ALIGN(. + 4, 0x20); + . = ALIGN(. + 4, 0x200); _edata = . ; } . = ALIGN(L1_CACHE_BYTES); diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S index 178252cdccf5..6264bbf54fbc 100644 --- a/arch/x86/boot/header.S +++ b/arch/x86/boot/header.S @@ -36,6 +36,9 @@ SYSSEG = 0x1000 /* historical load address >> 4 */ #define ROOT_RDONLY 1 #endif
+ .set salign, 0x1000 + .set falign, 0x200 + .code16 .section ".bstext", "ax" #ifdef CONFIG_EFI_STUB @@ -81,7 +84,7 @@ optional_header:
.long setup_size + ZO_efi_pe_entry # AddressOfEntryPoint
- .long 0x0200 # BaseOfCode + .long setup_size # BaseOfCode #ifdef CONFIG_X86_32 .long 0 # data #endif @@ -92,8 +95,8 @@ extra_header_fields: #else .quad 0 # ImageBase #endif - .long 0x20 # SectionAlignment - .long 0x20 # FileAlignment + .long salign # SectionAlignment + .long falign # FileAlignment .word 0 # MajorOperatingSystemVersion .word 0 # MinorOperatingSystemVersion .word LINUX_EFISTUB_MAJOR_VERSION # MajorImageVersion @@ -102,9 +105,10 @@ extra_header_fields: .word 0 # MinorSubsystemVersion .long 0 # Win32VersionValue
- .long setup_size + ZO__end # SizeOfImage + .long setup_size + ZO__end + pecompat_vsize + # SizeOfImage
- .long 0x200 # SizeOfHeaders + .long salign # SizeOfHeaders .long 0 # CheckSum .word IMAGE_SUBSYSTEM_EFI_APPLICATION # Subsystem (EFI application) #ifdef CONFIG_EFI_DXE_MEM_ATTRIBUTES @@ -135,44 +139,51 @@ extra_header_fields:
# Section table section_table: - # - # The offset & size fields are filled in by build.c. - # .ascii ".setup" .byte 0 .byte 0 - .long 0 - .long 0x0 # startup_{32,64} - .long 0 # Size of initialized data - # on disk - .long 0x0 # startup_{32,64} - .long 0 # PointerToRelocations - .long 0 # PointerToLineNumbers - .word 0 # NumberOfRelocations - .word 0 # NumberOfLineNumbers - .long IMAGE_SCN_CNT_CODE | \ + .long setup_size - salign # VirtualSize + .long salign # VirtualAddress + .long pecompat_fstart - salign # SizeOfRawData + .long salign # PointerToRawData + + .long 0, 0, 0 + .long IMAGE_SCN_CNT_INITIALIZED_DATA | \ IMAGE_SCN_MEM_READ | \ - IMAGE_SCN_MEM_EXECUTE # Characteristics + IMAGE_SCN_MEM_DISCARDABLE # Characteristics
#ifdef CONFIG_EFI_MIXED - # - # The offset & size fields are filled in by build.c. - # .asciz ".compat" - .long 0 - .long 0x0 - .long 0 # Size of initialized data - # on disk - .long 0x0 - .long 0 # PointerToRelocations - .long 0 # PointerToLineNumbers - .word 0 # NumberOfRelocations - .word 0 # NumberOfLineNumbers + + .long 8 # VirtualSize + .long setup_size + ZO__end # VirtualAddress + .long pecompat_fsize # SizeOfRawData + .long pecompat_fstart # PointerToRawData + + .long 0, 0, 0 .long IMAGE_SCN_CNT_INITIALIZED_DATA | \ IMAGE_SCN_MEM_READ | \ IMAGE_SCN_MEM_DISCARDABLE # Characteristics -#endif
+ /* + * Put the IA-32 machine type and the associated entry point address in + * the .compat section, so loaders can figure out which other execution + * modes this image supports. + */ + .pushsection ".pecompat", "a", @progbits + .balign falign + .set pecompat_vsize, salign + .globl pecompat_fstart +pecompat_fstart: + .byte 0x1 # Version + .byte 8 # Size + .word IMAGE_FILE_MACHINE_I386 # PE machine type + .long setup_size + ZO_efi32_pe_entry # Entrypoint + .popsection +#else + .set pecompat_vsize, 0 + .set pecompat_fstart, setup_size +#endif .ascii ".text" .byte 0 .byte 0 diff --git a/arch/x86/boot/setup.ld b/arch/x86/boot/setup.ld index 6d389499565c..83bb7efad8ae 100644 --- a/arch/x86/boot/setup.ld +++ b/arch/x86/boot/setup.ld @@ -36,16 +36,17 @@ SECTIONS . = ALIGN(16); .data : { *(.data*) }
+ .pecompat : { *(.pecompat) } + PROVIDE(pecompat_fsize = setup_size - pecompat_fstart); + .signature : { setup_sig = .; LONG(0x5a5aaa55)
- /* Reserve some extra space for the compat section */ - setup_size = ALIGN(ABSOLUTE(.) + 32, 512); + setup_size = ALIGN(ABSOLUTE(.), 4096); setup_sects = ABSOLUTE(setup_size / 512); }
- . = ALIGN(16); .bss : { diff --git a/arch/x86/boot/tools/build.c b/arch/x86/boot/tools/build.c index faccff9743a3..10311d77c67f 100644 --- a/arch/x86/boot/tools/build.c +++ b/arch/x86/boot/tools/build.c @@ -47,9 +47,6 @@ typedef unsigned int u32; /* This must be large enough to hold the entire setup */ u8 buf[SETUP_SECT_MAX*512];
-#define PECOFF_COMPAT_RESERVE 0x20 - -static unsigned long efi32_pe_entry; static unsigned long _edata;
/*----------------------------------------------------------------------*/ @@ -136,85 +133,6 @@ static void usage(void) die("Usage: build setup system zoffset.h image"); }
-#ifdef CONFIG_EFI_STUB - -static void update_pecoff_section_header_fields(char *section_name, u32 vma, u32 size, u32 datasz, u32 offset) -{ - unsigned int pe_header; - unsigned short num_sections; - u8 *section; - - pe_header = get_unaligned_le32(&buf[0x3c]); - num_sections = get_unaligned_le16(&buf[pe_header + 6]); - -#ifdef CONFIG_X86_32 - section = &buf[pe_header + 0xa8]; -#else - section = &buf[pe_header + 0xb8]; -#endif - - while (num_sections > 0) { - if (strncmp((char*)section, section_name, 8) == 0) { - /* section header size field */ - put_unaligned_le32(size, section + 0x8); - - /* section header vma field */ - put_unaligned_le32(vma, section + 0xc); - - /* section header 'size of initialised data' field */ - put_unaligned_le32(datasz, section + 0x10); - - /* section header 'file offset' field */ - put_unaligned_le32(offset, section + 0x14); - - break; - } - section += 0x28; - num_sections--; - } -} - -static void update_pecoff_section_header(char *section_name, u32 offset, u32 size) -{ - update_pecoff_section_header_fields(section_name, offset, size, size, offset); -} - -static void update_pecoff_setup(unsigned int size) -{ - u32 setup_offset = 0x200; - u32 compat_offset = size - PECOFF_COMPAT_RESERVE; - u32 setup_size = compat_offset - setup_offset; - - update_pecoff_section_header(".setup", setup_offset, setup_size); - -#ifdef CONFIG_EFI_MIXED - update_pecoff_section_header(".compat", compat_offset, PECOFF_COMPAT_RESERVE); - - /* - * Put the IA-32 machine type (0x14c) and the associated entry point - * address in the .compat section, so loaders can figure out which other - * execution modes this image supports. - */ - buf[compat_offset] = 0x1; - buf[compat_offset + 1] = 0x8; - put_unaligned_le16(0x14c, &buf[compat_offset + 2]); - put_unaligned_le32(efi32_pe_entry + size, &buf[compat_offset + 4]); -#endif -} - -#else - -static inline void update_pecoff_setup(unsigned int size) {} - -#endif /* CONFIG_EFI_STUB */ - -static int reserve_pecoff_compat_section(int c) -{ - /* Reserve 0x20 bytes for .compat section */ - memset(buf+c, 0, PECOFF_COMPAT_RESERVE); - return PECOFF_COMPAT_RESERVE; -} - /* * Parse zoffset.h and find the entry points. We could just #include zoffset.h * but that would mean tools/build would have to be rebuilt every time. It's @@ -243,7 +161,6 @@ static void parse_zoffset(char *fname) p = (char *)buf;
while (p && *p) { - PARSE_ZOFS(p, efi32_pe_entry); PARSE_ZOFS(p, _edata);
p = strchr(p, '\n'); @@ -283,17 +200,14 @@ int main(int argc, char ** argv) die("Boot block hasn't got boot flag (0xAA55)"); fclose(file);
- c += reserve_pecoff_compat_section(c); - /* Pad unused space with zeros */ - setup_sectors = (c + 511) / 512; + setup_sectors = (c + 4095) / 4096; + setup_sectors *= 8; if (setup_sectors < SETUP_SECT_MIN) setup_sectors = SETUP_SECT_MIN; i = setup_sectors*512; memset(buf+c, 0, i-c);
- update_pecoff_setup(i); - /* Open and stat the kernel file */ fd = open(argv[2], O_RDONLY); if (fd < 0)
From: Ard Biesheuvel ardb@kernel.org
[ Commit 1ad55cecf22f05f1c884adf63cc09d3c3e609ebf upstream ]
The .compat section is a dummy PE section that contains the address of the 32-bit entrypoint of the 64-bit kernel image if it is bootable from 32-bit firmware (i.e., CONFIG_EFI_MIXED=y)
This section is only 8 bytes in size and is only referenced from the loader, and so it is placed at the end of the memory view of the image, to avoid the need for padding it to 4k, which is required for sections appearing in the middle of the image.
Unfortunately, this violates the PE/COFF spec, and even if most EFI loaders will work correctly (including the Tianocore reference implementation), PE loaders do exist that reject such images, on the basis that both the file and memory views of the file contents should be described by the section headers in a monotonically increasing manner without leaving any gaps.
So reorganize the sections to avoid this issue. This results in a slight padding overhead (< 4k) which can be avoided if desired by disabling CONFIG_EFI_MIXED (which is only needed in rare cases these days)
Fixes: 3e3eabe26dc8 ("x86/boot: Increase section and file alignment to 4k/512") Reported-by: Mike Beaton mjsbeaton@gmail.com Link: https://lkml.kernel.org/r/CAHzAAWQ6srV6LVNdmfbJhOwhBw5ZzxxZZ07aHt9oKkfYAdvuQ... Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/header.S | 14 ++++++-------- arch/x86/boot/setup.ld | 6 +++--- 2 files changed, 9 insertions(+), 11 deletions(-)
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S index 6264bbf54fbc..7593339b529a 100644 --- a/arch/x86/boot/header.S +++ b/arch/x86/boot/header.S @@ -105,8 +105,7 @@ extra_header_fields: .word 0 # MinorSubsystemVersion .long 0 # Win32VersionValue
- .long setup_size + ZO__end + pecompat_vsize - # SizeOfImage + .long setup_size + ZO__end # SizeOfImage
.long salign # SizeOfHeaders .long 0 # CheckSum @@ -142,7 +141,7 @@ section_table: .ascii ".setup" .byte 0 .byte 0 - .long setup_size - salign # VirtualSize + .long pecompat_fstart - salign # VirtualSize .long salign # VirtualAddress .long pecompat_fstart - salign # SizeOfRawData .long salign # PointerToRawData @@ -155,8 +154,8 @@ section_table: #ifdef CONFIG_EFI_MIXED .asciz ".compat"
- .long 8 # VirtualSize - .long setup_size + ZO__end # VirtualAddress + .long pecompat_fsize # VirtualSize + .long pecompat_fstart # VirtualAddress .long pecompat_fsize # SizeOfRawData .long pecompat_fstart # PointerToRawData
@@ -171,17 +170,16 @@ section_table: * modes this image supports. */ .pushsection ".pecompat", "a", @progbits - .balign falign - .set pecompat_vsize, salign + .balign salign .globl pecompat_fstart pecompat_fstart: .byte 0x1 # Version .byte 8 # Size .word IMAGE_FILE_MACHINE_I386 # PE machine type .long setup_size + ZO_efi32_pe_entry # Entrypoint + .byte 0x0 # Sentinel .popsection #else - .set pecompat_vsize, 0 .set pecompat_fstart, setup_size #endif .ascii ".text" diff --git a/arch/x86/boot/setup.ld b/arch/x86/boot/setup.ld index 83bb7efad8ae..3a2d1360abb0 100644 --- a/arch/x86/boot/setup.ld +++ b/arch/x86/boot/setup.ld @@ -24,6 +24,9 @@ SECTIONS .text : { *(.text .text.*) } .text32 : { *(.text32) }
+ .pecompat : { *(.pecompat) } + PROVIDE(pecompat_fsize = setup_size - pecompat_fstart); + . = ALIGN(16); .rodata : { *(.rodata*) }
@@ -36,9 +39,6 @@ SECTIONS . = ALIGN(16); .data : { *(.data*) }
- .pecompat : { *(.pecompat) } - PROVIDE(pecompat_fsize = setup_size - pecompat_fstart); - .signature : { setup_sig = .; LONG(0x5a5aaa55)
From: Pasha Tatashin pasha.tatashin@soleen.com
[ Commit 82328227db8f0b9b5f77bb5afcd47e59d0e4d08f upstream ]
Other architectures and the common mm/ use P*D_MASK, and P*D_SIZE. Remove the duplicated P*D_PAGE_MASK and P*D_PAGE_SIZE which are only used in x86/*.
Signed-off-by: Pasha Tatashin pasha.tatashin@soleen.com Signed-off-by: Borislav Petkov bp@suse.de Reviewed-by: Anshuman Khandual anshuman.khandual@arm.com Acked-by: Mike Rapoport rppt@linux.ibm.com Link: https://lore.kernel.org/r/20220516185202.604654-1-tatashin@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/include/asm/page_types.h | 12 +++--------- arch/x86/kernel/amd_gart_64.c | 2 +- arch/x86/kernel/head64.c | 2 +- arch/x86/mm/mem_encrypt_boot.S | 4 ++-- arch/x86/mm/mem_encrypt_identity.c | 18 +++++++++--------- arch/x86/mm/pat/set_memory.c | 6 +++--- arch/x86/mm/pti.c | 2 +- 7 files changed, 20 insertions(+), 26 deletions(-)
diff --git a/arch/x86/include/asm/page_types.h b/arch/x86/include/asm/page_types.h index a506a411474d..86bd4311daf8 100644 --- a/arch/x86/include/asm/page_types.h +++ b/arch/x86/include/asm/page_types.h @@ -11,20 +11,14 @@ #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT) #define PAGE_MASK (~(PAGE_SIZE-1))
-#define PMD_PAGE_SIZE (_AC(1, UL) << PMD_SHIFT) -#define PMD_PAGE_MASK (~(PMD_PAGE_SIZE-1)) - -#define PUD_PAGE_SIZE (_AC(1, UL) << PUD_SHIFT) -#define PUD_PAGE_MASK (~(PUD_PAGE_SIZE-1)) - #define __VIRTUAL_MASK ((1UL << __VIRTUAL_MASK_SHIFT) - 1)
-/* Cast *PAGE_MASK to a signed type so that it is sign-extended if +/* Cast P*D_MASK to a signed type so that it is sign-extended if virtual addresses are 32-bits but physical addresses are larger (ie, 32-bit PAE). */ #define PHYSICAL_PAGE_MASK (((signed long)PAGE_MASK) & __PHYSICAL_MASK) -#define PHYSICAL_PMD_PAGE_MASK (((signed long)PMD_PAGE_MASK) & __PHYSICAL_MASK) -#define PHYSICAL_PUD_PAGE_MASK (((signed long)PUD_PAGE_MASK) & __PHYSICAL_MASK) +#define PHYSICAL_PMD_PAGE_MASK (((signed long)PMD_MASK) & __PHYSICAL_MASK) +#define PHYSICAL_PUD_PAGE_MASK (((signed long)PUD_MASK) & __PHYSICAL_MASK)
#define HPAGE_SHIFT PMD_SHIFT #define HPAGE_SIZE (_AC(1,UL) << HPAGE_SHIFT) diff --git a/arch/x86/kernel/amd_gart_64.c b/arch/x86/kernel/amd_gart_64.c index 19a0207e529f..56a917df410d 100644 --- a/arch/x86/kernel/amd_gart_64.c +++ b/arch/x86/kernel/amd_gart_64.c @@ -504,7 +504,7 @@ static __init unsigned long check_iommu_size(unsigned long aper, u64 aper_size) }
a = aper + iommu_size; - iommu_size -= round_up(a, PMD_PAGE_SIZE) - a; + iommu_size -= round_up(a, PMD_SIZE) - a;
if (iommu_size < 64*1024*1024) { pr_warn("PCI-DMA: Warning: Small IOMMU %luMB." diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c index 84adf12a76d3..34580e1a4cdb 100644 --- a/arch/x86/kernel/head64.c +++ b/arch/x86/kernel/head64.c @@ -203,7 +203,7 @@ unsigned long __head __startup_64(unsigned long physaddr, load_delta = physaddr - (unsigned long)(_text - __START_KERNEL_map);
/* Is the address not 2M aligned? */ - if (load_delta & ~PMD_PAGE_MASK) + if (load_delta & ~PMD_MASK) for (;;);
/* Include the SME encryption mask in the fixup value */ diff --git a/arch/x86/mm/mem_encrypt_boot.S b/arch/x86/mm/mem_encrypt_boot.S index 9de3d900bc92..e25288ee33c2 100644 --- a/arch/x86/mm/mem_encrypt_boot.S +++ b/arch/x86/mm/mem_encrypt_boot.S @@ -26,7 +26,7 @@ SYM_FUNC_START(sme_encrypt_execute) * RCX - virtual address of the encryption workarea, including: * - stack page (PAGE_SIZE) * - encryption routine page (PAGE_SIZE) - * - intermediate copy buffer (PMD_PAGE_SIZE) + * - intermediate copy buffer (PMD_SIZE) * R8 - physical address of the pagetables to use for encryption */
@@ -123,7 +123,7 @@ SYM_FUNC_START(__enc_copy) wbinvd /* Invalidate any cache entries */
/* Copy/encrypt up to 2MB at a time */ - movq $PMD_PAGE_SIZE, %r12 + movq $PMD_SIZE, %r12 1: cmpq %r12, %r9 jnb 2f diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c index 06ccbd36e8dc..f7dfc3f89921 100644 --- a/arch/x86/mm/mem_encrypt_identity.c +++ b/arch/x86/mm/mem_encrypt_identity.c @@ -93,7 +93,7 @@ struct sme_populate_pgd_data { * section is 2MB aligned to allow for simple pagetable setup using only * PMD entries (see vmlinux.lds.S). */ -static char sme_workarea[2 * PMD_PAGE_SIZE] __section(".init.scratch"); +static char sme_workarea[2 * PMD_SIZE] __section(".init.scratch");
static char sme_cmdline_arg[] __initdata = "mem_encrypt"; static char sme_cmdline_on[] __initdata = "on"; @@ -197,8 +197,8 @@ static void __init __sme_map_range_pmd(struct sme_populate_pgd_data *ppd) while (ppd->vaddr < ppd->vaddr_end) { sme_populate_pgd_large(ppd);
- ppd->vaddr += PMD_PAGE_SIZE; - ppd->paddr += PMD_PAGE_SIZE; + ppd->vaddr += PMD_SIZE; + ppd->paddr += PMD_SIZE; } }
@@ -224,11 +224,11 @@ static void __init __sme_map_range(struct sme_populate_pgd_data *ppd, vaddr_end = ppd->vaddr_end;
/* If start is not 2MB aligned, create PTE entries */ - ppd->vaddr_end = ALIGN(ppd->vaddr, PMD_PAGE_SIZE); + ppd->vaddr_end = ALIGN(ppd->vaddr, PMD_SIZE); __sme_map_range_pte(ppd);
/* Create PMD entries */ - ppd->vaddr_end = vaddr_end & PMD_PAGE_MASK; + ppd->vaddr_end = vaddr_end & PMD_MASK; __sme_map_range_pmd(ppd);
/* If end is not 2MB aligned, create PTE entries */ @@ -325,7 +325,7 @@ void __init sme_encrypt_kernel(struct boot_params *bp)
/* Physical addresses gives us the identity mapped virtual addresses */ kernel_start = __pa_symbol(_text); - kernel_end = ALIGN(__pa_symbol(_end), PMD_PAGE_SIZE); + kernel_end = ALIGN(__pa_symbol(_end), PMD_SIZE); kernel_len = kernel_end - kernel_start;
initrd_start = 0; @@ -355,12 +355,12 @@ void __init sme_encrypt_kernel(struct boot_params *bp) * executable encryption area size: * stack page (PAGE_SIZE) * encryption routine page (PAGE_SIZE) - * intermediate copy buffer (PMD_PAGE_SIZE) + * intermediate copy buffer (PMD_SIZE) * pagetable structures for the encryption of the kernel * pagetable structures for workarea (in case not currently mapped) */ execute_start = workarea_start; - execute_end = execute_start + (PAGE_SIZE * 2) + PMD_PAGE_SIZE; + execute_end = execute_start + (PAGE_SIZE * 2) + PMD_SIZE; execute_len = execute_end - execute_start;
/* @@ -383,7 +383,7 @@ void __init sme_encrypt_kernel(struct boot_params *bp) * before it is mapped. */ workarea_len = execute_len + pgtable_area_len; - workarea_end = ALIGN(workarea_start + workarea_len, PMD_PAGE_SIZE); + workarea_end = ALIGN(workarea_start + workarea_len, PMD_SIZE);
/* * Set the address to the start of where newly created pagetable diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c index 5f0ce77a259d..68d4f328f169 100644 --- a/arch/x86/mm/pat/set_memory.c +++ b/arch/x86/mm/pat/set_memory.c @@ -747,11 +747,11 @@ phys_addr_t slow_virt_to_phys(void *__virt_addr) switch (level) { case PG_LEVEL_1G: phys_addr = (phys_addr_t)pud_pfn(*(pud_t *)pte) << PAGE_SHIFT; - offset = virt_addr & ~PUD_PAGE_MASK; + offset = virt_addr & ~PUD_MASK; break; case PG_LEVEL_2M: phys_addr = (phys_addr_t)pmd_pfn(*(pmd_t *)pte) << PAGE_SHIFT; - offset = virt_addr & ~PMD_PAGE_MASK; + offset = virt_addr & ~PMD_MASK; break; default: phys_addr = (phys_addr_t)pte_pfn(*pte) << PAGE_SHIFT; @@ -1041,7 +1041,7 @@ __split_large_page(struct cpa_data *cpa, pte_t *kpte, unsigned long address, case PG_LEVEL_1G: ref_prot = pud_pgprot(*(pud_t *)kpte); ref_pfn = pud_pfn(*(pud_t *)kpte); - pfninc = PMD_PAGE_SIZE >> PAGE_SHIFT; + pfninc = PMD_SIZE >> PAGE_SHIFT; lpaddr = address & PUD_MASK; lpinc = PMD_SIZE; /* diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c index ffe3b3a087fe..78414c6d1b5e 100644 --- a/arch/x86/mm/pti.c +++ b/arch/x86/mm/pti.c @@ -592,7 +592,7 @@ static void pti_set_kernel_image_nonglobal(void) * of the image. */ unsigned long start = PFN_ALIGN(_text); - unsigned long end = ALIGN((unsigned long)_end, PMD_PAGE_SIZE); + unsigned long end = ALIGN((unsigned long)_end, PMD_SIZE);
/* * This clears _PAGE_GLOBAL from the entire kernel image.
From: Hou Wenlong houwenlong.hwl@antgroup.com
[ Commit 7f6874eddd81cb2ed784642a7a4321671e158ffe upstream ]
This function is currently only used in the head code and is only called from startup_64_setup_env(). Although it would be inlined by the compiler, it would be better to mark it as __head too in case it doesn't.
Signed-off-by: Hou Wenlong houwenlong.hwl@antgroup.com Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/efcc5b5e18af880e415d884e072bf651c1fa7c34.168913031... Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/kernel/head64.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c index 34580e1a4cdb..78f3f6756538 100644 --- a/arch/x86/kernel/head64.c +++ b/arch/x86/kernel/head64.c @@ -588,7 +588,7 @@ static void set_bringup_idt_handler(gate_desc *idt, int n, void *handler) }
/* This runs while still in the direct mapping */ -static void startup_64_load_idt(unsigned long physbase) +static void __head startup_64_load_idt(unsigned long physbase) { struct desc_ptr *desc = fixup_pointer(&bringup_idt_descr, physbase); gate_desc *idt = fixup_pointer(bringup_idt_table, physbase);
From: Hou Wenlong houwenlong.hwl@antgroup.com
[ Commit d2a285d65bfde3218fd0c3b88794d0135ced680b upstream ]
Move the __head section definition to a header to widen its use.
An upcoming patch will mark the code as __head in mem_encrypt_identity.c too.
Signed-off-by: Hou Wenlong houwenlong.hwl@antgroup.com Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/0583f57977be184689c373fe540cbd7d85ca2047.169752540... Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/include/asm/init.h | 2 ++ arch/x86/kernel/head64.c | 3 +-- 2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/init.h b/arch/x86/include/asm/init.h index 5f1d3c421f68..cc9ccf61b6bd 100644 --- a/arch/x86/include/asm/init.h +++ b/arch/x86/include/asm/init.h @@ -2,6 +2,8 @@ #ifndef _ASM_X86_INIT_H #define _ASM_X86_INIT_H
+#define __head __section(".head.text") + struct x86_mapping_info { void *(*alloc_pgt_page)(void *); /* allocate buf for page table */ void *context; /* context for alloc_pgt_page */ diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c index 78f3f6756538..4fae511b2e2b 100644 --- a/arch/x86/kernel/head64.c +++ b/arch/x86/kernel/head64.c @@ -41,6 +41,7 @@ #include <asm/trapnr.h> #include <asm/sev.h> #include <asm/tdx.h> +#include <asm/init.h>
/* * Manage page tables very early on. @@ -84,8 +85,6 @@ static struct desc_ptr startup_gdt_descr = { .address = 0, };
-#define __head __section(".head.text") - static void __head *fixup_pointer(void *ptr, unsigned long physaddr) { return ptr - (void *)_text + (void *)physaddr;
From: Ard Biesheuvel ardb@kernel.org
[ Commit 48204aba801f1b512b3abed10b8e1a63e03f3dd1 upstream ]
The .head.text section is the initial primary entrypoint of the core kernel, and is entered with the CPU executing from a 1:1 mapping of memory. Such code must never access global variables using absolute references, as these are based on the kernel virtual mapping which is not active yet at this point.
Given that the SME startup code is also called from this early execution context, move it into .head.text as well. This will allow more thorough build time checks in the future to ensure that early startup code only uses RIP-relative references to global variables.
Also replace some occurrences of __pa_symbol() [which relies on the compiler generating an absolute reference, which is not guaranteed] and an open coded RIP-relative access with RIP_REL_REF().
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Tested-by: Tom Lendacky thomas.lendacky@amd.com Link: https://lore.kernel.org/r/20240227151907.387873-18-ardb+git@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/include/asm/mem_encrypt.h | 8 ++-- arch/x86/mm/mem_encrypt_identity.c | 42 ++++++++------------ 2 files changed, 21 insertions(+), 29 deletions(-)
diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 41d06822bc8c..853f423b1d13 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -46,8 +46,8 @@ void __init sme_unmap_bootdata(char *real_mode_data); void __init sme_early_init(void); void __init sev_setup_arch(void);
-void __init sme_encrypt_kernel(struct boot_params *bp); -void __init sme_enable(struct boot_params *bp); +void sme_encrypt_kernel(struct boot_params *bp); +void sme_enable(struct boot_params *bp);
int __init early_set_memory_decrypted(unsigned long vaddr, unsigned long size); int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size); @@ -80,8 +80,8 @@ static inline void __init sme_unmap_bootdata(char *real_mode_data) { } static inline void __init sme_early_init(void) { } static inline void __init sev_setup_arch(void) { }
-static inline void __init sme_encrypt_kernel(struct boot_params *bp) { } -static inline void __init sme_enable(struct boot_params *bp) { } +static inline void sme_encrypt_kernel(struct boot_params *bp) { } +static inline void sme_enable(struct boot_params *bp) { }
static inline void sev_es_init_vc_handling(void) { }
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c index f7dfc3f89921..f17609884874 100644 --- a/arch/x86/mm/mem_encrypt_identity.c +++ b/arch/x86/mm/mem_encrypt_identity.c @@ -41,6 +41,7 @@ #include <linux/mem_encrypt.h> #include <linux/cc_platform.h>
+#include <asm/init.h> #include <asm/setup.h> #include <asm/sections.h> #include <asm/cmdline.h> @@ -98,7 +99,7 @@ static char sme_workarea[2 * PMD_SIZE] __section(".init.scratch"); static char sme_cmdline_arg[] __initdata = "mem_encrypt"; static char sme_cmdline_on[] __initdata = "on";
-static void __init sme_clear_pgd(struct sme_populate_pgd_data *ppd) +static void __head sme_clear_pgd(struct sme_populate_pgd_data *ppd) { unsigned long pgd_start, pgd_end, pgd_size; pgd_t *pgd_p; @@ -113,7 +114,7 @@ static void __init sme_clear_pgd(struct sme_populate_pgd_data *ppd) memset(pgd_p, 0, pgd_size); }
-static pud_t __init *sme_prepare_pgd(struct sme_populate_pgd_data *ppd) +static pud_t __head *sme_prepare_pgd(struct sme_populate_pgd_data *ppd) { pgd_t *pgd; p4d_t *p4d; @@ -150,7 +151,7 @@ static pud_t __init *sme_prepare_pgd(struct sme_populate_pgd_data *ppd) return pud; }
-static void __init sme_populate_pgd_large(struct sme_populate_pgd_data *ppd) +static void __head sme_populate_pgd_large(struct sme_populate_pgd_data *ppd) { pud_t *pud; pmd_t *pmd; @@ -166,7 +167,7 @@ static void __init sme_populate_pgd_large(struct sme_populate_pgd_data *ppd) set_pmd(pmd, __pmd(ppd->paddr | ppd->pmd_flags)); }
-static void __init sme_populate_pgd(struct sme_populate_pgd_data *ppd) +static void __head sme_populate_pgd(struct sme_populate_pgd_data *ppd) { pud_t *pud; pmd_t *pmd; @@ -192,7 +193,7 @@ static void __init sme_populate_pgd(struct sme_populate_pgd_data *ppd) set_pte(pte, __pte(ppd->paddr | ppd->pte_flags)); }
-static void __init __sme_map_range_pmd(struct sme_populate_pgd_data *ppd) +static void __head __sme_map_range_pmd(struct sme_populate_pgd_data *ppd) { while (ppd->vaddr < ppd->vaddr_end) { sme_populate_pgd_large(ppd); @@ -202,7 +203,7 @@ static void __init __sme_map_range_pmd(struct sme_populate_pgd_data *ppd) } }
-static void __init __sme_map_range_pte(struct sme_populate_pgd_data *ppd) +static void __head __sme_map_range_pte(struct sme_populate_pgd_data *ppd) { while (ppd->vaddr < ppd->vaddr_end) { sme_populate_pgd(ppd); @@ -212,7 +213,7 @@ static void __init __sme_map_range_pte(struct sme_populate_pgd_data *ppd) } }
-static void __init __sme_map_range(struct sme_populate_pgd_data *ppd, +static void __head __sme_map_range(struct sme_populate_pgd_data *ppd, pmdval_t pmd_flags, pteval_t pte_flags) { unsigned long vaddr_end; @@ -236,22 +237,22 @@ static void __init __sme_map_range(struct sme_populate_pgd_data *ppd, __sme_map_range_pte(ppd); }
-static void __init sme_map_range_encrypted(struct sme_populate_pgd_data *ppd) +static void __head sme_map_range_encrypted(struct sme_populate_pgd_data *ppd) { __sme_map_range(ppd, PMD_FLAGS_ENC, PTE_FLAGS_ENC); }
-static void __init sme_map_range_decrypted(struct sme_populate_pgd_data *ppd) +static void __head sme_map_range_decrypted(struct sme_populate_pgd_data *ppd) { __sme_map_range(ppd, PMD_FLAGS_DEC, PTE_FLAGS_DEC); }
-static void __init sme_map_range_decrypted_wp(struct sme_populate_pgd_data *ppd) +static void __head sme_map_range_decrypted_wp(struct sme_populate_pgd_data *ppd) { __sme_map_range(ppd, PMD_FLAGS_DEC_WP, PTE_FLAGS_DEC_WP); }
-static unsigned long __init sme_pgtable_calc(unsigned long len) +static unsigned long __head sme_pgtable_calc(unsigned long len) { unsigned long entries = 0, tables = 0;
@@ -288,7 +289,7 @@ static unsigned long __init sme_pgtable_calc(unsigned long len) return entries + tables; }
-void __init sme_encrypt_kernel(struct boot_params *bp) +void __head sme_encrypt_kernel(struct boot_params *bp) { unsigned long workarea_start, workarea_end, workarea_len; unsigned long execute_start, execute_end, execute_len; @@ -323,9 +324,8 @@ void __init sme_encrypt_kernel(struct boot_params *bp) * memory from being cached. */
- /* Physical addresses gives us the identity mapped virtual addresses */ - kernel_start = __pa_symbol(_text); - kernel_end = ALIGN(__pa_symbol(_end), PMD_SIZE); + kernel_start = (unsigned long)RIP_REL_REF(_text); + kernel_end = ALIGN((unsigned long)RIP_REL_REF(_end), PMD_SIZE); kernel_len = kernel_end - kernel_start;
initrd_start = 0; @@ -342,14 +342,6 @@ void __init sme_encrypt_kernel(struct boot_params *bp) } #endif
- /* - * We're running identity mapped, so we must obtain the address to the - * SME encryption workarea using rip-relative addressing. - */ - asm ("lea sme_workarea(%%rip), %0" - : "=r" (workarea_start) - : "p" (sme_workarea)); - /* * Calculate required number of workarea bytes needed: * executable encryption area size: @@ -359,7 +351,7 @@ void __init sme_encrypt_kernel(struct boot_params *bp) * pagetable structures for the encryption of the kernel * pagetable structures for workarea (in case not currently mapped) */ - execute_start = workarea_start; + execute_start = workarea_start = (unsigned long)RIP_REL_REF(sme_workarea); execute_end = execute_start + (PAGE_SIZE * 2) + PMD_SIZE; execute_len = execute_end - execute_start;
@@ -502,7 +494,7 @@ void __init sme_encrypt_kernel(struct boot_params *bp) native_write_cr3(__native_read_cr3()); }
-void __init sme_enable(struct boot_params *bp) +void __head sme_enable(struct boot_params *bp) { const char *cmdline_ptr, *cmdline_arg, *cmdline_on; unsigned int eax, ebx, ecx, edx;
From: Ard Biesheuvel ardb@kernel.org
[ Commit 428080c9b19bfda37c478cd626dbd3851db1aff9 upstream ]
In preparation for implementing rigorous build time checks to enforce that only code that can support it will be called from the early 1:1 mapping of memory, move SEV init code that is called in this manner to the .head.text section.
Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Tested-by: Tom Lendacky thomas.lendacky@amd.com Link: https://lore.kernel.org/r/20240227151907.387873-19-ardb+git@google.com Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/compressed/sev.c | 3 +++ arch/x86/include/asm/sev.h | 10 ++++----- arch/x86/kernel/sev-shared.c | 23 +++++++++----------- arch/x86/kernel/sev.c | 11 +++++----- 4 files changed, 24 insertions(+), 23 deletions(-)
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c index d07e665bb265..3c5d5c97f8f7 100644 --- a/arch/x86/boot/compressed/sev.c +++ b/arch/x86/boot/compressed/sev.c @@ -118,6 +118,9 @@ static bool fault_in_kernel_space(unsigned long address) #define __init #define __pa(x) ((unsigned long)(x))
+#undef __head +#define __head + #define __BOOT_COMPRESSED
/* Basic instruction decoding support needed */ diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h index c57dd21155bd..bcac2e53d50b 100644 --- a/arch/x86/include/asm/sev.h +++ b/arch/x86/include/asm/sev.h @@ -192,15 +192,15 @@ static inline int pvalidate(unsigned long vaddr, bool rmp_psize, bool validate) struct snp_guest_request_ioctl;
void setup_ghcb(void); -void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, - unsigned long npages); -void __init early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, - unsigned long npages); +void early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, + unsigned long npages); +void early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, + unsigned long npages); void snp_set_memory_shared(unsigned long vaddr, unsigned long npages); void snp_set_memory_private(unsigned long vaddr, unsigned long npages); void snp_set_wakeup_secondary_cpu(void); bool snp_init(struct boot_params *bp); -void __init __noreturn snp_abort(void); +void __noreturn snp_abort(void); void snp_dmi_setup(void); int snp_issue_guest_request(u64 exit_code, struct snp_req_data *input, struct snp_guest_request_ioctl *rio); u64 snp_get_unsupported_features(u64 status); diff --git a/arch/x86/kernel/sev-shared.c b/arch/x86/kernel/sev-shared.c index 271e70d5748e..3fe76bf17d95 100644 --- a/arch/x86/kernel/sev-shared.c +++ b/arch/x86/kernel/sev-shared.c @@ -86,7 +86,8 @@ static bool __init sev_es_check_cpu_features(void) return true; }
-static void __noreturn sev_es_terminate(unsigned int set, unsigned int reason) +static void __head __noreturn +sev_es_terminate(unsigned int set, unsigned int reason) { u64 val = GHCB_MSR_TERM_REQ;
@@ -323,13 +324,7 @@ static int sev_cpuid_hv(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid */ static const struct snp_cpuid_table *snp_cpuid_get_table(void) { - void *ptr; - - asm ("lea cpuid_table_copy(%%rip), %0" - : "=r" (ptr) - : "p" (&cpuid_table_copy)); - - return ptr; + return &RIP_REL_REF(cpuid_table_copy); }
/* @@ -388,7 +383,7 @@ static u32 snp_cpuid_calc_xsave_size(u64 xfeatures_en, bool compacted) return xsave_size; }
-static bool +static bool __head snp_cpuid_get_validated_func(struct cpuid_leaf *leaf) { const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table(); @@ -525,7 +520,8 @@ static int snp_cpuid_postprocess(struct ghcb *ghcb, struct es_em_ctxt *ctxt, * Returns -EOPNOTSUPP if feature not enabled. Any other non-zero return value * should be treated as fatal by caller. */ -static int snp_cpuid(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf) +static int __head +snp_cpuid(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf) { const struct snp_cpuid_table *cpuid_table = snp_cpuid_get_table();
@@ -567,7 +563,7 @@ static int snp_cpuid(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_le * page yet, so it only supports the MSR based communication with the * hypervisor and only the CPUID exit-code. */ -void __init do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code) +void __head do_vc_no_ghcb(struct pt_regs *regs, unsigned long exit_code) { unsigned int subfn = lower_bits(regs->cx, 32); unsigned int fn = lower_bits(regs->ax, 32); @@ -1013,7 +1009,8 @@ struct cc_setup_data { * Search for a Confidential Computing blob passed in as a setup_data entry * via the Linux Boot Protocol. */ -static struct cc_blob_sev_info *find_cc_blob_setup_data(struct boot_params *bp) +static __head +struct cc_blob_sev_info *find_cc_blob_setup_data(struct boot_params *bp) { struct cc_setup_data *sd = NULL; struct setup_data *hdr; @@ -1040,7 +1037,7 @@ static struct cc_blob_sev_info *find_cc_blob_setup_data(struct boot_params *bp) * mapping needs to be updated in sync with all the changes to virtual memory * layout and related mapping facilities throughout the boot process. */ -static void __init setup_cpuid_table(const struct cc_blob_sev_info *cc_info) +static void __head setup_cpuid_table(const struct cc_blob_sev_info *cc_info) { const struct snp_cpuid_table *cpuid_table_fw, *cpuid_table; int i; diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c index e35fcc8d4bae..f8a8249ae117 100644 --- a/arch/x86/kernel/sev.c +++ b/arch/x86/kernel/sev.c @@ -26,6 +26,7 @@ #include <linux/dmi.h> #include <uapi/linux/sev-guest.h>
+#include <asm/init.h> #include <asm/cpu_entry_area.h> #include <asm/stacktrace.h> #include <asm/sev.h> @@ -690,7 +691,7 @@ static void pvalidate_pages(unsigned long vaddr, unsigned long npages, bool vali } }
-static void __init early_set_pages_state(unsigned long paddr, unsigned long npages, enum psc_op op) +static void __head early_set_pages_state(unsigned long paddr, unsigned long npages, enum psc_op op) { unsigned long paddr_end; u64 val; @@ -728,7 +729,7 @@ static void __init early_set_pages_state(unsigned long paddr, unsigned long npag sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC); }
-void __init early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, +void __head early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, unsigned long npages) { /* @@ -2085,7 +2086,7 @@ bool __init handle_vc_boot_ghcb(struct pt_regs *regs) * * Scan for the blob in that order. */ -static __init struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp) +static __head struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp) { struct cc_blob_sev_info *cc_info;
@@ -2111,7 +2112,7 @@ static __init struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp) return cc_info; }
-bool __init snp_init(struct boot_params *bp) +bool __head snp_init(struct boot_params *bp) { struct cc_blob_sev_info *cc_info;
@@ -2133,7 +2134,7 @@ bool __init snp_init(struct boot_params *bp) return true; }
-void __init __noreturn snp_abort(void) +void __head __noreturn snp_abort(void) { sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED); }
From: Ard Biesheuvel ardb@kernel.org
[ Commit 9c55461040a9264b7e44444c53d26480b438eda6 upstream ]
Currently, the EFI stub invokes the EFI memory attributes protocol to strip any NX restrictions from the entire loaded kernel, resulting in all code and data being mapped read-write-execute.
The point of the EFI memory attributes protocol is to remove the need for all memory allocations to be mapped with both write and execute permissions by default, and make it the OS loader's responsibility to transition data mappings to code mappings where appropriate.
Even though the UEFI specification does not appear to leave room for denying memory attribute changes based on security policy, let's be cautious and avoid relying on the ability to create read-write-execute mappings. This is trivially achievable, given that the amount of kernel code executing via the firmware's 1:1 mapping is rather small and limited to the .head.text region. So let's drop the NX restrictions only on that subregion, but not before remapping it as read-only first.
Signed-off-by: Ard Biesheuvel ardb@kernel.org --- arch/x86/boot/compressed/Makefile | 2 +- arch/x86/boot/compressed/misc.c | 1 + arch/x86/include/asm/boot.h | 1 + drivers/firmware/efi/libstub/x86-stub.c | 11 ++++++++++- 4 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile index 3965b2c9efee..6e61baff223f 100644 --- a/arch/x86/boot/compressed/Makefile +++ b/arch/x86/boot/compressed/Makefile @@ -84,7 +84,7 @@ LDFLAGS_vmlinux += -T hostprogs := mkpiggy HOST_EXTRACFLAGS += -I$(srctree)/tools/include
-sed-voffset := -e 's/^([0-9a-fA-F]*) [ABCDGRSTVW] (_text|__bss_start|_end)$$/#define VO_\2 _AC(0x\1,UL)/p' +sed-voffset := -e 's/^([0-9a-fA-F]*) [ABCDGRSTVW] (_text|__start_rodata|__bss_start|_end)$$/#define VO_\2 _AC(0x\1,UL)/p'
quiet_cmd_voffset = VOFFSET $@ cmd_voffset = $(NM) $< | sed -n $(sed-voffset) > $@ diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c index 8ae7893d712f..45435ff88363 100644 --- a/arch/x86/boot/compressed/misc.c +++ b/arch/x86/boot/compressed/misc.c @@ -330,6 +330,7 @@ static size_t parse_elf(void *output) return ehdr.e_entry - LOAD_PHYSICAL_ADDR; }
+const unsigned long kernel_text_size = VO___start_rodata - VO__text; const unsigned long kernel_total_size = VO__end - VO__text;
static u8 boot_heap[BOOT_HEAP_SIZE] __aligned(4); diff --git a/arch/x86/include/asm/boot.h b/arch/x86/include/asm/boot.h index a38cc0afc90a..a3e0be0470a4 100644 --- a/arch/x86/include/asm/boot.h +++ b/arch/x86/include/asm/boot.h @@ -81,6 +81,7 @@
#ifndef __ASSEMBLY__ extern unsigned int output_len; +extern const unsigned long kernel_text_size; extern const unsigned long kernel_total_size;
unsigned long decompress_kernel(unsigned char *outbuf, unsigned long virt_addr, diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c index 1f5edcb6339a..55468debd55d 100644 --- a/drivers/firmware/efi/libstub/x86-stub.c +++ b/drivers/firmware/efi/libstub/x86-stub.c @@ -227,6 +227,15 @@ efi_status_t efi_adjust_memory_range_protection(unsigned long start, rounded_end = roundup(start + size, EFI_PAGE_SIZE);
if (memattr != NULL) { + status = efi_call_proto(memattr, set_memory_attributes, + rounded_start, + rounded_end - rounded_start, + EFI_MEMORY_RO); + if (status != EFI_SUCCESS) { + efi_warn("Failed to set EFI_MEMORY_RO attribute\n"); + return status; + } + status = efi_call_proto(memattr, clear_memory_attributes, rounded_start, rounded_end - rounded_start, @@ -778,7 +787,7 @@ static efi_status_t efi_decompress_kernel(unsigned long *kernel_entry)
*kernel_entry = addr + entry;
- return efi_adjust_memory_range_protection(addr, kernel_total_size); + return efi_adjust_memory_range_protection(addr, kernel_text_size); }
static void __noreturn enter_kernel(unsigned long kernel_addr,
On Fri, Apr 19, 2024 at 10:11:06AM +0200, Ard Biesheuvel wrote:
From: Ard Biesheuvel ardb@kernel.org
This is the final batch of changes to bring linux-6.1.y in sync with 6.6 and later in terms of compatibility with tightened boot security requirements imposed by MicroSoft, compliance with which is a prerequisite for them to be willing to resume signing distro shim images with the MS 3rd party secure boot certificate.
Without this, distros can only boot on off-the-shelf x86 PCs after disabling secure boot explicitly.
Most of these changes appeared in v6.8 and have been backported to v6.6 already.
All now queued up, thanks.
greg k-h
linux-stable-mirror@lists.linaro.org