3.16.59-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Linus Torvalds torvalds@linux-foundation.org
commit 2f22b4cd45b67b3496f4aa4c7180a1271c6452f6 upstream.
With L1 terminal fault the CPU speculates into unmapped PTEs, and resulting side effects allow to read the memory the PTE is pointing too, if its values are still in the L1 cache.
For swapped out pages Linux uses unmapped PTEs and stores a swap entry into them.
To protect against L1TF it must be ensured that the swap entry is not pointing to valid memory, which requires setting higher bits (between bit 36 and bit 45) that are inside the CPUs physical address space, but outside any real memory.
To do this invert the offset to make sure the higher bits are always set, as long as the swap file is not too big.
Note there is no workaround for 32bit !PAE, or on systems which have more than MAX_PA/2 worth of memory. The later case is very unlikely to happen on real systems.
[AK: updated description and minor tweaks by. Split out from the original patch ]
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Andi Kleen ak@linux.intel.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Andi Kleen ak@linux.intel.com Reviewed-by: Josh Poimboeuf jpoimboe@redhat.com Acked-by: Michal Hocko mhocko@suse.com Acked-by: Vlastimil Babka vbabka@suse.cz Acked-by: Dave Hansen dave.hansen@intel.com [bwh: Backported to 3.16: Bit 9 may be reserved for PAGE_BIT_NUMA here] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- arch/x86/include/asm/pgtable_64.h | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
--- a/arch/x86/include/asm/pgtable_64.h +++ b/arch/x86/include/asm/pgtable_64.h @@ -167,7 +167,7 @@ static inline int pgd_large(pgd_t pgd) { * * | ... | 11| 10| 9|8|7|6|5| 4| 3|2| 1|0| <- bit number * | ... |SW3|SW2|SW1|G|L|D|A|CD|WT|U| W|P| <- bit names - * | TYPE (59-63) | OFFSET (10-58) | 0 |0|0|X|X| X| X|X|SD|0| <- swp entry + * | TYPE (59-63) | ~OFFSET (10-58) | 0 |0|0|X|X| X| X|X|SD|0| <- swp entry * * G (8) is aliased and used as a PROT_NONE indicator for * !present ptes. We need to start storing swap entries above @@ -180,6 +180,9 @@ static inline int pgd_large(pgd_t pgd) { * * Bit 7 in swp entry should be 0 because pmd_present checks not only P, * but also L and G. + * + * The offset is inverted by a binary not operation to make the high + * physical bits set. */ #define SWP_TYPE_BITS 5
@@ -199,13 +202,15 @@ static inline int pgd_large(pgd_t pgd) { #define __swp_type(x) ((x).val >> (64 - SWP_TYPE_BITS))
/* Shift up (to get rid of type), then down to get value */ -#define __swp_offset(x) ((x).val << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT) +#define __swp_offset(x) (~(x).val << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT)
/* * Shift the offset up "too far" by TYPE bits, then down again + * The offset is inverted by a binary not operation to make the high + * physical bits set. */ #define __swp_entry(type, offset) ((swp_entry_t) { \ - ((unsigned long)(offset) << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS) \ + (~(unsigned long)(offset) << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS) \ | ((unsigned long)(type) << (64-SWP_TYPE_BITS)) })
#define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val((pte)) })