Thomas,
Andi and I have made an update to our draft of the Spectre admin guide. We may be out on Christmas vacation for a while. But we want to send it out for everyone to take a look.
Thanks.
Tim
From: Andi Kleen <ak@linux.intel.com>
There is no document in the admin guide describing the Spectre v1 and v2 side channels and their mitigations in Linux.
Create a document to describe Spectre and the mitigation methods used in the kernel.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 Documentation/admin-guide/spectre.rst | 502 ++++++++++++++++++++++++++++++++++
 1 file changed, 502 insertions(+)
 create mode 100644 Documentation/admin-guide/spectre.rst
diff --git a/Documentation/admin-guide/spectre.rst b/Documentation/admin-guide/spectre.rst
new file mode 100644
index 0000000..0ba708e
--- /dev/null
+++ b/Documentation/admin-guide/spectre.rst
@@ -0,0 +1,502 @@
+Spectre side channels
+=====================
+
+Spectre is a class of side channel attacks against modern CPUs that
+exploit branch prediction and speculative execution to read memory,
+possibly bypassing access controls. These exploits do not modify memory.
+
+This document covers Spectre variant 1 and 2.
+
+Affected processors
+-------------------
+
+The vulnerability affects a wide range of modern high performance
+processors, since most modern high speed processors use branch prediction
+and speculative execution.
+
+The following CPUs are vulnerable:
+
+  - Intel Core, Atom, Pentium, Xeon CPUs
+  - AMD CPUs like Phenom, EPYC, Zen.
+  - IBM processors like POWER and zSeries
+  - Higher end ARM processors
+  - Apple CPUs
+  - Higher end MIPS CPUs
+  - Likely most other high performance CPUs. Contact your CPU vendor for details.
+
+This document describes the mitigations on Intel CPUs. Mitigations
+on other architectures may be different.
+
+Related CVEs
+------------
+
+The following CVE entries describe Spectre variants:
+
+  =============  =======================  ==========
+  CVE-2017-5753  Bounds check bypass      Spectre-V1
+  CVE-2017-5715  Branch target injection  Spectre-V2
+
+Problem
+-------
+
+CPUs have shared caches, such as buffers for branch prediction, which are
+later used to guide speculative execution. These buffers are not flushed
+over context switches or change in privilege levels. Malicious software
+might influence these buffers and trigger specific speculative execution
+in the kernel or different user processes. This speculative execution can
+then be used to read data in memory and cause side effects, such as displacing
+data in a data cache.
+The side effect can then later be measured by the
+malicious software, and used to determine the memory values read speculatively.
+
+Spectre attacks allow tricking other software to disclose
+values in their memory.
+
+In a typical Spectre variant 1 attack, the attacker passes a parameter
+to a victim. The victim bounds-checks the parameter and rejects illegal
+values. However, due to speculation over branch prediction the code path
+for correct values might be speculatively executed, then reference memory
+controlled by the input parameter and leave measurable side effects in
+the caches. The attacker could then measure these side effects
+and determine the leaked value.
+
+There are some extensions of Spectre variant 1 attacks for reading
+data over the network, see [2]. However the attacks are very
+difficult, low bandwidth and fragile and considered low risk.
+
+For Spectre variant 2 the attacker poisons the indirect branch
+predictors of the CPU. Then control is passed to the victim, which
+executes indirect branches. Due to the poisoned branch predictor data
+the CPU can speculatively execute arbitrary code in the victim's
+address space, such as a code sequence ("disclosure gadget") that
+reads arbitrary data on some input parameter and causes a measurable
+cache side effect based on the value. The attacker can then measure
+this side effect after gaining control again and determine the value.
+
+The most useful gadgets take an attacker-controlled input parameter so
+that the memory read can be controlled. Gadgets without input parameters
+might be possible, but the attacker would have very little control over what
+memory can be read, reducing the risk of the attack revealing useful data.
+
+Attack scenarios
+----------------
+
+Here is a list of attack scenarios that have been anticipated, but
+may not cover all possible attack patterns.
+Reducing the occurrences of
+attack prerequisites listed can reduce the risk that a Spectre attack
+leaks useful data.
+
+1. Local User process attacking kernel
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Code in system calls often enforces access controls with conditional
+branches based on user data. These branches are potential targets for
+Spectre v2 exploits. Interrupt handlers, on the other hand, rarely
+handle user data or enforce access controls, which makes them unlikely
+exploit targets.
+
+For a typical variant 2 attack, the attacker may poison the CPU branch
+buffers first, and then enter the kernel and trick it into jumping to a
+disclosure gadget through an indirect branch. If the attacker wants to control the
+memory addresses leaked, it would also need to pass a parameter
+to the gadget, either through a register or through a known address in
+memory. Finally when it executes again it can measure the side effect.
+
+Necessary Prerequisites:
+1. Malicious local process passing parameters to kernel
+2. Kernel has secrets.
+
+2. User process attacking another user process
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In this scenario a malicious user process wants to attack another
+user process through a context switch.
+
+For variant 1 this generally requires passing some parameter between
+the processes, which needs a data passing relationship, such as remote
+procedure calls (RPC).
+
+For variant 2 the poisoning can happen through a context switch, or
+on CPUs with simultaneous multi-threading (SMT) potentially on the
+thread sibling executing in parallel on the same core. In either case,
+controlling the memory leaked by the disclosure gadget also requires a data
+passing relationship to the victim process, otherwise while it may
+observe values through side effects, it won't know which memory
+addresses they relate to.
+
+Necessary Prerequisites:
+1. Malicious code running as local process
+2. Victim processes containing secrets running on same core.
+
+3. User sandbox attacking runtime in process
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+A process, such as a web browser, might be running interpreted or JITed
+untrusted code, such as JavaScript code downloaded from a website.
+It uses restrictions in the JIT code generator and checks in a run time
+to prevent the untrusted code from attacking the hosting process.
+
+The untrusted code might either use variant 1 or 2 to trick
+a disclosure gadget in the run time to read memory inside the process.
+
+Necessary Prerequisites:
+1. Sandbox in process running untrusted code.
+2. Runtime in same process containing secrets.
+
+4. Kernel sandbox attacking kernel
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The kernel has support for running user-supplied programs within the
+kernel. Specific rules (such as bounds checking) are enforced on these
+programs by the kernel to ensure that they do not violate access controls.
+
+eBPF is a kernel sub-system that uses user-supplied program
+to execute JITed untrusted byte code inside the kernel. eBPF is used
+for manipulating and examining network packets, examining system call
+parameters for sand boxes and other uses.
+
+A malicious local process could upload and trigger a malicious
+eBPF script to the kernel, with the script attacking the kernel
+using variant 1 or 2 and reading memory.
+
+Necessary Prerequisites:
+1. Malicious local process
+2. eBPF JIT enabled for unprivileged users, attacking kernel with secrets
+on the same machine.
+
+5. Virtualization guest attacking host
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+An untrusted guest might attack the host through a hypercall
+or other virtualization exit.
+
+Necessary Prerequisites:
+1. Untrusted guest attacking host
+2. Host has secrets on local machine.
+
+For variant 1 VM exits use appropriate mitigations
+("bounds clipping") to prevent speculation leaking data
+in kernel code.
+For variant 2 the kernel flushes the branch buffer.
+
+6. Virtualization guest attacking other guest
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+An untrusted guest attacking another guest containing
+secrets. Mitigations are similar to when a guest attacks
+the host.
+
+Runtime vulnerability information
+---------------------------------
+
+The kernel reports the vulnerability and mitigation status in
+/sys/devices/system/cpu/vulnerabilities/*
+
+The spectre_v1 file describes the always enabled variant 1
+mitigation:
+
+/sys/devices/system/cpu/vulnerabilities/spectre_v1
+
+The value in this file:
+
+  =========================================  =================================
+  'Mitigation: __user pointer sanitization'  Protection in kernel on a case by
+                                             case basis with explicit pointer
+                                             sanitization.
+  =========================================  =================================
+
+The spectre_v2 kernel file reports if the kernel has been compiled with a
+retpoline aware compiler, if the CPU has hardware mitigation, and if the
+CPU has microcode support for additional process specific mitigations.
+
+It also reports CPU features enabled by microcode to mitigate attack
+between user processes:
+
+1. Indirect Branch Prediction Barrier (IBPB) to add additional
+   isolation between processes of different users
+2. Single Thread Indirect Branch Predictors (STIBP) to add additional
+   isolation between CPU threads running on the same core.
+
+These CPU features may impact performance when used and can
+be enabled per process on a case-by-case basis.
+
+/sys/devices/system/cpu/vulnerabilities/spectre_v2
+
+The values in this file:
+
+  - Kernel status:
+
+  ====================================  =================================
+  'Not affected'                        The processor is not vulnerable
+  'Vulnerable'                          Vulnerable, no mitigation
+  'Mitigation: Full generic retpoline'  Software-focused mitigation
+  'Mitigation: Full AMD retpoline'      AMD-specific software mitigation
+  'Mitigation: Enhanced IBRS'           Hardware-focused mitigation
+  ====================================  =================================
+
+  - Firmware status:
+
+  ==========  =============================================================
+  'IBRS_FW'   Protection against user program attacks when calling firmware
+  ==========  =============================================================
+
+  - Indirect branch prediction barrier (IBPB) status for protection between
+    processes of different users. This feature can be controlled through
+    prctl per process, or through kernel command line options. For more details
+    see below.
+
+  ===================  ========================================================
+  'IBPB: disabled'     IBPB unused
+  'IBPB: always-on'    Use IBPB on all tasks
+  'IBPB: conditional'  Use IBPB on SECCOMP or indirect branch restricted tasks
+  ===================  ========================================================
+
+  - Single threaded indirect branch prediction (STIBP) status for protection
+    between different hyper threads. This feature can be controlled through
+    prctl per process, or through kernel command line options. For more details
+    see below.
+
+  ====================  ========================================================
+  'STIBP: disabled'     STIBP unused
+  'STIBP: forced'       Use STIBP on all tasks
+  'STIBP: conditional'  Use STIBP on SECCOMP or indirect branch restricted tasks
+  ====================  ========================================================
+
+  - Return stack buffer (RSB) protection status:
+
+  =============  ===========================================
+  'RSB filling'  Protection of RSB on context switch enabled
+  =============  ===========================================
+
+Full mitigations might require a microcode update from the CPU
+vendor. When the necessary microcode is not available the kernel
+will report vulnerability.
+
+Kernel mitigation
+-----------------
+
+The kernel has default on mitigations for Variant 1 and Variant 2
+against attacks from user programs or guests. For variant 1 it
+annotates vulnerable kernel code (as determined by the sparse code
+scanning tool and code audits) to use "bounds clipping" to avoid any
+usable disclosure gadgets.
+
+For variant 2 the kernel employs "retpoline" with compiler help to secure
+the indirect branches inside the kernel, when CONFIG_RETPOLINE is enabled
+and the compiler supports retpoline. On Intel Skylake-era systems the
+mitigation covers most, but not all, cases, see [1] for more details.
+
+On CPUs with hardware mitigations for variant 2, retpoline is
+automatically disabled at runtime.
+
+Using kernel address space randomization (CONFIG_RANDOMIZE_BASE=y
+and CONFIG_SLAB_FREELIST_RANDOM=y in the kernel configuration)
+makes attacks on the kernel generally more difficult.
+
+Host mitigation
+---------------
+
+The Linux kernel uses retpoline to eliminate attacks on indirect
+branches. It also flushes the Return Stack Buffer on every VM exit to
+prevent guests from attacking the host kernel when retpoline is
+enabled.
+
+Variant 1 attacks are mitigated unconditionally.
+
+The kernel also allows guests to use any microcode based mitigations
+they choose to use (such as IBPB or STIBP), assuming the
+host has an updated microcode and reports the feature in
+/sys/devices/system/cpu/vulnerabilities/spectre_v2.
+
+Mitigation control at kernel build time
+---------------------------------------
+
+When the CONFIG_RETPOLINE option is enabled the kernel uses special
+code sequences to avoid attacks on indirect branches through
+Variant 2 attacks.
+
+The compiler also needs to support retpoline and support the
+-mindirect-branch=thunk-extern -mindirect-branch-register options
+for gcc, or -mretpoline-external-thunk option for clang.
+
+When the compiler doesn't support these options the kernel
+will report that it is vulnerable.
+
+Variant 1 mitigations and other side channel related user APIs are
+enabled unconditionally.
+
+Hardware mitigation
+-------------------
+
+Some CPUs have hardware mitigations (e.g. enhanced IBRS) for Spectre
+variant 2. The 4.19 kernel has support for detecting this capability
+and automatically disabling any unnecessary workarounds at runtime.
+
+User program mitigation
+-----------------------
+
+For variant 1 user programs can use LFENCE or bounds clipping. For more
+details see [3].
+
+For variant 2 user programs can be compiled with retpoline or
+can restrict their indirect branch speculation via prctl. (See
+Documentation/speculation.txt for detailed API.)
+
+User programs should use address space randomization
+(/proc/sys/kernel/randomize_va_space = 1 or 2) to make any attacks
+more difficult.
+
+Mitigation control on the kernel command line
+---------------------------------------------
+
+Spectre v2 mitigations can be disabled or force-enabled at the kernel
+command line.
+
+    nospectre_v2    [X86] Disable all mitigations for the Spectre variant 2
+                    (indirect branch prediction) vulnerability. System may
+                    allow data leaks with this option, which is equivalent
+                    to spectre_v2=off.
+
+
+    spectre_v2=     [X86] Control mitigation of Spectre variant 2
+                    (indirect branch speculation) vulnerability.
+                    The default operation protects the kernel from
+                    user space attacks.
+
+                    on   - unconditionally enable, implies
+                           spectre_v2_user=on
+                    off  - unconditionally disable, implies
+                           spectre_v2_user=off
+                    auto - kernel detects whether your CPU model is
+                           vulnerable
+
+                    Selecting 'on' will, and 'auto' may, choose a
+                    mitigation method at run time according to the
+                    CPU, the available microcode, the setting of the
+                    CONFIG_RETPOLINE configuration option, and the
+                    compiler with which the kernel was built.
+
+                    Selecting 'on' will also enable the mitigation
+                    against user space to user space task attacks.
+
+                    Selecting 'off' will disable both the kernel and
+                    the user space protections.
+
+                    Specific mitigations can also be selected manually:
+
+                    retpoline         - replace indirect branches
+                    retpoline,generic - google's original retpoline
+                    retpoline,amd     - AMD-specific minimal thunk
+
+                    Not specifying this option is equivalent to
+                    spectre_v2=auto.
+
+For user space mitigation:
+
+    spectre_v2_user=
+                    [X86] Control mitigation of Spectre variant 2
+                    (indirect branch speculation) vulnerability between
+                    user space tasks
+
+                    on      - Unconditionally enable mitigations. Is
+                              enforced by spectre_v2=on
+
+                    off     - Unconditionally disable mitigations. Is
+                              enforced by spectre_v2=off
+
+                    prctl   - Indirect branch speculation is enabled,
+                              but mitigation can be enabled via prctl
+                              per thread. The mitigation control state
+                              is inherited on fork.
+
+                    prctl,ibpb
+                            - Like "prctl" above, but only STIBP is
+                              controlled per thread. IBPB is issued
+                              always when switching between different user
+                              space processes.
+
+                    seccomp
+                            - Same as "prctl" above, but all seccomp
+                              threads will enable the mitigation unless
+                              they explicitly opt out.
+
+                    seccomp,ibpb
+                            - Like "seccomp" above, but only STIBP is
+                              controlled per thread. IBPB is issued
+                              always when switching between different
+                              user space processes.
+
+                    auto    - Kernel selects the mitigation depending on
+                              the available CPU features and vulnerability.
+
+                    Default mitigation:
+                    If CONFIG_SECCOMP=y then "seccomp", otherwise "prctl"
+
+                    Not specifying this option is equivalent to
+                    spectre_v2_user=auto.
+
+                    In general the kernel by default selects
+                    reasonable mitigations for the current CPU. To
+                    disable Spectre v2 mitigations boot with
+                    spectre_v2=off. Spectre v1 mitigations cannot
+                    be disabled.
+
+APIs for mitigation control of user process
+-------------------------------------------
+
+When enabling the "prctl" option for spectre_v2_user boot parameter,
+prctl can be used to restrict indirect branch speculation on a process.
+See Documentation/speculation.txt for detailed API.
+
+Processes containing secrets, such as cryptographic keys, may invoke
+this prctl for extra protection against Spectre v2.
+
+Before running untrusted processes, restricting their indirect branch
+speculation will prevent such processes from launching Spectre v2 attacks.
+
+Restricting indirect branch speculation on a process should only be used
+as needed, as restricting speculation reduces the performance of both the
+process and any process running on the sibling CPU thread.
+
+Under the "seccomp" option, the processes sandboxed with SECCOMP will
+have indirect branch speculation restricted automatically.
+
+References
+----------
+
+Intel white papers and documents on Spectre:
+
+https://newsroom.intel.com/wp-content/uploads/sites/11/2018/01/Intel-Analysi...
+
+[1]
+https://software.intel.com/security-software-guidance/api-app/sites/default/...
+
+https://www.intel.com/content/www/us/en/architecture-and-technology/facts-ab...
+
+[3] https://software.intel.com/security-software-guidance/
+
+https://software.intel.com/security-software-guidance/insights/deep-dive-sin...
+
+AMD white papers:
+
+https://developer.amd.com/wp-content/resources/90343-B_SoftwareTechniquesfor...
+
+https://www.amd.com/en/corporate/security-updates
+
+ARM white papers:
+
+https://developer.arm.com/support/arm-security-updates/speculative-processor...
+
+https://developer.arm.com/support/arm-security-updates/speculative-processor...
+
+MIPS:
+
+https://www.mips.com/blog/mips-response-on-speculative-execution-and-side-ch...
+
+Academic papers:
+
+https://spectreattack.com/spectre.pdf [original spectre paper]
+
+[2] https://arxiv.org/abs/1807.10535 [NetSpectre]
+
+https://arxiv.org/abs/1811.05441 [generalization of Spectre]
+
+https://arxiv.org/abs/1807.07940 [Spectre RSB, a variant of Spectre v2]
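For anyone who wants to compare a running system against the spectre_v2 status tables in the draft, here is a minimal Python sketch (not part of the patch). It assumes the individual states are reported on one line separated by ", "; the function names and the returned dictionary keys are hypothetical, chosen only for illustration.

```python
from pathlib import Path
from typing import Optional

# Path taken from the draft; the parsing below is only a sketch.
SPECTRE_V2 = Path("/sys/devices/system/cpu/vulnerabilities/spectre_v2")


def parse_spectre_v2(status: str) -> dict:
    """Split a spectre_v2 status line into the fields listed in the draft."""
    fields = {"kernel": None, "ibpb": None, "stibp": None,
              "ibrs_fw": False, "rsb_filling": False}
    # Assumption: the states are joined with ", " on a single line.
    for part in (p.strip() for p in status.strip().split(",")):
        if part.startswith("IBPB:"):
            fields["ibpb"] = part.split(":", 1)[1].strip()
        elif part.startswith("STIBP:"):
            fields["stibp"] = part.split(":", 1)[1].strip()
        elif part == "IBRS_FW":
            fields["ibrs_fw"] = True
        elif part == "RSB filling":
            fields["rsb_filling"] = True
        elif part:
            # Anything else is treated as the kernel status string.
            fields["kernel"] = part
    return fields


def read_spectre_v2(path: Path = SPECTRE_V2) -> Optional[dict]:
    """Return the parsed status, or None if the file cannot be read."""
    try:
        return parse_spectre_v2(path.read_text())
    except OSError:
        return None
```

Calling read_spectre_v2() on a kernel without the sysfs file simply returns None, so the sketch can be run on any machine.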
On 12/21/18 9:44 AM, Tim Chen wrote:
Thomas,
Andi and I have made an update to our draft of the Spectre admin guide. We may be out on Christmas vacation for a while. But we want to send it out for everyone to take a look.
Can you add a section on how to compile out all mitigations that have anything beyond negligible performance impact for those running systems where performance is more important than security?
Thanks, Ben
On 12/21/18 1:59 PM, Ben Greear wrote:
On 12/21/18 9:44 AM, Tim Chen wrote:
Thomas,
Andi and I have made an update to our draft of the Spectre admin guide. We may be out on Christmas vacation for a while. But we want to send it out for everyone to take a look.
Can you add a section on how to compile out all mitigations that have anything beyond negligible performance impact for those running systems where performance is more important than security?
If you don't worry about security and performance is paramount, then boot with "nospectre_v2". That's explained in the document.
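As a rough illustration of how the boot options described in the document combine, here is a small Python sketch that works on a command line string. It is not kernel code: the function name is hypothetical, and letting "nospectre_v2" win over an explicit "spectre_v2=" value is an assumption about precedence.

```python
def resolve_spectre_v2(cmdline: str) -> dict:
    """Return the effective spectre_v2 / spectre_v2_user settings."""
    v2, user = "auto", "auto"      # not specifying an option means "auto"
    disabled = False
    for token in cmdline.split():
        if token == "nospectre_v2":
            disabled = True        # documented as equivalent to spectre_v2=off
        elif token.startswith("spectre_v2="):
            v2 = token.split("=", 1)[1]
        elif token.startswith("spectre_v2_user="):
            user = token.split("=", 1)[1]
    if disabled:
        v2 = "off"                 # assumption: nospectre_v2 takes precedence
    if v2 in ("on", "off"):
        user = v2                  # on/off imply the same spectre_v2_user value
    return {"spectre_v2": v2, "spectre_v2_user": user}
```

On a live system the input string could come from /proc/cmdline.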
Tim
On 12/21/2018 05:17 PM, Tim Chen wrote:
On 12/21/18 1:59 PM, Ben Greear wrote:
On 12/21/18 9:44 AM, Tim Chen wrote:
Thomas,
Andi and I have made an update to our draft of the Spectre admin guide. We may be out on Christmas vacation for a while. But we want to send it out for everyone to take a look.
Can you add a section on how to compile out all mitigations that have anything beyond negligible performance impact for those running systems where performance is more important than security?
If you don't worry about security and performance is paramount, then boot with "nospectre_v2". That's explained in the document.
There seem to be lots of different variants of this type of problem. It was not clear to me that just doing nospectre_v2 would be sufficient to get back full performance.
And anyway, I would like to compile the kernel to not need that command-line option, so I am still interested in what compile options need to be set to what values...
Thanks, Ben
On 12/31/2018 8:22 AM, Ben Greear wrote:
On 12/21/2018 05:17 PM, Tim Chen wrote:
On 12/21/18 1:59 PM, Ben Greear wrote:
On 12/21/18 9:44 AM, Tim Chen wrote:
Thomas,
Andi and I have made an update to our draft of the Spectre admin guide. We may be out on Christmas vacation for a while. But we want to send it out for everyone to take a look.
Can you add a section on how to compile out all mitigations that have anything beyond negligible performance impact for those running systems where performance is more important than security?
If you don't worry about security and performance is paramount, then boot with "nospectre_v2". That's explained in the document.
There seem to be lots of different variants of this type of problem. It was not clear to me that just doing nospectre_v2 would be sufficient to get back full performance.
And anyway, I would like to compile the kernel to not need that command-line option, so I am still interested in what compile options need to be set to what values...
the cloud people call this scenario "single tenant".. there might be different "users" in the uid sense, but they're all owned by the same folks
it would not be insane to make a CONFIG_SINGLE_TENANT kind of option under which we can group these kinds of things (and likely others)
On 12/31/18 8:22 AM, Ben Greear wrote:
On 12/21/2018 05:17 PM, Tim Chen wrote:
If you don't worry about security and performance is paramount, then boot with "nospectre_v2". That's explained in the document.
There seem to be lots of different variants of this type of problem. It was not clear to me that just doing nospectre_v2 would be sufficient to get back full performance.
The performance penalty comes from retpoline penalizing indirect branch predictions in kernel. With nospectre_v2, retpoline is disabled so you should get all the performance back from spectre mitigation.
This does not disable kernel page table isolation for meltdown mitigation, which also needs to be turned off if you want to get the full performance back. That's somewhat beyond the scope of this doc on Spectre.
And anyway, I would like to compile the kernel to not need that command-line option, so I am still interested in what compile options need to be set to what values...
If you just want to disable spectre mitigation, setting CONFIG_RETPOLINE=n should do the trick. If you also want to disable meltdown mitigation, set CONFIG_PAGE_TABLE_ISOLATION=n.
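To check what a given kernel build actually has set, a short Python sketch like the following can scan a .config file for the two options mentioned above. The helper names are hypothetical, and reporting an option that is absent from the file as "n" is a simplifying assumption.

```python
import re

# A .config file stores enabled options as CONFIG_X=value and disabled
# ones as "# CONFIG_X is not set".
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text: str) -> dict:
    """Map each CONFIG_ option found in the text to its value."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def mitigation_options(text: str,
                       names=("CONFIG_RETPOLINE",
                              "CONFIG_PAGE_TABLE_ISOLATION")) -> dict:
    """Report the named options; options absent from the file count as 'n'."""
    options = parse_kconfig(text)
    return {name: options.get(name, "n") for name in names}
```

The text passed in would typically be the contents of the .config used for the build.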
Thanks.
Tim
On 1/7/19 9:57 AM, Tim Chen wrote:
On 12/31/18 8:22 AM, Ben Greear wrote:
On 12/21/2018 05:17 PM, Tim Chen wrote:
If you don't worry about security and performance is paramount, then boot with "nospectre_v2". That's explained in the document.
There seem to be lots of different variants of this type of problem. It was not clear to me that just doing nospectre_v2 would be sufficient to get back full performance.
The performance penalty comes from retpoline penalizing indirect branch predictions in kernel. With nospectre_v2, retpoline is disabled so you should get all the performance back from spectre mitigation.
This does not disable kernel page table isolation for meltdown mitigation, which also needs to be turned off if you want to get the full performance back. That's somewhat beyond the scope of this doc on Spectre.
The two bug families (spectre and meltdown) are conflated in my mind, at least.
For those of us who do not really understand this stuff in detail, it would be good to at least mention some notes about Meltdown I think.
And anyway, I would like to compile the kernel to not need that command-line option, so I am still interested in what compile options need to be set to what values...
If you just want to disable spectre mitigation, setting CONFIG_RETPOLINE=n should do the trick. If you also want to disable meltdown mitigation, set CONFIG_PAGE_TABLE_ISOLATION=n.
Ok, are there any other CONFIG options that relate to fixing security bugs that have noticeable performance impacts or are these two the complete list?
Thanks, Ben
On 1/8/19 4:58 PM, Ben Greear wrote:
On 1/7/19 9:57 AM, Tim Chen wrote:
On 12/31/18 8:22 AM, Ben Greear wrote:
On 12/21/2018 05:17 PM, Tim Chen wrote:
If you don't worry about security and performance is paramount, then boot with "nospectre_v2". That's explained in the document.
There seem to be lots of different variants of this type of problem. It was not clear to me that just doing nospectre_v2 would be sufficient to get back full performance.
The performance penalty comes from retpoline penalizing indirect branch predictions in kernel. With nospectre_v2, retpoline is disabled so you should get all the performance back from spectre mitigation.
This does not disable kernel page table isolation for meltdown mitigation, which also needs to be turned off if you want to get the full performance back. That's somewhat beyond the scope of this doc on Spectre.
The two bug families (spectre and meltdown) are conflated in my mind, at least.
For those of us who do not really understand this stuff in detail, it would be good to at least mention some notes about Meltdown I think.
Probably Meltdown deserves its own meltdown.rst, I think.
And anyway, I would like to compile the kernel to not need that command-line option, so I am still interested in what compile options need to be set to what values...
If you just want to disable spectre mitigation, setting CONFIG_RETPOLINE=n should do the trick. If you also want to disable meltdown mitigation, set CONFIG_PAGE_TABLE_ISOLATION=n.
Ok, are there any other CONFIG options that relate to fixing security bugs that have noticeable performance impacts or are these two the complete list?
There are those related to Speculative Store Bypass Disable (SSBD) and L1 Terminal Fault (L1TF). SSBD mostly affects sandboxed code, so you should not see a performance impact unless you are running code sandboxed with SECCOMP. L1TF has its own explanation in l1tf.rst and mostly affects VM performance.
So you should be good if you turn off retpoline and page table isolation in your config if those things don't affect you.
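To see which of these issues the running kernel reports as unmitigated, the whole sysfs directory can be read in one pass. This is only a sketch: the function names are hypothetical, and treating any status that starts with "Vulnerable" as unmitigated is an assumption based on the status strings listed in the draft.

```python
from pathlib import Path


def read_vulnerabilities(directory="/sys/devices/system/cpu/vulnerabilities"):
    """Return {file name: status text} for every readable entry."""
    result = {}
    root = Path(directory)
    if not root.is_dir():
        return result              # older kernel or non-Linux system
    for entry in sorted(root.iterdir()):
        if entry.is_file():
            try:
                result[entry.name] = entry.read_text().strip()
            except OSError:
                continue           # skip entries that cannot be read
    return result


def unmitigated(statuses):
    """Names whose status starts with 'Vulnerable' (assumed to mean no mitigation)."""
    return sorted(name for name, text in statuses.items()
                  if text.startswith("Vulnerable"))
```

For example, unmitigated(read_vulnerabilities()) gives a quick list of what a chosen config or boot option has left open.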
If we want a single CONFIG option to turn all of these off, as Arjan suggested, that would be a separate topic for discussion.
Tim
On Fri, Dec 21, 2018 at 09:44:44AM -0800, Tim Chen wrote:
+4. Kernel sandbox attacking kernel
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+The kernel has support for running user-supplied programs within the
+kernel. Specific rules (such as bounds checking) are enforced on these
+programs by the kernel to ensure that they do not violate access controls.
+eBPF is a kernel sub-system that uses user-supplied program
+to execute JITed untrusted byte code inside the kernel. eBPF is used
+for manipulating and examining network packets, examining system call
+parameters for sand boxes and other uses.
+A malicious local process could upload and trigger a malicious
+eBPF script to the kernel, with the script attacking the kernel
+using variant 1 or 2 and reading memory.
The above is not correct. The exploit for var2 does not load bpf progs into the kernel. Instead, the bpf interpreter speculatively executes a bpf prog that was never loaded. Hence CONFIG_BPF_JIT_ALWAYS_ON=y is necessary to make var2 harder to exploit. The same goes for other in-kernel interpreters and state machines.
+Necessary Prerequisites:
+1. Malicious local process
+2. eBPF JIT enabled for unprivileged users, attacking kernel with secrets
+on the same machine.
This is not quite correct either. Var 1 could have been exploited with and without JIT. Also, the above makes it sound as if var1 is still exploitable through bpf, which is not the case.
On 12/23/18 3:11 PM, Alexei Starovoitov wrote:
On Fri, Dec 21, 2018 at 09:44:44AM -0800, Tim Chen wrote:
+4. Kernel sandbox attacking kernel
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+The kernel has support for running user-supplied programs within the
+kernel. Specific rules (such as bounds checking) are enforced on these
+programs by the kernel to ensure that they do not violate access controls.
+eBPF is a kernel sub-system that uses user-supplied program
+to execute JITed untrusted byte code inside the kernel. eBPF is used
+for manipulating and examining network packets, examining system call
+parameters for sand boxes and other uses.
+A malicious local process could upload and trigger a malicious
+eBPF script to the kernel, with the script attacking the kernel
+using variant 1 or 2 and reading memory.
The above is not correct. The exploit for var2 does not load bpf progs into the kernel. Instead, the bpf interpreter speculatively executes a bpf prog that was never loaded. Hence CONFIG_BPF_JIT_ALWAYS_ON=y is necessary to make var2 harder to exploit. The same goes for other in-kernel interpreters and state machines.
+Necessary Prerequisites:
+1. Malicious local process
+2. eBPF JIT enabled for unprivileged users, attacking kernel with secrets
+on the same machine.
This is not quite correct either. Var 1 could have been exploited with and without JIT. Also, the above makes it sound as if var1 is still exploitable through bpf, which is not the case.
Alexei,
Do you have any suggestions on how to rewrite these two paragraphs? You are probably the best person to update content for this section.
Thanks.
Tim
On Tue, Jan 08, 2019 at 01:12:45PM -0800, Tim Chen wrote:
On 12/23/18 3:11 PM, Alexei Starovoitov wrote:
On Fri, Dec 21, 2018 at 09:44:44AM -0800, Tim Chen wrote:
+4. Kernel sandbox attacking kernel
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+The kernel has support for running user-supplied programs within the
+kernel. Specific rules (such as bounds checking) are enforced on these
+programs by the kernel to ensure that they do not violate access controls.
+eBPF is a kernel sub-system that uses user-supplied program
+to execute JITed untrusted byte code inside the kernel. eBPF is used
+for manipulating and examining network packets, examining system call
+parameters for sand boxes and other uses.
+A malicious local process could upload and trigger a malicious
+eBPF script to the kernel, with the script attacking the kernel
+using variant 1 or 2 and reading memory.
The above is not correct. The exploit for var2 does not load bpf progs into the kernel. Instead, the bpf interpreter speculatively executes a bpf prog that was never loaded. Hence CONFIG_BPF_JIT_ALWAYS_ON=y is necessary to make var2 harder to exploit. The same goes for other in-kernel interpreters and state machines.
+Necessary Prerequisites:
+1. Malicious local process
+2. eBPF JIT enabled for unprivileged users, attacking kernel with secrets
+on the same machine.
This is not quite correct either. Var 1 could have been exploited with and without JIT. Also, the above makes it sound as if var1 is still exploitable through bpf, which is not the case.
Alexei,
Do you have any suggestions on how to rewrite these two paragraphs? You are probably the best person to update content for this section.
how about moving bpf bits out of this doc and placing them under Documentation/bpf/ ? We can create bpf_security.rst there with specdown mitigations, best practices, useful sysctl and config knobs, etc.
On 1/8/19 5:11 PM, Alexei Starovoitov wrote:
> > Alexi,
> > Do you have any suggestions on how to rewrite these two paragraphs? You are probably the best person to update the content of this section.
> how about moving bpf bits out of this doc and placing them under Documentation/bpf/ ? We can create bpf_security.rst there with specdown mitigations, best practices, useful sysctl and config knobs, etc.
Maybe we can provide some minimum but accurate info here on this category of Spectre attack for completeness. We can later provide a link to bpf_security.rst here with more details when that becomes available.
Otherwise, I can remove it if you prefer. But people concerned about Spectre will most likely read this doc first. I want them to be pointed to the detailed BPF security doc.
Tim
On Tue, Jan 08, 2019 at 05:41:37PM -0800, Tim Chen wrote:
> On 1/8/19 5:11 PM, Alexei Starovoitov wrote:
> > > Alexi,
> > > Do you have any suggestions on how to rewrite these two paragraphs? You are probably the best person to update the content of this section.
> > how about moving bpf bits out of this doc and placing them under Documentation/bpf/ ? We can create bpf_security.rst there with specdown mitigations, best practices, useful sysctl and config knobs, etc.
> Maybe we can provide some minimum but accurate info here on this category of Spectre attack for completeness. We can later provide a link to bpf_security.rst here with more details when that becomes available.
> Otherwise, I can remove it if you prefer. But people concerned about Spectre will most likely read this doc first. I want them to be pointed to the detailed BPF security doc.
since Documentation/ got converted to .rst, the links made it easy to follow from one doc into another. I think splitting big doc makes it easier for users to read and for us to maintain/update.
On Fri, 21 Dec 2018 09:44:44 -0800 Tim Chen tim.c.chen@linux.intel.com wrote:
> Andi and I have made an update to our draft of the Spectre admin guide. We may be out on Christmas vacation for a while. But we want to send it out for everyone to take a look.
> Thanks.
> Tim
> From: Andi Kleen ak@linux.intel.com
> There are no document in admin guides describing Spectre v1 and v2 side channels and their mitigations in Linux.
> Create a document to describe Spectre and the mitigation methods used in the kernel.
> Signed-off-by: Andi Kleen ak@linux.intel.com
> Signed-off-by: Tim Chen tim.c.chen@linux.intel.com
> Documentation/admin-guide/spectre.rst | 502 ++++++++++++++++++++++++++++++++++
> 1 file changed, 502 insertions(+)
> create mode 100644 Documentation/admin-guide/spectre.rst
I only saw this now, seems I wasn't copied... I'll take a deeper look, but I have a couple of meta comments:
- This could arguably go in the security book rather than the admin guide. I don't really have a strong opinion on which is right at the moment, but others might.
- Wherever it ends up, can you also please add it to the appropriate index.rst file so it actually gets built with the rest of the docs?
Thanks,
jon
On 12/28/18 9:34 AM, Jonathan Corbet wrote:
> > Signed-off-by: Andi Kleen ak@linux.intel.com
> > Signed-off-by: Tim Chen tim.c.chen@linux.intel.com
> > Documentation/admin-guide/spectre.rst | 502 ++++++++++++++++++++++++++++++++++
> > 1 file changed, 502 insertions(+)
> > create mode 100644 Documentation/admin-guide/spectre.rst
> I only saw this now, seems I wasn't copied... I'll take a deeper look, but
Thanks for taking a look. I'll make sure you are copied on the updates.
> I have a couple of meta comments:
> - This could arguably go in the security book rather than the admin guide. I don't really have a strong opinion on which is right at the moment, but others might.
Since l1tf.rst is already in the admin guide, I kept spectre.rst here as well.
> - Wherever it ends up, can you also please add it to the appropriate index.rst file so it actually gets built with the rest of the docs?
Yes, index.rst needs an update too.
Thanks.
Tim
Hi!
> > > Signed-off-by: Andi Kleen ak@linux.intel.com
> > > Signed-off-by: Tim Chen tim.c.chen@linux.intel.com
> > > Documentation/admin-guide/spectre.rst | 502 ++++++++++++++++++++++++++++++++++
> > > 1 file changed, 502 insertions(+)
> > > create mode 100644 Documentation/admin-guide/spectre.rst
> > I only saw this now, seems I wasn't copied... I'll take a deeper look, but
> Thanks for taking a look. I'll make sure you are copied on the updates.
> > I have a couple of meta comments:
> > - This could arguably go in the security book rather than the admin guide. I don't really have a strong opinion on which is right at the moment, but others might.
> Since l1tf.rst is already in the admin guide, I kept spectre.rst here as well.
I believe l1tf is misplaced. That one really is Intel-specific (not even all x86s are affected). Same for Meltdown.
Pavel
On Mon 2019-01-14 00:12:59, Jiri Kosina wrote:
> On Mon, 14 Jan 2019, Pavel Machek wrote:
> > That one really is Intel-specific (not even all x86s are affected). Same for Meltdown.
> At least for Meltdown, your claim is simply not correct.
You are right, there may be a few ARM chips affected by Meltdown.
I don't know of any non-Intel CPU affected by l1tf.
...and its documentation is just plain wrong, explaining I'm protected when I'm not...
commit f372cd79be31382ae6030a1f15638cc7fe9eeb9f
Author: Pavel pavel@ucw.cz
Date: Thu Jan 3 00:48:40 2019 +0100
Ok, I guess L1TF was a lot of fun, and there was not time for a good documentation.
There's an admin guide that is written as an advertisement, and unfortunately it is slightly "inaccurate" in places (to the point of lying).
Plus, I believe it should go to x86/ directory, as this is really Intel issue, and not anything ARM (or RISC-V) people need to know.
Signed-off-by: Pavel Machek pavel@ucw.cz
diff --git a/Documentation/admin-guide/l1tf.rst b/Documentation/admin-guide/l1tf.rst
index 9af9773..05c5422 100644
--- a/Documentation/admin-guide/l1tf.rst
+++ b/Documentation/admin-guide/l1tf.rst
@@ -1,10 +1,11 @@
 L1TF - L1 Terminal Fault
 ========================
 
-L1 Terminal Fault is a hardware vulnerability which allows unprivileged
-speculative access to data which is available in the Level 1 Data Cache
-when the page table entry controlling the virtual address, which is used
-for the access, has the Present bit cleared or other reserved bits set.
+L1 Terminal Fault is a hardware vulnerability on most recent Intel x86
+CPUs which allows unprivileged speculative access to data which is
+available in the Level 1 Data Cache when the page table entry
+controlling the virtual address, which is used for the access, has the
+Present bit cleared or other reserved bits set.
 
 Affected processors
 -------------------
@@ -76,12 +77,14 @@ Attack scenarios
    deterministic and more practical.
 
    The Linux kernel contains a mitigation for this attack vector, PTE
-   inversion, which is permanently enabled and has no performance
-   impact. The kernel ensures that the address bits of PTEs, which are not
-   marked present, never point to cacheable physical memory space.
-
-   A system with an up to date kernel is protected against attacks from
-   malicious user space applications.
+   inversion, which is permanently enabled and has no measurable
+   performance impact in most configurations. The kernel ensures that
+   the address bits of PTEs, which are not marked present, never point
+   to cacheable physical memory space. On x86-32, this physical memory
+   needs to be limited to 2GiB to make mitigation effective.
+
+   Mitigation is present in kernels v4.19 and newer, and in
+   recent -stable kernels.
 
 2. Malicious guest in a virtual machine
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
On Mon, 14 Jan 2019, Pavel Machek wrote:
> > > That one really is Intel-specific (not even all x86s are affected). Same for Meltdown.
> > At least for Meltdown, your claim is simply not correct.
> You are right, there may be a few ARM chips affected by Meltdown.
And some of the powerpc64s as well.
On Mon 2019-01-14 13:06:24, Jiri Kosina wrote:
> On Mon, 14 Jan 2019, Pavel Machek wrote:
> > > > That one really is Intel-specific (not even all x86s are affected). Same for Meltdown.
> > > At least for Meltdown, your claim is simply not correct.
> > You are right, there may be a few ARM chips affected by Meltdown.
> And some of the powerpc64s as well.
Do you mean this?
https://lkml.org/lkml/2018/1/8/649
Frankly I'd not call it Meltdown, as it works only on data in the cache, so the defense is completely different. Seems more like a l1tf :-).
Pavel
On Mon, 14 Jan 2019, Pavel Machek wrote:
> Frankly I'd not call it Meltdown, as it works only on data in the cache, so the defense is completely different. Seems more like a l1tf :-).
Meltdown on x86 also seems to work only for data in L1D, but the pipeline could be constructed in a way that data are actually fetched into L1D before speculation gives up, which is not the case on ppc (speculation aborts on L2->L1 propagation IIRC). That's why flushing L1D on ppc is sufficient, but on x86 it's not.
On 1/14/2019 5:06 AM, Jiri Kosina wrote:
> On Mon, 14 Jan 2019, Pavel Machek wrote:
> > Frankly I'd not call it Meltdown, as it works only on data in the cache, so the defense is completely different. Seems more like a l1tf :-).
> Meltdown on x86 also seems to work only for data in L1D, but the pipeline could be constructed in a way that data are actually fetched into L1D before speculation gives up, which is not the case on ppc (speculation aborts on L2->L1 propagation IIRC). That's why flushing L1D on ppc is sufficient, but on x86 it's not.
assuming L1D is not shared between SMT threads obviously :)
Tim,
On Fri, 21 Dec 2018, Tim Chen wrote:
> Andi and I have made an update to our draft of the Spectre admin guide. We may be out on Christmas vacation for a while. But we want to send it out for everyone to take a look.
Yup, it fell through my Christmas cracks as well.
> Documentation/admin-guide/spectre.rst | 502 ++++++++++++++++++++++++++++++++++
I agree with Jonathan that this wants to be placed differently. Sorry, I set the precedent with the l1tf document, but I didn't come up with a good place either.
Something like admin-guide/hardware-vulnerabilities/... might work.
+The following CPUs are vulnerable:
+ - Intel Core, Atom, Pentium, Xeon CPUs
+ - AMD CPUs like Phenom, EPYC, Zen.
+ - IBM processors like POWER and zSeries
+ - Higher end ARM processors
+ - Apple CPUs
+ - Higher end MIPS CPUs
+ - Likely most other high performance CPUs. Contact your CPU vendor for details.
+This document describes the mitigations on Intel CPUs. Mitigations +on other architectures may be different.
No. A lot of the information is the same for all other CPU vendors. So sharing that document makes a lot of sense. Intel is not the center of the universe.
+Problem
+-------
+CPUs have shared caches, such as buffers for branch prediction, which are +later used to guide speculative execution. These buffers are not flushed +over context switches or change in privilege levels. Malicious software
change of privilege levels
+might influence these buffers and trigger specific speculative execution +in the kernel or different user processes. This speculative execution can +then be used to read data in memory and cause side effects, such as displacing +data in a data cache. The side effect can then later be measured by the +malicious software, and used to determine the memory values read speculatively.
+Spectre attacks allow tricking other software to disclose +values in their memory.
No. Spectre attacks do not allow that. It's the hardware properties which allow attackers to exploit the side effects of speculative execution.
+In a typical Spectre variant 1 attack, the attacker passes an parameter
Please explain first what the fundamental difference between variant 1 and variant 2 is. Then go into details of each variant.
+to a victim. The victim boundary checks the parameter and rejects illegal +values. However due to speculation over branch prediction the code path +for correct values might be speculatively executed, then reference memory
reference memory?
+controlled by the input parameter and leave measurable side effects in +the caches.
This really is not describing it properly. Please spell out the most obvious (at least for those who know) attack vector, i.e. array access based on the input parameter. That's where the bounds check is bypassed.
The attacker could then measure these side effects
+and determine the leaked value.
+There are some extensions of Spectre variant 1 attacks for reading +data over the network, see [2]. However the attacks are very +difficult, low bandwidth and fragile and considered low risk.
+For Spectre variant 2 the attacker poisons the indirect branch +predictors of the CPU.
At least some high level explanation how that poisoning happens would be appropriate.
... Then control is passed to the victim, which +executes indirect branches. Due to the poisoned branch predictor data +the CPU can speculatively execute arbitrary code in the victim's +address space, such as a code sequence ("disclosure gadget") that +reads arbitrary data on some input parameter and causes a measurable +cache side effect based on the value. The attacker can then measure +this side effect after gaining control again and determine the value.
+The most useful gadgets take an attacker-controlled input parameter so +that the memory read can be controlled. Gadgets without input parameters +might be possible, but the attacker would have very little control over what +memory can be read, reducing the risk of the attack revealing useful data.
Makes sense.
+Attack scenarios
+----------------
+Here is a list of attack scenarios that have been anticipated, but +may not cover all possible attack patterns. Reduing the occurrences of +attack pre-requisites listed can reduce the risk that a spectre attack +leaks useful data.
+1. Local User process attacking kernel +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Code in system calls often enforces access controls with conditional +branches based on user data. These branches are potential targets for +Spectre v2 exploits. Interrupt handlers, on the other hand, rarely +handle user data or enforce access controls, which makes them unlikely +exploit targets.
+For typical variant 2 attack, the attacker may poison the CPU branch +buffers first, and then enter the kernel and trick it into jumping to a
No, this is imprecise. The attacker may not poison the branch buffer, it poisons the branch prediction buffer. Then it enters the kernel. After entering the kernel it cannot trick it (the kernel) to do anything. No, the poisoned branch prediction buffer causes the hardware speculation unit to go down the wrong path.
Please be precise. Fairy tales are not useful for anyone.
+disclosure gadget through an indirect branch. If the attacker wants to control the +memory addresses leaked, it would also need to pass a parameter +to the gadget, either through a register or through a known address in +memory. Finally when it executes again it can measure the side effect.
+Necessary Prequisites:
+1. Malicious local process passing parameters to kernel
+2. Kernel has secrets.
2) is silly. Of course the kernel has secrets. Everything which should not be accessible by the attacker due to privilege separation etc. is a secret in the view of the attacker. Whether it is useful or not is a different story.
+2. User process attacking another user process +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+In this scenario an malicious user process wants to attack another
s/an/a/
s/wants to attack/tries to attack/
+user process through a context switch.
+For variant 1 this generally requires passing some parameter between +the processes, which needs a data passing relationship, such a remote
such as
+procedure calls (RPC).
+For variant 2 the poisoning can happen through a context switch, or +on CPUs with simultaneous multi-threading (SMT) potentially on the +thread sibling executing in parallel on the same core. In either case, +controlling the memory leaked by the disclosure gadget also requires a data +passing relationship to the victim process, otherwise while it may
s/it/the attacker/ otherwise the reference is not conclusive
+observe values through side effects, it won't know which memory +addresses they relate to.
+Necessary Prerequisites:
+1. Malicious code running as local process
+2. Victim processes containing secrets running on same core.
Again. All memory of the victim has to be considered as secret simply because it should not be accessible for the attacker in the first place. That's a fundamental guarantee of address space separation which is violated by the hardware.
+3. User sandbox attacking runtime in process +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+A process, such as a web browser, might be running interpreted or JITed +untrusted code, such as javascript code downloaded from a website. +It uses restrictions in the JIT code generator and checks in a run time +to prevent the untrusted code from attacking the hosting process.
Confusing use of 'might be' and present tense. It's not about 'might be'. You are describing a scenario, so wants to be:
If a process runs interpreted or JITed untrusted code,...., it uses restrictions ....
Hmm?
+The untrusted code might either use variant 1 or 2 to trick +a disclosure gadget in the run time to read memory inside the process.
to trick a disclosure gadget?
to trick the hardware into executing a disclosure gadget...
describes it correctly. Please be more careful.
+Necessary Prerequisites:
+1. Sandbox in process running untrusted code.
+2. Runtime in same process containing secrets.
Oh well.
+4. Kernel sandbox attacking kernel +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+The kernel has support for running user-supplied programs within the +kernel. Specific rules (such as bounds checking) are enforced on these +programs by the kernel to ensure that they do not violate access controls.
+eBPF is a kernel sub-system that uses user-supplied program +to execute JITed untrusted byte code inside the kernel. eBPF is used +for manipulating and examining network packets, examining system call +parameters for sand boxes and other uses.
+A malicious local process could upload and trigger an malicious +eBPF script to the kernel, with the script attacking the kernel +using variant 1 or 2 and reading memory.
+Necessary Prerequisites:
+1. Malicious local process
+2. eBPF JIT enabled for unprivileged users, attacking kernel with secrets
+on the same machine.
Alexey already commented on that one, but in general the above remarks vs. precise description apply as well.
+5. Virtualization guest attacking host +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+An untrusted guest might attack the host through a hyper call +or other virtualization exit.
exit mechanisms ?
+Necessary Prerequisites:
+1. Untrusted guest attacking host
+2. Host has secrets on local machine.
+For variant 1 VM exits use appropriate mitigations
VM exits use?
+("bounds clipping") to prevent speculation leaking data +in kernel code. For variant 2 the kernel flushes the branch buffer.
+6. Virtualization guest attacking other guest +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+An untrusted guest attacking another guest containing
s/containing secrets// Stop this secret thing please. It's not helpful in any way.
+secrets. Mitigations are similar to when a guest attack +the host.
That's not a proper sentence.
The (host?) kernel has mitigations for this in place which are similar to the mitigations which are used to prevent guest to host attacks.
+Runtime vulnerability information
+---------------------------------
+The kernel reports the vulnerability and mitigation status in +/sys/devices/system/cpu/vulnerabilities/*
Can we please align that with the wording in the L1TF document?
The Linux kernel provides a sysfs interface to enumerate the current L1TF status of the system: whether the system is vulnerable, and which mitigations are active. The relevant sysfs file is:
+The spectre_v1 file describes the always enabled variant 1 +mitigation:
+/sys/devices/system/cpu/vulnerabilities/spectre_v1
+The value in this file:
+  ======================================= =================================
+  'Mitigation: __user pointer sanitation' Protection in kernel on a case by
+                                          case base with explicit pointer
+                                          sanitation.
+  ======================================= =================================
This fails to mention that these protections are on a case by case basis and there is no guarantee that all possible attack vectors are covered.
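For readers who want to inspect the sysfs files discussed above, here is a minimal Python sketch that lists every status file under /sys/devices/system/cpu/vulnerabilities. The directory path comes from the quoted patch text; the function name `read_vulnerability_status` and its `base_dir` parameter are illustrative assumptions, not part of any kernel interface.

```python
from pathlib import Path

# Directory named in the quoted patch text.
SYSFS_VULN_DIR = "/sys/devices/system/cpu/vulnerabilities"


def read_vulnerability_status(base_dir=SYSFS_VULN_DIR):
    """Return {file name: status line} for every readable file in base_dir.

    Returns an empty dict if the directory does not exist, for example on
    a kernel without this sysfs interface or on a non-Linux system.
    """
    status = {}
    base = Path(base_dir)
    if not base.is_dir():
        return status
    for entry in sorted(base.iterdir()):
        if not entry.is_file():
            continue
        try:
            status[entry.name] = entry.read_text().strip()
        except OSError:
            # Skip files the current user cannot read.
            continue
    return status
```

Calling `read_vulnerability_status()` without arguments reads the live system. Note that, as pointed out above, a 'Mitigation: ...' string for spectre_v1 only reports case-by-case protection and is no guarantee that every attack vector is covered.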
+The spectre_v2 kernel file reports if the kernel has been compiled with a +retpoline aware compiler, if the CPU has hardware mitigation, and if the
How on earth should anyone who is not familiar with the inner workings of all this know what a retpoline compiler is?
+CPU has microcode support for additional process specific mitigations.
+It also reports CPU features enabled by microcode to mitigate attack +between user processes:
+1. Indirect Branch Prediction Barrier (IBPB) to add additional
+   isolation between processes of different users
+2. Single Thread Indirect Branch Prediction (STIBP) to additional
+   isolation between CPU threads running on the same core.
+These CPU features may impact performance when used and can +be enabled per process on a case-by-case base.
+/sys/devices/system/cpu/vulnerabilities/spectre_v2
+The values in this file:
+  - Kernel status:
+
+  ==================================== =================================
+  'Not affected'                       The processor is not vulnerable
+  'Vulnerable'                         Vulnerable, no mitigation
+  'Mitigation: Full generic retpoline' Software-focused mitigation
+  'Mitigation: Full AMD retpoline'     AMD-specific software mitigation
+  'Mitigation: Enhanced IBRS'          Hardware-focused mitigation
+  ==================================== =================================
+
+  - Firmware status:
+
+  ========== =============================================================
+  'IBRS_FW'  Protection against user program attacks when calling firmware
+  ========== =============================================================
+
+  - Indirect branch prediction barrier (IBPB) status for protection between
+    processes of different users. This feature can be controlled through
+    prctl per process, or through kernel command line options. For more details
+    see below.
rst supports hyperlinks and 'see below' is a lame reference. The title of that chapter is known, right?
+  =================== ========================================================
+  'IBPB: disabled'    IBPB unused
+  'IBPB: always-on'   Use IBPB on all tasks
+  'IBPB: conditional' Use IBPB on SECCOMP or indirect branch restricted tasks
+  =================== ========================================================
+
+  - Single threaded indirect branch prediction (STIBP) status for protection
+    between different hyper threads. This feature can be controlled through
+    prctl per process, or through kernel command line options. For more details
+    see below.
Ditto.
+  ==================== ========================================================
+  'STIBP: disabled'    STIBP unused
+  'STIBP: forced'      Use STIBP on all tasks
+  'STIBP: conditional' Use STIBP on SECCOMP or indirect branch restricted tasks
+  ==================== ========================================================
+
+  - Return stack buffer (RSB) protection status:
+
+  ============= ===========================================
+  'RSB filling' Protection of RSB on context switch enabled
+  ============= ===========================================
+Full mitigations might require an microcode update from the CPU +vendor. When the necessary microcode is not available the kernel +will report vulnerability.
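The values listed above are reported together on one line of the spectre_v2 file. A minimal Python sketch of splitting such a line into its parts follows. It assumes the fields are separated by commas and that per-feature fields use a "NAME: value" form, which matches the strings in the tables but is not stated in the patch; the function name `parse_spectre_v2` and the sample line in the usage note are illustrative only.

```python
def parse_spectre_v2(line):
    """Split a spectre_v2 status line into kernel status, features and flags.

    Assumption: fields are comma separated. The first field is the kernel
    status; later fields are either 'NAME: value' pairs (IBPB, STIBP) or
    bare flags (IBRS_FW, RSB filling).
    """
    fields = [f.strip() for f in line.split(",") if f.strip()]
    result = {"kernel": fields[0] if fields else "", "flags": []}
    for field in fields[1:]:
        key, sep, value = field.partition(":")
        if sep:
            # For example "IBPB: conditional" becomes result["ibpb"].
            result[key.strip().lower()] = value.strip()
        else:
            result["flags"].append(field)
    return result
```

For an illustrative line such as "Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, STIBP: conditional, RSB filling", the result holds the kernel status under "kernel", the IBPB and STIBP modes under "ibpb" and "stibp", and the remaining fields in "flags".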
+Kernel mitigation
+-----------------
+The kernel has default on mitigations for Variant 1 and Variant 2
Wrong. V1 is default on. V2 is only available when there is a retpoline capable compiler used to build the kernel.
+against attacks from user programs or guests. For variant 1 it +annotates vulnerable kernel code (as determined by the sparse code +scanning tool and code audits) to use "bounds clipping" to avoid any +usable disclosure gadgets.
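The "bounds clipping" idea mentioned above can be illustrated with plain integer arithmetic. The Python sketch below is modelled on the branch-free index masking technique that kernel helpers of this kind use, as far as I understand it; it is not the kernel's code. The names `index_mask_nospec` and `clip_index` and the 64-bit word size are assumptions made for illustration, and Python can only show the arithmetic, not the effect on speculation.

```python
BITS_PER_LONG = 64                       # assumption: model a 64-bit kernel
LONG_MASK = (1 << BITS_PER_LONG) - 1     # keeps Python ints within 64 bits


def index_mask_nospec(index, size):
    """Return all ones if 0 <= index < size, else 0, without comparing them.

    Assumes size is below 2**63, so the sign bit alone signals "out of range".
    """
    index &= LONG_MASK
    # Bit 63 is set if index itself is huge or if size - 1 - index wrapped.
    combined = (index | ((size - 1 - index) & LONG_MASK)) & LONG_MASK
    in_range = ((~combined) & LONG_MASK) >> (BITS_PER_LONG - 1)   # 1 or 0
    return (-in_range) & LONG_MASK       # 1 -> all ones, 0 -> zero


def clip_index(index, size):
    """Clamp an out-of-range index to 0 instead of branching on it."""
    return index & index_mask_nospec(index, size)
```

The point of the design is that the clipped index depends on the input through data flow only. Even if the CPU mispredicts the preceding bounds check, a speculative array access with the clipped index stays inside the array.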
+For variant 2 the kernel employs "retpoline" with compiler help to secure
Again. There needs to be a paragraph which explains what retpoline is about. Then you can spare all the repeating (and different) explanations in these contexts.
+the indirect branches inside the kernel, when CONFIG_RETPOLINE is enabled +and the compiler supports retpoline. On Intel Skylake-era systems the +mitigation covers most, but not all, cases, see [1] for more details.
+On CPUs with hardware mitigations for variant 2, retpoline is +automatically disabled at runtime.
+Using kernel address space randomization (CONFIG_RANDOMIZE_SLAB=y +and CONFIG_SLAB_FREELIST_RANDOM=y in the kernel configuration) +makes attacks on the kernel generally more difficult.
+Host mitigation
+---------------
+The Linux kernel uses retpoline to eliminate attacks on indirect +branches. It also flushes the Return Branch Stack on every VM exit to +prevent guests from attacking the host kernel when retpoline is +enabled.
+Variant 1 attacks are mitigated unconditionally.
As far as covered ....
+The kernel also allows guests to use any microcode based mitigations +they chose to use (such as IBPB or STIBP), assuming the +host has an updated microcode and reports the feature in +/sys/devices/system/cpu/vulnerabilities/spectre_v2.
What has the sysfs file to do with that? The guest can only use it when the host reports the feature to the guest. In fact the host allows the guest more features to use than the host kernel uses itself.
+Mitigation control at kernel build time
+---------------------------------------
+When the CONFIG_RETPOLINE option is enabled the kernel uses special +code sequences to avoid attacks on indirect branches through +Variant 2 attacks.
+The compiler also needs to support retpoline and support the +-mindirect-branch=thunk-extern -mindirect-branch-register options +for gcc, or -mretpoline-external-thunk option for clang.
+When the compiler doesn't support these options the kernel +will report that it is vulnerable.
+Variant 1 mitigations and other side channel related user APIs are
side channel related user APIs ???
+enabled unconditionally.
+Hardware mitigation
+-------------------
+Some CPUs have hardware mitigations (e.g. enhanced IBRS) for Spectre +variant 2. The 4.19 kernel has support for detecting this capability
That has been backported ....
+and automatically disable any unnecessary workarounds at runtime.
+User program mitigation
+-----------------------
+For variant 1 user programs can use LFENCE or bounds clipping. For more +details see [3].
+For variant 2 user programs can be compiled with retpoline or +restricting its indirect branch speculation via prctl. (See
s/its/their/
+Documenation/speculation.txt for detailed API.)
Huch? What has that file to do with the prctl?
+User programs should use address space randomization +(/proc/sys/kernel/randomize_va_space = 1 or 2) to make any attacks
s/any//
+more difficult.
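As a small illustration of the randomize_va_space setting quoted above, the following Python sketch reads the value and checks whether user address space randomization is on. The path is the one given in the patch text; the function names and the `path` parameter are illustrative assumptions. The comments give my understanding of the three values: 0 disables randomization, 1 randomizes stack, mmap base and VDSO, and 2 additionally randomizes the heap.

```python
RANDOMIZE_VA_SPACE = "/proc/sys/kernel/randomize_va_space"


def read_randomize_va_space(path=RANDOMIZE_VA_SPACE):
    """Return the integer stored in the randomize_va_space sysctl file."""
    with open(path) as f:
        return int(f.read().strip())


def aslr_enabled(value):
    """True for 1 (stack, mmap base, VDSO) or 2 (also the heap)."""
    return value in (1, 2)
```

On a Linux system, `aslr_enabled(read_randomize_va_space())` reports whether the recommendation in the quoted paragraph is met.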
+APIs for mitigation control of user process
+-------------------------------------------
+When enabling the "prctl" option for spectre_v2_user boot parameter, +prctl can be used to restrict indirect branch speculation on a process. +See Documenation/speculation.txt for detailed API.
See above.
+Processes containing secrets, such as cryptographic keys, may invoke +this prctl for extra protection against Spectre v2.
+Before running untrusted processes, restricting their indirect branch +speculation will prevent such processes from launching Spectre v2 attacks.
+Restricting indirect branch speuclation on a process should be only used +as needed, as restricting speculation reduces both performance of the +process, and also process running on the sibling CPU thread.
+Under the "seccomp" option, the processes sandboxed with SECCOMP will +have indirect branch speculation restricted automatically.
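The per-process control described above can be sketched from user space as follows. This is a hedged Python illustration using ctypes, not a reference implementation. The numeric constants are copied from my reading of linux/prctl.h and should be checked against your kernel headers; the helper names are made up for this example. To my knowledge the authoritative description of this prctl lives in Documentation/userspace-api/spec_ctrl.rst.

```python
import ctypes
import os

# Constants as I recall them from <linux/prctl.h>; verify before relying on them.
PR_GET_SPECULATION_CTRL = 52
PR_SET_SPECULATION_CTRL = 53
PR_SPEC_INDIRECT_BRANCH = 1
PR_SPEC_PRCTL = 1 << 0          # per-task control is available
PR_SPEC_ENABLE = 1 << 1         # speculation enabled, mitigation off
PR_SPEC_DISABLE = 1 << 2        # speculation disabled, mitigation on
PR_SPEC_FORCE_DISABLE = 1 << 3  # as above, and cannot be undone


def _prctl(option, arg2, arg3):
    """Call prctl(2) through libc; raises OSError on failure (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    ret = libc.prctl(ctypes.c_int(option), ctypes.c_ulong(arg2),
                     ctypes.c_ulong(arg3), ctypes.c_ulong(0),
                     ctypes.c_ulong(0))
    if ret < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return ret


def get_indirect_branch_state():
    """Return the raw speculation state bits for the calling task."""
    return _prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, 0)


def restrict_indirect_branch_speculation():
    """Ask the kernel to restrict indirect branch speculation for this task."""
    _prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE)


def describe_spec_state(value):
    """Turn the bitmask returned by the GET call into readable text."""
    if value == 0:
        return "not affected"
    parts = []
    if value & PR_SPEC_PRCTL:
        parts.append("per-task control available")
    if value & PR_SPEC_FORCE_DISABLE:
        parts.append("speculation force-disabled")
    elif value & PR_SPEC_DISABLE:
        parts.append("speculation disabled")
    elif value & PR_SPEC_ENABLE:
        parts.append("speculation enabled")
    return ", ".join(parts)
```

A process holding secrets might call `restrict_indirect_branch_speculation()` early and then print `describe_spec_state(get_indirect_branch_state())` to confirm the result. As the quoted text warns, this costs performance for the task and for the sibling CPU thread, so it should be used only where needed.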
This whole section needs a lot of care and is incomplete and partially misleading.
Also please follow the L1TF documentation which explains for each of the mitigation modes which kind of attacks are prevented and which holes remain.
It's a good start but far from where it should be.
Thanks,
tglx
Tim,
On Wed, 30 Jan 2019, Thomas Gleixner wrote:
> Also please follow the L1TF documentation which explains for each of the mitigation modes which kind of attacks are prevented and which holes remain.
> It's a good start but far from where it should be.
what's the state of this?
Thanks,
tglx
On 2/12/19 4:00 AM, Thomas Gleixner wrote:
> Tim,
> On Wed, 30 Jan 2019, Thomas Gleixner wrote:
> > Also please follow the L1TF documentation which explains for each of the mitigation modes which kind of attacks are prevented and which holes remain.
> > It's a good start but far from where it should be.
> what's the state of this?
I was pulled to work on some other tasks. Will come back to updating the document later in the week.
Sorry for the delay.
Thanks.
Tim