The code for detecting CPUs that are vulnerable to Spectre BHB was based on a hardcoded list of CPU IDs that were known to be affected. Unfortunately, the list mostly only contained the IDs of standard ARM cores. The IDs for many cores that are minor variants of the standard ARM cores (like many Qualcomm Kyro CPUs) weren't listed. This led the code to assume that those variants were not affected.
Flip the code on its head and instead list CPU IDs for cores that are known to be _not_ affected. Now CPUs will be assumed vulnerable until added to the list saying that they're safe.
As of right now, the only CPU IDs added to the "unaffected" list are ARM Cortex A35, A53, and A55. This list was created by looking at older cores listed in cputype.h that weren't listed in the "affected" list previously.
Unfortunately, while this solution is better than what we had before, it's still an imperfect solution. Specifically there are two ways to mitigate Spectre BHB and one of those ways is parameterized with a "k" value indicating how many loops are needed to mitigate. If we have an unknown CPU ID then we've got to guess about how to mitigate it. Since more cores seem to be mitigated by looping (and because it's unlikely that the needed FW code will be in place for FW mitigation for unknown cores), we'll choose looping for unknown CPUs and choose the highest "k" value of 32.
The downside of our guessing is that some CPUs may now report as "mitigated" when in reality they should need a firmware mitigation. We'll choose to put a WARN_ON splat in the logs in this case any time we had to make a guess since guessing the right mitigation is pretty awful. Hopefully this will encourage CPU vendors to add their CPU IDs to the list.
Fixes: 558c303c9734 ("arm64: Mitigate spectre style branch history side channels") Cc: stable@vger.kernel.org Signed-off-by: Douglas Anderson dianders@chromium.org ---
Changes in v2: - New
arch/arm64/kernel/proton-pack.c | 46 +++++++++++++++++++++++++++------ 1 file changed, 38 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/kernel/proton-pack.c b/arch/arm64/kernel/proton-pack.c index da53722f95d4..39c5573c7527 100644 --- a/arch/arm64/kernel/proton-pack.c +++ b/arch/arm64/kernel/proton-pack.c @@ -841,13 +841,31 @@ enum bhb_mitigation_bits { }; static unsigned long system_bhb_mitigations;
+static const struct midr_range spectre_bhb_firmware_mitigated_list[] = { + MIDR_ALL_VERSIONS(MIDR_CORTEX_A73), + MIDR_ALL_VERSIONS(MIDR_CORTEX_A75), + {}, +}; + +static const struct midr_range spectre_bhb_safe_list[] = { + MIDR_ALL_VERSIONS(MIDR_CORTEX_A35), + MIDR_ALL_VERSIONS(MIDR_CORTEX_A53), + MIDR_ALL_VERSIONS(MIDR_CORTEX_A55), + {}, +}; + /* * This must be called with SCOPE_LOCAL_CPU for each type of CPU, before any * SCOPE_SYSTEM call will give the right answer. + * + * NOTE: Unknown CPUs are reported as affected. In order to make this work + * and still keep the list short, only handle CPUs where: + * - supports_csv2p3() returned false + * - supports_clearbhb() returned false. */ u8 spectre_bhb_loop_affected(int scope) { - u8 k = 0; + u8 k; static u8 max_bhb_k;
if (scope == SCOPE_LOCAL_CPU) { @@ -886,6 +904,16 @@ u8 spectre_bhb_loop_affected(int scope) k = 11; else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k8_list)) k = 8; + else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_safe_list) || + is_midr_in_range_list(read_cpuid_id(), spectre_bhb_firmware_mitigated_list)) + k = 0; + else { + WARN_ONCE(true, + "Unrecognized CPU %#010x, assuming Spectre BHB vulnerable\n", + read_cpuid_id()); + /* Hopefully k = 32 handles the worst case for unknown CPUs */ + k = 32; + }
max_bhb_k = max(max_bhb_k, k); } else { @@ -916,24 +944,26 @@ static enum mitigation_state spectre_bhb_get_cpu_fw_mitigation_state(void) } }
+/* + * NOTE: Unknown CPUs are reported as affected. In order to make this work + * and still keep the list short, only handle CPUs where: + * - supports_csv2p3() returned false + * - supports_clearbhb() returned false. + * - spectre_bhb_loop_affected() returned 0. + */ static bool is_spectre_bhb_fw_affected(int scope) { static bool system_affected; enum mitigation_state fw_state; bool has_smccc = arm_smccc_1_1_get_conduit() != SMCCC_CONDUIT_NONE; - static const struct midr_range spectre_bhb_firmware_mitigated_list[] = { - MIDR_ALL_VERSIONS(MIDR_CORTEX_A73), - MIDR_ALL_VERSIONS(MIDR_CORTEX_A75), - {}, - }; bool cpu_in_list = is_midr_in_range_list(read_cpuid_id(), - spectre_bhb_firmware_mitigated_list); + spectre_bhb_safe_list);
if (scope != SCOPE_LOCAL_CPU) return system_affected;
fw_state = spectre_bhb_get_cpu_fw_mitigation_state(); - if (cpu_in_list || (has_smccc && fw_state == SPECTRE_MITIGATED)) { + if (!cpu_in_list || (has_smccc && fw_state == SPECTRE_MITIGATED)) { system_affected = true; return true; }
I feel like this patch is maybe taking a bit of a wrong angle at achieving what you're trying to do. The code seems to be structured in a way that it has separate functions to test for each of the possible mitigation options one at a time, and pushing the default case into spectre_bhb_loop_affected() feels like a bit of a layering violation. I think it would work the way you wrote it, but it depends heavily on the order functions are called in is_spectre_bhb_affected(), which seems counter-intuitive with the way the functions seem to be designed as independent checks.
What do you think about an approach like this instead:
- Refactor max_bhb_k in spectre_bhb_loop_affected() to be a global instead, which starts out as zero, is updated by spectre_bhb_loop_affected(), and is directly read by spectre_bhb_patch_loop_iter() (could probably remove the `scope` argument from spectre_bhb_loop_affected() then).
- Add a function is_spectre_bhb_safe() that just checks if the MIDR is in the new list you're adding
- Add an `if (is_spectre_bhb_safe()) return false;` to the top of is_spectre_bhb_affected
- Change the `return false` into `return true` at the end of is_spectre_bhb_affected (in fact, you can probably take out some of the other calls that result in returning true as well then)
- In spectre_bhb_enable_mitigations(), at the end of the long if-else chain, put a last else block that prints your WARN_ONCE(), sets the max_bhb_k global to 32, and then does the same stuff that the `if (spectre_bhb_loop_affected())` block would have installed (maybe factoring that out into a helper function called from both cases).
I think that should implement the same "assume unsafe by default unless explicitly listed as safe, check for all other mitigations first, then default to k=32 loop if none found" approach, but makes it a bit more obvious in the code.
Hi,
On Fri, Dec 13, 2024 at 6:25 PM Julius Werner jwerner@chromium.org wrote:
I feel like this patch is maybe taking a bit of a wrong angle at achieving what you're trying to do. The code seems to be structured in a way that it has separate functions to test for each of the possible mitigation options one at a time, and pushing the default case into spectre_bhb_loop_affected() feels like a bit of a layering violation. I think it would work the way you wrote it, but it depends heavily on the order functions are called in is_spectre_bhb_affected(), which seems counter-intuitive with the way the functions seem to be designed as independent checks.
What do you think about an approach like this instead:
- Refactor max_bhb_k in spectre_bhb_loop_affected() to be a global
instead, which starts out as zero, is updated by spectre_bhb_loop_affected(), and is directly read by spectre_bhb_patch_loop_iter() (could probably remove the `scope` argument from spectre_bhb_loop_affected() then).
Refactoring "max_bhb_k" would be a general cleanup and not related to anything else here, right?
...but the function is_spectre_bhb_affected() is called from "cpu_errata.c" and has a scope. Can we guarantee that it's always "SCOPE_LOCAL_CPU"? I tried reading through the code and it's _probably_ SCOPE_LOCAL_CPU most of the time, but it doesn't seem worth it to add an assumption here for a small cleanup.
I'm not going to do this, though I will move "max_bhb_k" to be a global for the suggestion below.
- Add a function is_spectre_bhb_safe() that just checks if the MIDR is
in the new list you're adding
- Add an `if (is_spectre_bhb_safe()) return false;` to the top of
is_spectre_bhb_affected
That seems reasonable to do and lets me get rid of the "safe" checks from is_spectre_bhb_fw_affected() and spectre_bhb_loop_affected(), so it sounds good.
- Change the `return false` into `return true` at the end of
is_spectre_bhb_affected (in fact, you can probably take out some of the other calls that result in returning true as well then)
I'm not sure you can take out the other calls. Specifically, both spectre_bhb_loop_affected() and is_spectre_bhb_fw_affected() _need_ to be called for each CPU so that they update static globals, right? Maybe we could get rid of the call to supports_clearbhb(), but that _would_ change things in ways that are not obvious. Specifically I could believe that someone could have backported "clear BHB" to their core but their core is otherwise listed as "loop affected". That would affect "max_bhb_k". Maybe (?) it doesn't matter in this case, but I'd rather not mess with it unless someone really wants me to and is sure it's safe.
- In spectre_bhb_enable_mitigations(), at the end of the long if-else
chain, put a last else block that prints your WARN_ONCE(), sets the max_bhb_k global to 32, and then does the same stuff that the `if (spectre_bhb_loop_affected())` block would have installed (maybe factoring that out into a helper function called from both cases).
...or just reorder it so that the spectre_bhb_loop_affected() code is last and it can be the "else". Then I can WARN_ONCE() if spectre_bhb_loop_affected(). ...or I could just do the WARN_ONCE() when I get to the end of is_spectre_bhb_affected() and set "max_bhb_k" to 32 there. I'd actually rather do that so that "max_bhb_k" is consistently set after is_spectre_bhb_affected() is called on all cores regardless of whether or not some cores are unknown.
- Refactor max_bhb_k in spectre_bhb_loop_affected() to be a global
instead, which starts out as zero, is updated by spectre_bhb_loop_affected(), and is directly read by spectre_bhb_patch_loop_iter() (could probably remove the `scope` argument from spectre_bhb_loop_affected() then).
Refactoring "max_bhb_k" would be a general cleanup and not related to anything else here, right?
...but the function is_spectre_bhb_affected() is called from "cpu_errata.c" and has a scope. Can we guarantee that it's always "SCOPE_LOCAL_CPU"? I tried reading through the code and it's _probably_ SCOPE_LOCAL_CPU most of the time, but it doesn't seem worth it to add an assumption here for a small cleanup.
I'm not going to do this, though I will move "max_bhb_k" to be a global for the suggestion below.
If you make max_bhb_k a global, then whether you change all `spectre_bhb_loop_affected(SCOPE_SYSTEM)` calls to read the global directly or whether you keep it such that `spectre_bhb_loop_affected()` simply returns that global for SCOPE_SYSTEM makes no difference. I just think reading the global directly would look a bit cleaner. Calling a function that's called "...affected()" when you're really just trying to find out the K-value feels a bit odd.
For is_spectre_bhb_affected(), I was assuming the change below where you combine all the `return true` paths, so the scope question wouldn't matter there.
- Change the `return false` into `return true` at the end of
is_spectre_bhb_affected (in fact, you can probably take out some of the other calls that result in returning true as well then)
I'm not sure you can take out the other calls. Specifically, both spectre_bhb_loop_affected() and is_spectre_bhb_fw_affected() _need_ to be called for each CPU so that they update static globals, right? Maybe we could get rid of the call to supports_clearbhb(), but that _would_ change things in ways that are not obvious. Specifically I could believe that someone could have backported "clear BHB" to their core but their core is otherwise listed as "loop affected". That would affect "max_bhb_k". Maybe (?) it doesn't matter in this case, but I'd rather not mess with it unless someone really wants me to and is sure it's safe.
Yes, but spectre_bhb_enable_mitigation() already calls all those functions on its own again anyway, so I'm pretty sure the "must be called once for each CPU" part of spectre_bhb_loop_affected() is covered by that. (Besides, it would be really awful if they had made a function whose name starts with is_... have critical side-effects that break things when it doesn't get called.)
- In spectre_bhb_enable_mitigations(), at the end of the long if-else
chain, put a last else block that prints your WARN_ONCE(), sets the max_bhb_k global to 32, and then does the same stuff that the `if (spectre_bhb_loop_affected())` block would have installed (maybe factoring that out into a helper function called from both cases).
...or just reorder it so that the spectre_bhb_loop_affected() code is last and it can be the "else". Then I can WARN_ONCE() if spectre_bhb_loop_affected(). ...or I could just do the WARN_ONCE() when I get to the end of is_spectre_bhb_affected() and set "max_bhb_k" to 32 there. I'd actually rather do that so that "max_bhb_k" is consistently set after is_spectre_bhb_affected() is called on all cores regardless of whether or not some cores are unknown.
Yeah, you can reorder the loops too. I don't feel like moving this into is_spectre_bhb_affected() would be a good idea. Functions with a predicate name like that really shouldn't have such side effects. Besides, I think is_spectre_bhb_affected() is probably called more often per CPU, both once from spectre_bhb_enable_mitigation() and by whatever calls the `matches` pointer in the cpu_errata struct. spectre_bhb_enable_mitigation() seems to be the function that's called once for each CPU on boot to install the correct mitigation, so that feels like the best spot to put the fallback logic to me.
Hi,
On Tue, Dec 17, 2024 at 5:25 AM Julius Werner jwerner@chromium.org wrote:
- Refactor max_bhb_k in spectre_bhb_loop_affected() to be a global
instead, which starts out as zero, is updated by spectre_bhb_loop_affected(), and is directly read by spectre_bhb_patch_loop_iter() (could probably remove the `scope` argument from spectre_bhb_loop_affected() then).
Refactoring "max_bhb_k" would be a general cleanup and not related to anything else here, right?
...but the function is_spectre_bhb_affected() is called from "cpu_errata.c" and has a scope. Can we guarantee that it's always "SCOPE_LOCAL_CPU"? I tried reading through the code and it's _probably_ SCOPE_LOCAL_CPU most of the time, but it doesn't seem worth it to add an assumption here for a small cleanup.
I'm not going to do this, though I will move "max_bhb_k" to be a global for the suggestion below.
If you make max_bhb_k a global, then whether you change all `spectre_bhb_loop_affected(SCOPE_SYSTEM)` calls to read the global directly or whether you keep it such that `spectre_bhb_loop_affected()` simply returns that global for SCOPE_SYSTEM makes no difference. I just think reading the global directly would look a bit cleaner. Calling a function that's called "...affected()" when you're really just trying to find out the K-value feels a bit odd.
For is_spectre_bhb_affected(), I was assuming the change below where you combine all the `return true` paths, so the scope question wouldn't matter there.
Ah, right. OK.
- Change the `return false` into `return true` at the end of
is_spectre_bhb_affected (in fact, you can probably take out some of the other calls that result in returning true as well then)
I'm not sure you can take out the other calls. Specifically, both spectre_bhb_loop_affected() and is_spectre_bhb_fw_affected() _need_ to be called for each CPU so that they update static globals, right? Maybe we could get rid of the call to supports_clearbhb(), but that _would_ change things in ways that are not obvious. Specifically I could believe that someone could have backported "clear BHB" to their core but their core is otherwise listed as "loop affected". That would affect "max_bhb_k". Maybe (?) it doesn't matter in this case, but I'd rather not mess with it unless someone really wants me to and is sure it's safe.
Yes, but spectre_bhb_enable_mitigation() already calls all those functions on its own again anyway, so I'm pretty sure the "must be called once for each CPU" part of spectre_bhb_loop_affected() is covered by that. (Besides, it would be really awful if they had made a function whose name starts with is_... have critical side-effects that break things when it doesn't get called.)
The existing predicates already do change globals before my patch and changing that is outside of the scope of what I'm willing to entertain with my patchset
Given that I'm not going to change the way the existing predicates work, if I move the "fallback" setting `max_bhb_k` to 32 to spectre_bhb_enable_mitigation() then when we set `max_bhb_k` becomes inconsistent between recognized and unrecognized CPUs. One gets set in the predicate and one doesn't. Even if it works, this inconsistency feels like bad design to me. Also, setting `max_bhb_k` to the max at the end of is_spectre_bhb_affected() would perhaps _help_ someone realize that the predicate has side effects because they'd see it in the function itself and not have to dig down.
I would also say that having `max_bhb_k` get set in an inconsistent place opens us up for bugs in the future. Even if it works today, I imagine someone could change things in the future such that spectre_bhb_enable_mitigation() reads `max_bhb_k` and essentially caches it (maybe it constructs an instruction based on it). If that happened things could be subtly broken for the "unrecognized CPU" case because the first CPU would "cache" the value without it having been called on all CPUs.
In case you can't tell, I'm still not convinced and will plan to keep setting `max_bhb_k = 32` in is_spectre_bhb_affected().
- In spectre_bhb_enable_mitigations(), at the end of the long if-else
chain, put a last else block that prints your WARN_ONCE(), sets the max_bhb_k global to 32, and then does the same stuff that the `if (spectre_bhb_loop_affected())` block would have installed (maybe factoring that out into a helper function called from both cases).
...or just reorder it so that the spectre_bhb_loop_affected() code is last and it can be the "else". Then I can WARN_ONCE() if spectre_bhb_loop_affected(). ...or I could just do the WARN_ONCE() when I get to the end of is_spectre_bhb_affected() and set "max_bhb_k" to 32 there. I'd actually rather do that so that "max_bhb_k" is consistently set after is_spectre_bhb_affected() is called on all cores regardless of whether or not some cores are unknown.
Yeah, you can reorder the loops too. I don't feel like moving this into is_spectre_bhb_affected() would be a good idea. Functions with a predicate name like that really shouldn't have such side effects. Besides, I think is_spectre_bhb_affected() is probably called more often per CPU, both once from spectre_bhb_enable_mitigation() and by whatever calls the `matches` pointer in the cpu_errata struct. spectre_bhb_enable_mitigation() seems to be the function that's called once for each CPU on boot to install the correct mitigation, so that feels like the best spot to put the fallback logic to me.
As per above, while I agree that having predicate functions w/ side effects is not ideal, that predates my patch series and I'd rather things work consistently.
-Doug
Given that I'm not going to change the way the existing predicates work, if I move the "fallback" setting `max_bhb_k` to 32 to spectre_bhb_enable_mitigation() then when we set `max_bhb_k` becomes inconsistent between recognized and unrecognized CPUs.
A clean way to fix that could be to change spectre_bhb_loop_affected() to just return the K-value (rather than change max_bhb_k directly), and then set max_bhb_k to the max() of that return value and the existing value when it is called from spectre_bhb_enable_mitigation(). That way, max_bhb_k would only be updated from spectre_bhb_enable_mitigation().
I would also say that having `max_bhb_k` get set in an inconsistent place opens us up for bugs in the future. Even if it works today, I imagine someone could change things in the future such that spectre_bhb_enable_mitigation() reads `max_bhb_k` and essentially caches it (maybe it constructs an instruction based on it). If that happened things could be subtly broken for the "unrecognized CPU" case because the first CPU would "cache" the value without it having been called on all CPUs.
This would likely already be a problem with the current code, since spectre_bhb_enable_mitigations() is called one at a time for each CPU and the max_bhb_k value is only valid once it has been called on all CPUs. If someone tried to just immediately use the value inside the function that would just be a bug either way. (Right now this should not be a problem because max_bhb_k is only used by spectre_bhb_patch_loop_iter() which ends up being called through apply_alternatives_all(), which should only run after all those CPU errata cpu_enable callbacks have been called.)
Hi,
On Wed, Dec 18, 2024 at 8:36 AM Julius Werner jwerner@chromium.org wrote:
Given that I'm not going to change the way the existing predicates work, if I move the "fallback" setting `max_bhb_k` to 32 to spectre_bhb_enable_mitigation() then when we set `max_bhb_k` becomes inconsistent between recognized and unrecognized CPUs.
A clean way to fix that could be to change spectre_bhb_loop_affected() to just return the K-value (rather than change max_bhb_k directly), and then set max_bhb_k to the max() of that return value and the existing value when it is called from spectre_bhb_enable_mitigation(). That way, max_bhb_k would only be updated from spectre_bhb_enable_mitigation().
Sure, you could do that. ...but in my patch series I need to add _another_ static boolean that's updated in is_spectre_bhb_affected() anyway. I need to add one to the new is_spectre_bhb_safe() function that you requested. Specifically, the moment I detect any CPU ID that's not in the "safe" list then I need to note that down. If I don't do that then later calls to is_spectre_bhb_affected(SCOPE_SYSTEM) will return the wrong value. So while all the other "static" caches in is_spectre_bhb_affected() could be removed because we changed the default return to "true", the one in is_spectre_bhb_safe() (which causes the function to return "false") can't be removed.
In any case, the predicates updating the static caches predates my series and IMO my series doesn't make it worse. If you want to post a followup series changing how the predicates work and can convince others that it's worth doing then great, but I don't think it should block forward progress here.
I would also say that having `max_bhb_k` get set in an inconsistent place opens us up for bugs in the future. Even if it works today, I imagine someone could change things in the future such that spectre_bhb_enable_mitigation() reads `max_bhb_k` and essentially caches it (maybe it constructs an instruction based on it). If that happened things could be subtly broken for the "unrecognized CPU" case because the first CPU would "cache" the value without it having been called on all CPUs.
This would likely already be a problem with the current code, since spectre_bhb_enable_mitigations() is called one at a time for each CPU and the max_bhb_k value is only valid once it has been called on all CPUs. If someone tried to just immediately use the value inside the function that would just be a bug either way. (Right now this should not be a problem because max_bhb_k is only used by spectre_bhb_patch_loop_iter() which ends up being called through apply_alternatives_all(), which should only run after all those CPU errata cpu_enable callbacks have been called.)
Actually, today is_spectre_bhb_affected() is called much earlier I believe. It's installed (via cpu_errata.c) and called like this:
is_spectre_bhb_affected+0x124/0x2d8 update_cpu_capabilities+0xa0/0x158 setup_boot_cpu_capabilities+0x20/0x40 setup_boot_cpu_features+0x38/0x50 smp_prepare_boot_cpu+0x38/0x60 start_kernel+0x90/0x438
...but then spectre_bhb_enable_mitigations() is called later and by that time is_spectre_bhb_affected() should have been called for each of the CPUs:
spectre_bhb_enable_mitigation+0xbc/0x340 cpu_enable_non_boot_scope_capabilities+0x74/0xc8 multi_cpu_stop+0xf0/0x1b8 cpu_stopper_thread+0xac/0x148 smpboot_thread_fn+0xb0/0x238
I agree that it doesn't seem to be a problem today, though.
-Doug
On Fri, Dec 13, 2024 at 04:52:02PM -0800, Douglas Anderson wrote:
The code for detecting CPUs that are vulnerable to Spectre BHB was based on a hardcoded list of CPU IDs that were known to be affected. Unfortunately, the list mostly only contained the IDs of standard ARM cores. The IDs for many cores that are minor variants of the standard ARM cores (like many Qualcomm Kyro CPUs) weren't listed. This led the code to assume that those variants were not affected.
Flip the code on its head and instead list CPU IDs for cores that are known to be _not_ affected. Now CPUs will be assumed vulnerable until added to the list saying that they're safe.
As of right now, the only CPU IDs added to the "unaffected" list are ARM Cortex A35, A53, and A55. This list was created by looking at older cores listed in cputype.h that weren't listed in the "affected" list previously.
There's a list of affected CPUs from Arm here:
https://developer.arm.com/Arm%20Security%20Center/Spectre-BHB
(obviously only covers their own designs).
So it looks like A510 and A520 should be unaffected too, although I didn't check exhaustively. It also looks like A715 is affected but the whitepaper doesn't tell you what version of 'k' to use...
Unfortunately, while this solution is better than what we had before, it's still an imperfect solution. Specifically there are two ways to mitigate Spectre BHB and one of those ways is parameterized with a "k" value indicating how many loops are needed to mitigate. If we have an unknown CPU ID then we've got to guess about how to mitigate it. Since more cores seem to be mitigated by looping (and because it's unlikely that the needed FW code will be in place for FW mitigation for unknown cores), we'll choose looping for unknown CPUs and choose the highest "k" value of 32.
I don't think we should guess. Just say vulnerable.
The downside of our guessing is that some CPUs may now report as "mitigated" when in reality they should need a firmware mitigation. We'll choose to put a WARN_ON splat in the logs in this case any time we had to make a guess since guessing the right mitigation is pretty awful. Hopefully this will encourage CPU vendors to add their CPU IDs to the list.
Hmm. We shouldn't have to guess here as the firmware mitigation is discoverable. So if it's unavailable and we're running an a CPU which needs it, then we're vulnerable.
Will
Hi,
On Thu, Dec 19, 2024 at 9:51 AM Will Deacon will@kernel.org wrote:
On Fri, Dec 13, 2024 at 04:52:02PM -0800, Douglas Anderson wrote:
The code for detecting CPUs that are vulnerable to Spectre BHB was based on a hardcoded list of CPU IDs that were known to be affected. Unfortunately, the list mostly only contained the IDs of standard ARM cores. The IDs for many cores that are minor variants of the standard ARM cores (like many Qualcomm Kyro CPUs) weren't listed. This led the code to assume that those variants were not affected.
Flip the code on its head and instead list CPU IDs for cores that are known to be _not_ affected. Now CPUs will be assumed vulnerable until added to the list saying that they're safe.
As of right now, the only CPU IDs added to the "unaffected" list are ARM Cortex A35, A53, and A55. This list was created by looking at older cores listed in cputype.h that weren't listed in the "affected" list previously.
There's a list of affected CPUs from Arm here:
https://developer.arm.com/Arm%20Security%20Center/Spectre-BHB
(obviously only covers their own designs).
So it looks like A510 and A520 should be unaffected too, although I didn't check exhaustively. It also looks like A715 is affected but the whitepaper doesn't tell you what version of 'k' to use...
Unfortunately, while this solution is better than what we had before, it's still an imperfect solution. Specifically there are two ways to mitigate Spectre BHB and one of those ways is parameterized with a "k" value indicating how many loops are needed to mitigate. If we have an unknown CPU ID then we've got to guess about how to mitigate it. Since more cores seem to be mitigated by looping (and because it's unlikely that the needed FW code will be in place for FW mitigation for unknown cores), we'll choose looping for unknown CPUs and choose the highest "k" value of 32.
I don't think we should guess. Just say vulnerable.
Ah, I see. So the series won't actually _fix_ anyone, it will just properly report that we're vulnerable. I guess that works.
The downside of our guessing is that some CPUs may now report as "mitigated" when in reality they should need a firmware mitigation. We'll choose to put a WARN_ON splat in the logs in this case any time we had to make a guess since guessing the right mitigation is pretty awful. Hopefully this will encourage CPU vendors to add their CPU IDs to the list.
Hmm. We shouldn't have to guess here as the firmware mitigation is discoverable. So if it's unavailable and we're running an a CPU which needs it, then we're vulnerable.
Sure.
-Doug
Hi,
On Thu, Dec 19, 2024 at 9:51 AM Will Deacon will@kernel.org wrote:
As of right now, the only CPU IDs added to the "unaffected" list are ARM Cortex A35, A53, and A55. This list was created by looking at older cores listed in cputype.h that weren't listed in the "affected" list previously.
There's a list of affected CPUs from Arm here:
https://developer.arm.com/Arm%20Security%20Center/Spectre-BHB
(obviously only covers their own designs).
So it looks like A510 and A520 should be unaffected too, although I didn't check exhaustively.
I was hoping that newer cores would hit the supports_csv2p3() check and be considered safe, but I guess the white paper explicitly says that A510 doesn't implement it and is still considered safe. I looked up the TRM of A520 and it looks like it also doesn't set CSV2P3, so I guess I'll add that one too.
It also looks like A715 is affected but the whitepaper doesn't tell you what version of 'k' to use...
It doesn't? I see a "k" of 38 there. Wow, and Neoverse N2 needs 132!!!
...though I guess on newer steppings of those chips they'll just use "clear BHB", which seems available and is the preferred mitigation?
linux-stable-mirror@lists.linaro.org