On Mon, May 01, 2023 at 11:49:55AM -0700, Linus Torvalds wrote:
The bug goes back to commit 0db7058e8e23 ("x86/clear_user: Make it faster") from about a year ago, which made it into v6.1.
Gah, sorry about that. :-\
It only affects old hardware that doesn't have the ERMS capability flag, which *probably* means that it's mostly only triggerable in virtualization (since pretty much any CPU from the last decade has ERMS, afaik).
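For anyone who wants to check whether a given machine or guest falls into that no-ERMS class, here is a minimal sketch. It assumes a Linux system where /proc/cpuinfo lists the capability as "erms" on its "flags" line; has_cpu_flag is a hypothetical helper written for this note, not an existing tool.

```shell
#!/bin/sh
# has_cpu_flag FLAG [FILE]
# Succeeds if FLAG appears as a whole word on the first "flags" line of
# FILE (default: /proc/cpuinfo). Hypothetical helper, for illustration only.
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # -m1: only the first CPU's flags line; -w: whole-word match, so
    # "rep" does not match "rep_good".
    grep -m1 '^flags' "$file" 2>/dev/null | grep -qw -- "$flag"
}

if has_cpu_flag erms; then
    echo "erms: yes"
else
    echo "erms: no"
fi
```

Inside a VM the result reflects the CPU model the hypervisor exposes, which may hide ERMS even when the host CPU has it.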
Borislav - opinions? This needs fixing for v6.1..v6.3, and the options are:
(1) just fix up the exception entry. I think this is literally this one-liner, but somebody should double-check me. I did *not* actually test this:
--- a/arch/x86/lib/clear_page_64.S
+++ b/arch/x86/lib/clear_page_64.S
@@ -142,8 +142,8 @@ SYM_FUNC_START(clear_user_rep_good)
 	and $7, %edx
 	jz .Lrep_good_exit
 
-.Lrep_good_bytes:
 	mov %edx, %ecx
+.Lrep_good_bytes:
 	rep stosb
 
 .Lrep_good_exit:
because the only use of '.Lrep_good_bytes' is that exception table entry.
(2) backport just that one commit for clear_user
In this case we should probably do commit e046fe5a36a9 ("x86: set FSRS automatically on AMD CPUs that have FSRM") too, since that commit changes the decision to use 'rep stosb' to check FSRS.
(3) backport the entire series of commits:
git log --oneline v6.3..034ff37d3407
Or we could even revert that commit 0db7058e8e23, but it seems silly to revert when we have so many ways to fix it, including a one-line code movement.
Borislav / stable people? Opinions?
So right now I feel like (3) would be the right thing to do. Because then stable and upstream will be on the same "level" wrt user-accessing primitives. And it's not like your series depends on anything from mainline (that I know of) so backporting it should be relatively easy.
But (1) is definitely a lot easier for stable people modulo the fact that it won't be an upstream commit but a special stable-only fix.
So yeah, in that order.
I guess I'd let stable people decide here what they wanna do.
Thx.