The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 1a15bb8303b6b104e78028b6c68f76a0d4562134
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025040811-capable-unblock-8997@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1a15bb8303b6b104e78028b6c68f76a0d4562134 Mon Sep 17 00:00:00 2001
From: Shuai Xue <xueshuai(a)linux.alibaba.com>
Date: Wed, 12 Mar 2025 19:28:50 +0800
Subject: [PATCH] x86/mce: use is_copy_from_user() to determine copy-from-user
context
Patch series "mm/hwpoison: Fix regressions in memory failure handling",
v4.
## 1. What am I trying to do:
This patchset resolves two critical regressions related to memory failure
handling that have appeared in the upstream kernel since version 5.17, as
compared to 5.10 LTS.
- copyin case: poison found in user page while kernel copying from user space
- instr case: poison found while instruction fetching in user space
## 2. What is the expected outcome and why
- For copyin case:
Kernel can recover from poison found where kernel is doing get_user() or
copy_from_user() if those places get an error return and the kernel return
-EFAULT to the process instead of crashing. More specifily, MCE handler
checks the fixup handler type to decide whether an in kernel #MC can be
recovered. When EX_TYPE_UACCESS is found, the PC jumps to recovery code
specified in _ASM_EXTABLE_FAULT() and return a -EFAULT to user space.
- For instr case:
If a poison found while instruction fetching in user space, full recovery
is possible. User process takes #PF, Linux allocates a new page and fills
by reading from storage.
## 3. What actually happens and why
- For copyin case: kernel panic since v5.17
Commit 4c132d1d844a ("x86/futex: Remove .fixup usage") introduced a new
extable fixup type, EX_TYPE_EFAULT_REG, and later patches updated the
extable fixup type for copy-from-user operations, changing it from
EX_TYPE_UACCESS to EX_TYPE_EFAULT_REG. It breaks previous EX_TYPE_UACCESS
handling when posion found in get_user() or copy_from_user().
- For instr case: user process is killed by a SIGBUS signal due to #CMCI
and #MCE race
When an uncorrected memory error is consumed there is a race between the
CMCI from the memory controller reporting an uncorrected error with a UCNA
signature, and the core reporting and SRAR signature machine check when
the data is about to be consumed.
### Background: why *UN*corrected errors tied to *C*MCI in Intel platform [1]
Prior to Icelake memory controllers reported patrol scrub events that
detected a previously unseen uncorrected error in memory by signaling a
broadcast machine check with an SRAO (Software Recoverable Action
Optional) signature in the machine check bank. This was overkill because
it's not an urgent problem that no core is on the verge of consuming that
bad data. It's also found that multi SRAO UCE may cause nested MCE
interrupts and finally become an IERR.
Hence, Intel downgrades the machine check bank signature of patrol scrub
from SRAO to UCNA (Uncorrected, No Action required), and signal changed to
#CMCI. Just to add to the confusion, Linux does take an action (in
uc_decode_notifier()) to try to offline the page despite the UC*NA*
signature name.
### Background: why #CMCI and #MCE race when poison is consuming in
Intel platform [1]
Having decided that CMCI/UCNA is the best action for patrol scrub errors,
the memory controller uses it for reads too. But the memory controller is
executing asynchronously from the core, and can't tell the difference
between a "real" read and a speculative read. So it will do CMCI/UCNA if
an error is found in any read.
Thus:
1) Core is clever and thinks address A is needed soon, issues a
speculative read.
2) Core finds it is going to use address A soon after sending the read
request
3) The CMCI from the memory controller is in a race with MCE from the
core that will soon try to retire the load from address A.
Quite often (because speculation has got better) the CMCI from the memory
controller is delivered before the core is committed to the instruction
reading address A, so the interrupt is taken, and Linux offlines the page
(marking it as poison).
## Why user process is killed for instr case
Commit 046545a661af ("mm/hwpoison: fix error page recovered but reported
"not recovered"") tries to fix noise message "Memory error not recovered"
and skips duplicate SIGBUSs due to the race. But it also introduced a bug
that kill_accessing_process() return -EHWPOISON for instr case, as result,
kill_me_maybe() send a SIGBUS to user process.
# 4. The fix, in my opinion, should be:
- For copyin case:
The key point is whether the error context is in a read from user memory.
We do not care about the ex-type if we know its a MOV reading from
userspace.
is_copy_from_user() return true when both of the following two checks are
true:
- the current instruction is copy
- source address is user memory
If copy_user is true, we set
m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_RECOV;
Then do_machine_check() will try fixup_exception() first.
- For instr case: let kill_accessing_process() return 0 to prevent a SIGBUS.
- For patch 3:
The return value of memory_failure() is quite important while discussed
instr case regression with Tony and Miaohe for patch 2, so add comment
about the return value.
This patch (of 3):
Commit 4c132d1d844a ("x86/futex: Remove .fixup usage") introduced a new
extable fixup type, EX_TYPE_EFAULT_REG, and commit 4c132d1d844a
("x86/futex: Remove .fixup usage") updated the extable fixup type for
copy-from-user operations, changing it from EX_TYPE_UACCESS to
EX_TYPE_EFAULT_REG. The error context for copy-from-user operations no
longer functions as an in-kernel recovery context. Consequently, the
error context for copy-from-user operations no longer functions as an
in-kernel recovery context, resulting in kernel panics with the message:
"Machine check: Data load in unrecoverable area of kernel."
To address this, it is crucial to identify if an error context involves a
read operation from user memory. The function is_copy_from_user() can be
utilized to determine:
- the current operation is copy
- when reading user memory
When these conditions are met, is_copy_from_user() will return true,
confirming that it is indeed a direct copy from user memory. This check
is essential for correctly handling the context of errors in these
operations without relying on the extable fixup types that previously
allowed for in-kernel recovery.
So, use is_copy_from_user() to determine if a context is copy user directly.
Link: https://lkml.kernel.org/r/20250312112852.82415-1-xueshuai@linux.alibaba.com
Link: https://lkml.kernel.org/r/20250312112852.82415-2-xueshuai@linux.alibaba.com
Fixes: 4c132d1d844a ("x86/futex: Remove .fixup usage")
Signed-off-by: Shuai Xue <xueshuai(a)linux.alibaba.com>
Suggested-by: Peter Zijlstra <peterz(a)infradead.org>
Acked-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Tested-by: Tony Luck <tony.luck(a)intel.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Borislav Betkov <bp(a)alien8.de>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Josh Poimboeuf <jpoimboe(a)kernel.org>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: Ruidong Tian <tianruidong(a)linux.alibaba.com>
Cc: Thomas Gleinxer <tglx(a)linutronix.de>
Cc: Yazen Ghannam <yazen.ghannam(a)amd.com>
Cc: Jane Chu <jane.chu(a)oracle.com>
Cc: Jarkko Sakkinen <jarkko(a)kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index dac4d64dfb2a..2235a7477436 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -300,13 +300,12 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs)
copy_user = is_copy_from_user(regs);
instrumentation_end();
- switch (fixup_type) {
- case EX_TYPE_UACCESS:
- if (!copy_user)
- return IN_KERNEL;
- m->kflags |= MCE_IN_KERNEL_COPYIN;
- fallthrough;
+ if (copy_user) {
+ m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_RECOV;
+ return IN_KERNEL_RECOV;
+ }
+ switch (fixup_type) {
case EX_TYPE_FAULT_MCE_SAFE:
case EX_TYPE_DEFAULT_MCE_SAFE:
m->kflags |= MCE_IN_KERNEL_RECOV;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 1a15bb8303b6b104e78028b6c68f76a0d4562134
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025040808-ammonia-petal-1583@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1a15bb8303b6b104e78028b6c68f76a0d4562134 Mon Sep 17 00:00:00 2001
From: Shuai Xue <xueshuai(a)linux.alibaba.com>
Date: Wed, 12 Mar 2025 19:28:50 +0800
Subject: [PATCH] x86/mce: use is_copy_from_user() to determine copy-from-user
context
Patch series "mm/hwpoison: Fix regressions in memory failure handling",
v4.
## 1. What am I trying to do:
This patchset resolves two critical regressions related to memory failure
handling that have appeared in the upstream kernel since version 5.17, as
compared to 5.10 LTS.
- copyin case: poison found in user page while kernel copying from user space
- instr case: poison found while instruction fetching in user space
## 2. What is the expected outcome and why
- For copyin case:
Kernel can recover from poison found where kernel is doing get_user() or
copy_from_user() if those places get an error return and the kernel return
-EFAULT to the process instead of crashing. More specifily, MCE handler
checks the fixup handler type to decide whether an in kernel #MC can be
recovered. When EX_TYPE_UACCESS is found, the PC jumps to recovery code
specified in _ASM_EXTABLE_FAULT() and return a -EFAULT to user space.
- For instr case:
If a poison found while instruction fetching in user space, full recovery
is possible. User process takes #PF, Linux allocates a new page and fills
by reading from storage.
## 3. What actually happens and why
- For copyin case: kernel panic since v5.17
Commit 4c132d1d844a ("x86/futex: Remove .fixup usage") introduced a new
extable fixup type, EX_TYPE_EFAULT_REG, and later patches updated the
extable fixup type for copy-from-user operations, changing it from
EX_TYPE_UACCESS to EX_TYPE_EFAULT_REG. It breaks previous EX_TYPE_UACCESS
handling when posion found in get_user() or copy_from_user().
- For instr case: user process is killed by a SIGBUS signal due to #CMCI
and #MCE race
When an uncorrected memory error is consumed there is a race between the
CMCI from the memory controller reporting an uncorrected error with a UCNA
signature, and the core reporting and SRAR signature machine check when
the data is about to be consumed.
### Background: why *UN*corrected errors tied to *C*MCI in Intel platform [1]
Prior to Icelake memory controllers reported patrol scrub events that
detected a previously unseen uncorrected error in memory by signaling a
broadcast machine check with an SRAO (Software Recoverable Action
Optional) signature in the machine check bank. This was overkill because
it's not an urgent problem that no core is on the verge of consuming that
bad data. It's also found that multi SRAO UCE may cause nested MCE
interrupts and finally become an IERR.
Hence, Intel downgrades the machine check bank signature of patrol scrub
from SRAO to UCNA (Uncorrected, No Action required), and signal changed to
#CMCI. Just to add to the confusion, Linux does take an action (in
uc_decode_notifier()) to try to offline the page despite the UC*NA*
signature name.
### Background: why #CMCI and #MCE race when poison is consuming in
Intel platform [1]
Having decided that CMCI/UCNA is the best action for patrol scrub errors,
the memory controller uses it for reads too. But the memory controller is
executing asynchronously from the core, and can't tell the difference
between a "real" read and a speculative read. So it will do CMCI/UCNA if
an error is found in any read.
Thus:
1) Core is clever and thinks address A is needed soon, issues a
speculative read.
2) Core finds it is going to use address A soon after sending the read
request
3) The CMCI from the memory controller is in a race with MCE from the
core that will soon try to retire the load from address A.
Quite often (because speculation has got better) the CMCI from the memory
controller is delivered before the core is committed to the instruction
reading address A, so the interrupt is taken, and Linux offlines the page
(marking it as poison).
## Why user process is killed for instr case
Commit 046545a661af ("mm/hwpoison: fix error page recovered but reported
"not recovered"") tries to fix noise message "Memory error not recovered"
and skips duplicate SIGBUSs due to the race. But it also introduced a bug
that kill_accessing_process() return -EHWPOISON for instr case, as result,
kill_me_maybe() send a SIGBUS to user process.
# 4. The fix, in my opinion, should be:
- For copyin case:
The key point is whether the error context is in a read from user memory.
We do not care about the ex-type if we know its a MOV reading from
userspace.
is_copy_from_user() return true when both of the following two checks are
true:
- the current instruction is copy
- source address is user memory
If copy_user is true, we set
m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_RECOV;
Then do_machine_check() will try fixup_exception() first.
- For instr case: let kill_accessing_process() return 0 to prevent a SIGBUS.
- For patch 3:
The return value of memory_failure() is quite important while discussed
instr case regression with Tony and Miaohe for patch 2, so add comment
about the return value.
This patch (of 3):
Commit 4c132d1d844a ("x86/futex: Remove .fixup usage") introduced a new
extable fixup type, EX_TYPE_EFAULT_REG, and commit 4c132d1d844a
("x86/futex: Remove .fixup usage") updated the extable fixup type for
copy-from-user operations, changing it from EX_TYPE_UACCESS to
EX_TYPE_EFAULT_REG. The error context for copy-from-user operations no
longer functions as an in-kernel recovery context. Consequently, the
error context for copy-from-user operations no longer functions as an
in-kernel recovery context, resulting in kernel panics with the message:
"Machine check: Data load in unrecoverable area of kernel."
To address this, it is crucial to identify if an error context involves a
read operation from user memory. The function is_copy_from_user() can be
utilized to determine:
- the current operation is copy
- when reading user memory
When these conditions are met, is_copy_from_user() will return true,
confirming that it is indeed a direct copy from user memory. This check
is essential for correctly handling the context of errors in these
operations without relying on the extable fixup types that previously
allowed for in-kernel recovery.
So, use is_copy_from_user() to determine if a context is copy user directly.
Link: https://lkml.kernel.org/r/20250312112852.82415-1-xueshuai@linux.alibaba.com
Link: https://lkml.kernel.org/r/20250312112852.82415-2-xueshuai@linux.alibaba.com
Fixes: 4c132d1d844a ("x86/futex: Remove .fixup usage")
Signed-off-by: Shuai Xue <xueshuai(a)linux.alibaba.com>
Suggested-by: Peter Zijlstra <peterz(a)infradead.org>
Acked-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Tested-by: Tony Luck <tony.luck(a)intel.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Borislav Betkov <bp(a)alien8.de>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Josh Poimboeuf <jpoimboe(a)kernel.org>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: Ruidong Tian <tianruidong(a)linux.alibaba.com>
Cc: Thomas Gleinxer <tglx(a)linutronix.de>
Cc: Yazen Ghannam <yazen.ghannam(a)amd.com>
Cc: Jane Chu <jane.chu(a)oracle.com>
Cc: Jarkko Sakkinen <jarkko(a)kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index dac4d64dfb2a..2235a7477436 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -300,13 +300,12 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs)
copy_user = is_copy_from_user(regs);
instrumentation_end();
- switch (fixup_type) {
- case EX_TYPE_UACCESS:
- if (!copy_user)
- return IN_KERNEL;
- m->kflags |= MCE_IN_KERNEL_COPYIN;
- fallthrough;
+ if (copy_user) {
+ m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_RECOV;
+ return IN_KERNEL_RECOV;
+ }
+ switch (fixup_type) {
case EX_TYPE_FAULT_MCE_SAFE:
case EX_TYPE_DEFAULT_MCE_SAFE:
m->kflags |= MCE_IN_KERNEL_RECOV;
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 1a15bb8303b6b104e78028b6c68f76a0d4562134
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025040805-enlisted-regular-7bac@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1a15bb8303b6b104e78028b6c68f76a0d4562134 Mon Sep 17 00:00:00 2001
From: Shuai Xue <xueshuai(a)linux.alibaba.com>
Date: Wed, 12 Mar 2025 19:28:50 +0800
Subject: [PATCH] x86/mce: use is_copy_from_user() to determine copy-from-user
context
Patch series "mm/hwpoison: Fix regressions in memory failure handling",
v4.
## 1. What am I trying to do:
This patchset resolves two critical regressions related to memory failure
handling that have appeared in the upstream kernel since version 5.17, as
compared to 5.10 LTS.
- copyin case: poison found in user page while kernel copying from user space
- instr case: poison found while instruction fetching in user space
## 2. What is the expected outcome and why
- For copyin case:
Kernel can recover from poison found where kernel is doing get_user() or
copy_from_user() if those places get an error return and the kernel return
-EFAULT to the process instead of crashing. More specifily, MCE handler
checks the fixup handler type to decide whether an in kernel #MC can be
recovered. When EX_TYPE_UACCESS is found, the PC jumps to recovery code
specified in _ASM_EXTABLE_FAULT() and return a -EFAULT to user space.
- For instr case:
If a poison found while instruction fetching in user space, full recovery
is possible. User process takes #PF, Linux allocates a new page and fills
by reading from storage.
## 3. What actually happens and why
- For copyin case: kernel panic since v5.17
Commit 4c132d1d844a ("x86/futex: Remove .fixup usage") introduced a new
extable fixup type, EX_TYPE_EFAULT_REG, and later patches updated the
extable fixup type for copy-from-user operations, changing it from
EX_TYPE_UACCESS to EX_TYPE_EFAULT_REG. It breaks previous EX_TYPE_UACCESS
handling when posion found in get_user() or copy_from_user().
- For instr case: user process is killed by a SIGBUS signal due to #CMCI
and #MCE race
When an uncorrected memory error is consumed there is a race between the
CMCI from the memory controller reporting an uncorrected error with a UCNA
signature, and the core reporting and SRAR signature machine check when
the data is about to be consumed.
### Background: why *UN*corrected errors tied to *C*MCI in Intel platform [1]
Prior to Icelake memory controllers reported patrol scrub events that
detected a previously unseen uncorrected error in memory by signaling a
broadcast machine check with an SRAO (Software Recoverable Action
Optional) signature in the machine check bank. This was overkill because
it's not an urgent problem that no core is on the verge of consuming that
bad data. It's also found that multi SRAO UCE may cause nested MCE
interrupts and finally become an IERR.
Hence, Intel downgrades the machine check bank signature of patrol scrub
from SRAO to UCNA (Uncorrected, No Action required), and signal changed to
#CMCI. Just to add to the confusion, Linux does take an action (in
uc_decode_notifier()) to try to offline the page despite the UC*NA*
signature name.
### Background: why #CMCI and #MCE race when poison is consuming in
Intel platform [1]
Having decided that CMCI/UCNA is the best action for patrol scrub errors,
the memory controller uses it for reads too. But the memory controller is
executing asynchronously from the core, and can't tell the difference
between a "real" read and a speculative read. So it will do CMCI/UCNA if
an error is found in any read.
Thus:
1) Core is clever and thinks address A is needed soon, issues a
speculative read.
2) Core finds it is going to use address A soon after sending the read
request
3) The CMCI from the memory controller is in a race with MCE from the
core that will soon try to retire the load from address A.
Quite often (because speculation has got better) the CMCI from the memory
controller is delivered before the core is committed to the instruction
reading address A, so the interrupt is taken, and Linux offlines the page
(marking it as poison).
## Why user process is killed for instr case
Commit 046545a661af ("mm/hwpoison: fix error page recovered but reported
"not recovered"") tries to fix noise message "Memory error not recovered"
and skips duplicate SIGBUSs due to the race. But it also introduced a bug
that kill_accessing_process() return -EHWPOISON for instr case, as result,
kill_me_maybe() send a SIGBUS to user process.
# 4. The fix, in my opinion, should be:
- For copyin case:
The key point is whether the error context is in a read from user memory.
We do not care about the ex-type if we know its a MOV reading from
userspace.
is_copy_from_user() return true when both of the following two checks are
true:
- the current instruction is copy
- source address is user memory
If copy_user is true, we set
m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_RECOV;
Then do_machine_check() will try fixup_exception() first.
- For instr case: let kill_accessing_process() return 0 to prevent a SIGBUS.
- For patch 3:
The return value of memory_failure() is quite important while discussed
instr case regression with Tony and Miaohe for patch 2, so add comment
about the return value.
This patch (of 3):
Commit 4c132d1d844a ("x86/futex: Remove .fixup usage") introduced a new
extable fixup type, EX_TYPE_EFAULT_REG, and commit 4c132d1d844a
("x86/futex: Remove .fixup usage") updated the extable fixup type for
copy-from-user operations, changing it from EX_TYPE_UACCESS to
EX_TYPE_EFAULT_REG. The error context for copy-from-user operations no
longer functions as an in-kernel recovery context. Consequently, the
error context for copy-from-user operations no longer functions as an
in-kernel recovery context, resulting in kernel panics with the message:
"Machine check: Data load in unrecoverable area of kernel."
To address this, it is crucial to identify if an error context involves a
read operation from user memory. The function is_copy_from_user() can be
utilized to determine:
- the current operation is copy
- when reading user memory
When these conditions are met, is_copy_from_user() will return true,
confirming that it is indeed a direct copy from user memory. This check
is essential for correctly handling the context of errors in these
operations without relying on the extable fixup types that previously
allowed for in-kernel recovery.
So, use is_copy_from_user() to determine if a context is copy user directly.
Link: https://lkml.kernel.org/r/20250312112852.82415-1-xueshuai@linux.alibaba.com
Link: https://lkml.kernel.org/r/20250312112852.82415-2-xueshuai@linux.alibaba.com
Fixes: 4c132d1d844a ("x86/futex: Remove .fixup usage")
Signed-off-by: Shuai Xue <xueshuai(a)linux.alibaba.com>
Suggested-by: Peter Zijlstra <peterz(a)infradead.org>
Acked-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Tested-by: Tony Luck <tony.luck(a)intel.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Borislav Betkov <bp(a)alien8.de>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Josh Poimboeuf <jpoimboe(a)kernel.org>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: Ruidong Tian <tianruidong(a)linux.alibaba.com>
Cc: Thomas Gleinxer <tglx(a)linutronix.de>
Cc: Yazen Ghannam <yazen.ghannam(a)amd.com>
Cc: Jane Chu <jane.chu(a)oracle.com>
Cc: Jarkko Sakkinen <jarkko(a)kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/arch/x86/kernel/cpu/mce/severity.c b/arch/x86/kernel/cpu/mce/severity.c
index dac4d64dfb2a..2235a7477436 100644
--- a/arch/x86/kernel/cpu/mce/severity.c
+++ b/arch/x86/kernel/cpu/mce/severity.c
@@ -300,13 +300,12 @@ static noinstr int error_context(struct mce *m, struct pt_regs *regs)
copy_user = is_copy_from_user(regs);
instrumentation_end();
- switch (fixup_type) {
- case EX_TYPE_UACCESS:
- if (!copy_user)
- return IN_KERNEL;
- m->kflags |= MCE_IN_KERNEL_COPYIN;
- fallthrough;
+ if (copy_user) {
+ m->kflags |= MCE_IN_KERNEL_COPYIN | MCE_IN_KERNEL_RECOV;
+ return IN_KERNEL_RECOV;
+ }
+ switch (fixup_type) {
case EX_TYPE_FAULT_MCE_SAFE:
case EX_TYPE_DEFAULT_MCE_SAFE:
m->kflags |= MCE_IN_KERNEL_RECOV;
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 60f3caff1492e5b8616b9578c4bedb5c0a88ed14
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025040845-repeater-uninvited-9d3b@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 60f3caff1492e5b8616b9578c4bedb5c0a88ed14 Mon Sep 17 00:00:00 2001
From: Hengqi Chen <hengqi.chen(a)gmail.com>
Date: Sun, 30 Mar 2025 16:31:09 +0800
Subject: [PATCH] LoongArch: BPF: Don't override subprog's return value
The verifier test `calls: div by 0 in subprog` triggers a panic at the
ld.bu instruction. The ld.bu insn is trying to load byte from memory
address returned by the subprog. The subprog actually set the correct
address at the a5 register (dedicated register for BPF return values).
But at commit 73c359d1d356 ("LoongArch: BPF: Sign-extend return values")
we also sign extended a5 to the a0 register (return value in LoongArch).
For function call insn, we later propagate the a0 register back to a5
register. This is right for native calls but wrong for bpf2bpf calls
which expect zero-extended return value in a5 register. So only move a0
to a5 for native calls (i.e. non-BPF_PSEUDO_CALL).
Cc: stable(a)vger.kernel.org
Fixes: 73c359d1d356 ("LoongArch: BPF: Sign-extend return values")
Signed-off-by: Hengqi Chen <hengqi.chen(a)gmail.com>
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
diff --git a/arch/loongarch/net/bpf_jit.c b/arch/loongarch/net/bpf_jit.c
index a06bf89fed67..fa1500d4aa3e 100644
--- a/arch/loongarch/net/bpf_jit.c
+++ b/arch/loongarch/net/bpf_jit.c
@@ -907,7 +907,10 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, bool ext
move_addr(ctx, t1, func_addr);
emit_insn(ctx, jirl, LOONGARCH_GPR_RA, t1, 0);
- move_reg(ctx, regmap[BPF_REG_0], LOONGARCH_GPR_A0);
+
+ if (insn->src_reg != BPF_PSEUDO_CALL)
+ move_reg(ctx, regmap[BPF_REG_0], LOONGARCH_GPR_A0);
+
break;
/* tail call */
On 4/7/25 17:25, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> wifi: mac80211: remove debugfs dir for virtual monitor
>
> to the 6.13-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> wifi-mac80211-remove-debugfs-dir-for-virtual-monitor.patch
> and it can be found in the queue-6.13 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
The commit here is causing a sparse warning. Please always add commit
861d0445e72e ("wifi: mac80211: Fix sparse warning for monitor_sdata")
when you backport this patch to avoid that. (So far I got the backport
notification for 6.12 and 6.13.)
It's also no big deal when you decide to not backport this patch.
>
>
> commit 193bafe31a2a08b5631a98ecf148fa0d18e94b0d
> Author: Alexander Wetzel <Alexander(a)wetzel-home.de>
> Date: Tue Feb 4 17:42:40 2025 +0100
>
> wifi: mac80211: remove debugfs dir for virtual monitor
>
> [ Upstream commit 646262c71aca87bb66945933abe4e620796d6c5a ]
>
> Don't call ieee80211_debugfs_recreate_netdev() for virtual monitor
> interface when deleting it.
>
> The virtual monitor interface shouldn't have debugfs entries and trying
> to update them will *create* them on deletion.
>
> And when the virtual monitor interface is created/destroyed multiple
> times we'll get warnings about debugfs name conflicts.
>
> Signed-off-by: Alexander Wetzel <Alexander(a)wetzel-home.de>
> Link: https://patch.msgid.link/20250204164240.370153-1-Alexander@wetzel-home.de
> Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/net/mac80211/driver-ops.c b/net/mac80211/driver-ops.c
> index 299d38e9e8630..2fc60e1e77a55 100644
> --- a/net/mac80211/driver-ops.c
> +++ b/net/mac80211/driver-ops.c
> @@ -116,8 +116,14 @@ void drv_remove_interface(struct ieee80211_local *local,
>
> sdata->flags &= ~IEEE80211_SDATA_IN_DRIVER;
>
> - /* Remove driver debugfs entries */
> - ieee80211_debugfs_recreate_netdev(sdata, sdata->vif.valid_links);
> + /*
> + * Remove driver debugfs entries.
> + * The virtual monitor interface doesn't get a debugfs
> + * entry, so it's exempt here.
> + */
> + if (sdata != local->monitor_sdata)
> + ieee80211_debugfs_recreate_netdev(sdata,
> + sdata->vif.valid_links);
>
> trace_drv_remove_interface(local, sdata);
> local->ops->remove_interface(&local->hw, &sdata->vif);
> diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
> index 806dffa48ef92..04b3626387309 100644
> --- a/net/mac80211/iface.c
> +++ b/net/mac80211/iface.c
> @@ -1212,16 +1212,17 @@ void ieee80211_del_virtual_monitor(struct ieee80211_local *local)
> return;
> }
>
> - RCU_INIT_POINTER(local->monitor_sdata, NULL);
> - mutex_unlock(&local->iflist_mtx);
> -
> - synchronize_net();
> -
> + clear_bit(SDATA_STATE_RUNNING, &sdata->state);
> ieee80211_link_release_channel(&sdata->deflink);
>
> if (ieee80211_hw_check(&local->hw, WANT_MONITOR_VIF))
> drv_remove_interface(local, sdata);
>
> + RCU_INIT_POINTER(local->monitor_sdata, NULL);
> + mutex_unlock(&local->iflist_mtx);
> +
> + synchronize_net();
> +
> kfree(sdata);
> }
>
The sdio_read32() calls sd_read(), but does not handle the error if
sd_read() fails. This could lead to subsequent operations processing
invalid data. A proper implementation can be found in sdio_readN(),
which has an error handling for the sd_read().
Add error handling for the sd_read() to free tmpbuf and return error
code if sd_read() fails. This ensure that the memcpy() is only performed
when the read operation is successful.
Since none of the callers check for the errors, there is no need to
return the error code propagated from sd_read(). Returning SDIO_ERR_VAL32
might be a better choice, which is a specialized error code for SDIO.
Another problem of returning propagated error code is that the error
code is a s32 type value, which is not fit with the u32 type return value
of the sdio_read32().
An practical option would be to go through all the callers and add error
handling, which need to pass a pointer to u32 *val and return zero on
success or negative on failure. It is not a better choice since will cost
unnecessary effort on the error code.
The other opion is to replace sd_read() by sd_read32(), which return an
u32 type error code that can be directly used as the return value of
sdio_read32(). But, it is also a bad choice to use sd_read32() in a
alignment failed branch.
Fixes: 554c0a3abf21 ("staging: Add rtl8723bs sdio wifi driver")
Cc: stable(a)vger.kernel.org # v4.12+
Signed-off-by: Wentao Liang <vulab(a)iscas.ac.cn>
---
v7: Fix error code and add patch explanation
v6: Fix improper code to propagate error code
v5: Fix error code
v4: Add change log and fix error code
v3: Add Cc flag
v2: Change code to initialize val
drivers/staging/rtl8723bs/hal/sdio_ops.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/staging/rtl8723bs/hal/sdio_ops.c b/drivers/staging/rtl8723bs/hal/sdio_ops.c
index 21e9f1858745..d79d41727042 100644
--- a/drivers/staging/rtl8723bs/hal/sdio_ops.c
+++ b/drivers/staging/rtl8723bs/hal/sdio_ops.c
@@ -185,7 +185,12 @@ static u32 sdio_read32(struct intf_hdl *intfhdl, u32 addr)
return SDIO_ERR_VAL32;
ftaddr &= ~(u16)0x3;
- sd_read(intfhdl, ftaddr, 8, tmpbuf);
+ err = sd_read(intfhdl, ftaddr, 8, tmpbuf);
+ if (err) {
+ kfree(tmpbuf);
+ return SDIO_ERR_VAL32;
+ }
+
memcpy(&le_tmp, tmpbuf + shift, 4);
val = le32_to_cpu(le_tmp);
--
2.42.0.windows.2
In tpacpi_battery_init(), the return value of tpacpi_check_quirks() needs
to be checked. The battery should not be hooked if there is no matched
battery information in quirk table.
Add an error check and return -ENODEV immediately if the device fail
the check.
Fixes: 1a32ebb26ba9 ("platform/x86: thinkpad_acpi: Support battery quirk")
Cc: stable(a)vger.kernel.org
Signed-off-by: Wentao Liang <vulab(a)iscas.ac.cn>
---
v3: Fix error code
v2: Fix double assignment error
drivers/platform/x86/thinkpad_acpi.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c
index 2cfb2ac3f465..ab538bf53716 100644
--- a/drivers/platform/x86/thinkpad_acpi.c
+++ b/drivers/platform/x86/thinkpad_acpi.c
@@ -9974,6 +9974,8 @@ static int __init tpacpi_battery_init(struct ibm_init_struct *ibm)
tp_features.battery_force_primary = tpacpi_check_quirks(
battery_quirk_table,
ARRAY_SIZE(battery_quirk_table));
+ if (!tp_features.battery_force_primary)
+ return -ENODEV;
battery_hook_register(&battery_hook);
return 0;
--
2.42.0.windows.2
From: Alex Hung <alex.hung(a)amd.com>
[ Upstream commit 8aa2864044b9d13e95fe224f32e808afbf79ecdf ]
[WHY & HOW]
dc->links[] has max size of MAX_LINKS and NULL is return when trying to
access with out-of-bound index.
This fixes 3 OVERRUN and 1 RESOURCE_LEAK issues reported by Coverity.
Reviewed-by: Harry Wentland <harry.wentland(a)amd.com>
Acked-by: Tom Chung <chiahsuan.chung(a)amd.com>
Signed-off-by: Alex Hung <alex.hung(a)amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
[The macro MAX_LINKS is introduced by Commit 60df5628144b ("drm/amd/display:
handle invalid connector indices") after 6.10. So here we still use the
original array length MAX_PIPES * 2]
Signed-off-by: Jianqi Ren <jianqi.ren.cn(a)windriver.com>
Signed-off-by: He Zhe <zhe.he(a)windriver.com>
---
Verified the build test
---
drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c b/drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c
index f365773d5714..e9b3c1c7a931 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_link_exports.c
@@ -37,6 +37,9 @@
#include "dce/dce_i2c.h"
struct dc_link *dc_get_link_at_index(struct dc *dc, uint32_t link_index)
{
+ if (link_index >= (MAX_PIPES * 2))
+ return NULL;
+
return dc->links[link_index];
}
--
2.34.1
On destroy, we should set each node dead. But current code miss this
when the maple tree has only the root node.
The reason is mt_destroy_walk() leverage mte_destroy_descend() to set
node dead, but this is skipped since the only root node is a leaf.
Fixes this by setting the node dead if it is a leaf.
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Wei Yang <richard.weiyang(a)gmail.com>
CC: Liam R. Howlett <Liam.Howlett(a)Oracle.com>
Cc: <stable(a)vger.kernel.org>
---
v2:
* move the operation into mt_destroy_walk()
* adjust the title accordingly
---
lib/maple_tree.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 4bd5a5be1440..0696e8d1c4e9 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -5284,6 +5284,7 @@ static void mt_destroy_walk(struct maple_enode *enode, struct maple_tree *mt,
struct maple_enode *start;
if (mte_is_leaf(enode)) {
+ mte_set_node_dead(enode);
node->type = mte_node_type(enode);
goto free_leaf;
}
--
2.34.1