From: Yazen Ghannam <yazen.ghannam@amd.com>
Always call kill_me_maybe() in order to attempt memory recovery. This ensures that any memory associated with the error is properly marked as poison.
This is needed for errors that occur on memory, but that do not have MCG_STATUS[RIPV] set. One example is data poison consumption through the instruction fetch units on AMD Zen-based systems.
The MF_MUST_KILL flag is passed to memory_failure() when MCG_STATUS[RIPV] is not set. So the associated process will still be killed.
Cc: stable@vger.kernel.org
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
 arch/x86/kernel/cpu/mce/core.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 308fb644b94a..9040d45ed997 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1285,10 +1285,7 @@ static void queue_task_work(struct mce *m, int kill_current_task)
 	current->mce_ripv = !!(m->mcgstatus & MCG_STATUS_RIPV);
 	current->mce_whole_page = whole_page(m);
 
-	if (kill_current_task)
-		current->mce_kill_me.func = kill_me_now;
-	else
-		current->mce_kill_me.func = kill_me_maybe;
+	current->mce_kill_me.func = kill_me_maybe;
 
 	task_work_add(current, &current->mce_kill_me, TWA_RESUME);
 }
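The commit message's claim that the process is still killed when MCG_STATUS[RIPV] is clear can be illustrated with a small userspace model. This is only a sketch: model_mf_flags() is a hypothetical helper, not a kernel function, and the use of MF_ACTION_REQUIRED as the base flag is an assumption about what kill_me_maybe() passes. The constants are redefined locally so the block compiles outside the kernel tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local definitions for this userspace model; the kernel has its own. */
#define MCG_STATUS_RIPV     (1ULL << 0)  /* restart IP valid */
#define MF_ACTION_REQUIRED  (1 << 1)     /* assumed base flag */
#define MF_MUST_KILL        (1 << 2)

/*
 * Hypothetical helper: compute the flags that would be handed to
 * memory_failure() for a given MCG_STATUS value.
 */
static inline int model_mf_flags(uint64_t mcgstatus)
{
	bool ripv = !!(mcgstatus & MCG_STATUS_RIPV);
	int flags = MF_ACTION_REQUIRED;

	/* Without a valid restart IP the task cannot safely resume. */
	if (!ripv)
		flags |= MF_MUST_KILL;

	return flags;
}
```

In this model, model_mf_flags(0) carries MF_MUST_KILL while model_mf_flags(MCG_STATUS_RIPV) does not, which is why routing every case through kill_me_maybe() does not weaken the kill behaviour.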
On Tue, May 04, 2021 at 05:47:12PM +0000, Yazen Ghannam wrote:
> From: Yazen Ghannam <yazen.ghannam@amd.com>
>
> Always call kill_me_maybe() in order to attempt memory recovery. This ensures that any memory associated with the error is properly marked as poison.
>
> This is needed for errors that occur on memory, but that do not have MCG_STATUS[RIPV] set. One example is data poison consumption through the instruction fetch units on AMD Zen-based systems.
>
> The MF_MUST_KILL flag is passed to memory_failure() when MCG_STATUS[RIPV] is not set. So the associated process will still be killed.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
> ---
>  arch/x86/kernel/cpu/mce/core.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 308fb644b94a..9040d45ed997 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -1285,10 +1285,7 @@ static void queue_task_work(struct mce *m, int kill_current_task)
>  	current->mce_ripv = !!(m->mcgstatus & MCG_STATUS_RIPV);
>  	current->mce_whole_page = whole_page(m);
>  
> -	if (kill_current_task)
> -		current->mce_kill_me.func = kill_me_now;
> -	else
> -		current->mce_kill_me.func = kill_me_maybe;
> +	current->mce_kill_me.func = kill_me_maybe;
>  
>  	task_work_add(current, &current->mce_kill_me, TWA_RESUME);
>  }
Could we just get rid of kill_me_now() at the same time? It's only one line, and with this change only called in one place (from kill_me_maybe()) ... just put the force_sig(SIGBUS); inline?
-Tony
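The suggestion above, folding the one-line kill_me_now() into its only remaining caller, might look roughly like the following userspace model. It is a simplified sketch, not the kernel function: model_kill_me_maybe(), enum model_outcome, and the recovered parameter are hypothetical stand-ins, and the signal delivery is represented by a return value rather than a real force_sig(SIGBUS) call.

```c
#include <stdbool.h>

/* Hypothetical outcomes for this userspace model. */
enum model_outcome {
	MODEL_RECOVERED,    /* page was poisoned, task may continue */
	MODEL_SIGBUS_SENT   /* task receives SIGBUS */
};

/*
 * Hypothetical model of kill_me_maybe() after the cleanup.
 * 'recovered' stands in for memory_failure() having succeeded.
 */
static inline enum model_outcome model_kill_me_maybe(bool recovered)
{
	if (recovered)
		return MODEL_RECOVERED;

	/*
	 * Before the cleanup this path called kill_me_now(cb), whose
	 * body was a single force_sig(SIGBUS). With only one caller
	 * left, that statement can sit here directly.
	 */
	return MODEL_SIGBUS_SENT;
}
```

The behaviour is unchanged in this model; the only difference is one fewer helper to read through.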
On Tue, May 04, 2021 at 11:07:34AM -0700, Luck, Tony wrote:
> On Tue, May 04, 2021 at 05:47:12PM +0000, Yazen Ghannam wrote:
...
> > diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> > index 308fb644b94a..9040d45ed997 100644
> > --- a/arch/x86/kernel/cpu/mce/core.c
> > +++ b/arch/x86/kernel/cpu/mce/core.c
> > @@ -1285,10 +1285,7 @@ static void queue_task_work(struct mce *m, int kill_current_task)
> >  	current->mce_ripv = !!(m->mcgstatus & MCG_STATUS_RIPV);
> >  	current->mce_whole_page = whole_page(m);
> >  
> > -	if (kill_current_task)
> > -		current->mce_kill_me.func = kill_me_now;
> > -	else
> > -		current->mce_kill_me.func = kill_me_maybe;
> > +	current->mce_kill_me.func = kill_me_maybe;
> >  
> >  	task_work_add(current, &current->mce_kill_me, TWA_RESUME);
> >  }
>
> Could we just get rid of kill_me_now() at the same time? It's only one line, and with this change only called in one place (from kill_me_maybe()) ... just put the force_sig(SIGBUS); inline?
Okay, will do.
Thanks, Yazen
linux-stable-mirror@lists.linaro.org