On 2023/11/23 23:07, Borislav Petkov wrote:
Hi, Borislav,
Thank you for your reply and advice.
On Sat, Oct 07, 2023 at 03:28:16PM +0800, Shuai Xue wrote:
However, this trick is not always be effective
So far so good.
What's missing here is why "this trick" is not always effective.
Basically to explain what exactly the problem is.
I think the main point is that this trick for AR error is not effective, because:
- an AR error consumed by current process is deferred to handle in a dedicated kernel thread, but memory_failure() assumes that it runs in the current context - another page fault is not unnecessary, we can send sigbus to current process in the first Synchronous External Abort SEA on arm64 (analogy Machine Check Exception on x86)
For example, hwpoison-aware user-space processes use the si_code: BUS_MCEERR_AO for 'action optional' early notifications, and BUS_MCEERR_AR for 'action required' synchronous/late notifications. Specifically, when a signal with SIGBUS_MCEERR_AR is delivered to QEMU, it will inject a vSEA to Guest kernel. In contrast, a signal with SIGBUS_MCEERR_AO will be ignored by QEMU.[1]
Fix it by seting memory failure flags as MF_ACTION_REQUIRED on synchronous events. (PATCH 1)
So you're fixing qemu by "fixing" the kernel?
This doesn't make any sense.
I just give an example that the user space process *really* relys on the si_code of signal to handle hardware errors
Make errors which are ACPI_HEST_NOTIFY_SEA type return MF_ACTION_REQUIRED so that it *happens* to fix your use case.
Sounds like a lot of nonsense to me.
What is the issue here you're trying to solve?
The SIGBUS si_codes defined in include/uapi/asm-generic/siginfo.h says:
/* hardware memory error consumed on a machine check: action required */ #define BUS_MCEERR_AR 4 /* hardware memory error detected in process but not consumed: action optional*/ #define BUS_MCEERR_AO 5
When a synchronous error is consumed by Guest, the kernel should send a signal with BUS_MCEERR_AR instead of BUS_MCEERR_AO.
- Handle memory_failure() abnormal fails to avoid a unnecessary reboot
If process mapping fault page, but memory_failure() abnormal return before try_to_unmap(), for example, the fault page process mapping is KSM page. In this case, arm64 cannot use the page fault process to terminate the synchronous exception loop.[4]
This loop can potentially exceed the platform firmware threshold or even trigger a kernel hard lockup, leading to a system reboot. However, kernel has the capability to recover from this error.
Fix it by performing a force kill when memory_failure() abnormal fails or when other abnormal synchronous errors occur.
Just like that?
Without giving the process the opportunity to even save its other data?
Exactly.
So this all is still very confusing, patches definitely need splitting and this whole thing needs restraint.
You go and do this: you split *each* issue you're addressing into a separate patch and explain it like this:
Prepare the context for the explanation briefly.
Explain the problem at hand.
"It happens because of <...>"
"Fix it by doing X"
"(Potentially do Y)."
and each patch explains *exactly* *one* issue, what happens, why it happens and just the fix for it and *why* it is needed.
Otherwise, this is unreviewable.
Thank you for your valuable suggestion, I will split the patches and resubmit a new patch set.
Thx.
Best Regards, Shuai