On 1/18/2016 11:08 AM, Borislav Petkov wrote:
> On Mon, Jan 18, 2016 at 10:08:00AM -0500, Abdulhamid, Harb wrote:
>> Here is my crack at massaging the language a bit more: "Under
>> normal circumstances, when a hardware error occurs, the kernel gets
>> notified via an NMI, MCE or some other method. When the error has a
>> fatal severity or is unrecoverable, the kernel would normally
>> panic.
> 
> So this is still not exact. It all depends on what the hardware
> does. Even more importantly, does the hardware even run the error
> handler and let it access MCA banks to find out about the error, or
> does it directly warm-reset the system?
> 
> The error can happen, it is critical, *nothing* might be visible in
> the MCA registers (this is x86-specific) and the machine would reset.
> Only when you warm-reset, you may or may not see anything in there.
> 
> In reading the BERT explanation in the ACPI spec, I have to say, it 
> sounds pretty ok to me:
> 
> "18.3.1 Boot Error Source
> 
> Under normal circumstances, when a hardware error occurs, the error 
> handler receives control and processes the error. This gives OSPM a 
> chance to process the error condition, report it, and optionally
> attempt recovery. In some cases, the system is unable to process an
> error. For example, system firmware or a management controller may
> choose to reset the system or the system might experience an
> uncontrolled crash or reset. The boot error source is used to report
> unhandled errors that occurred in a previous boot. This mechanism is
> described in the BERT table."
> 
> I think we should take that text. :)
Agreed. That would be best.

Apologies for the format of the last email.

-- 
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center,
Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a
Linux Foundation Collaborative Project