[OpenPower-Firmware] poor correctable MC errors logging
Mahesh Jagannath Salgaonkar
mahesh at linux.vnet.ibm.com
Mon Mar 19 15:56:05 AEDT 2018
On 03/16/2018 10:09 PM, Sergey Kachkin wrote:
> Hi Mahesh,
>
> thanks for your reply.
>
>> We can improve it to print CPU pir number. Do you also want location code
>> info there ?
>
> Yes, I would prefer as much info as possible that may help to distinguish
> one MC problem from another and isolate the root cause. So adding more
> details would be beneficial.
> Am I correct that CPU numbers etc will be printed for other similar
> recoverable errors also?

Yes. If we print the CPU PIR info, it will be printed for all error types.

> I have never seen anything except ERAT but
> wondering if it could be also SLB / TLB multihit etc.
>
>
> regards,
> Sergey
> YADRO
>
>
> On Fri, Mar 16, 2018 at 6:52 PM, Mahesh Jagannath Salgaonkar <
> mahesh at linux.vnet.ibm.com> wrote:
>
>> On 03/14/2018 07:05 PM, Sergey Kachkin wrote:
>>> Hi,
>>>
>>> recently there have been a number of HMI logging improvements which may
>>> help to isolate the source of HMI errors, but troubleshooting MCs like
>>> the ones below is also challenging.
>>> Can we have additional logging for MCs as well?
>>
>> We can improve it to print CPU pir number. Do you also want location
>> code info there ?
>>
>>>
>>> Feb 15 02:56:33 host kernel: Severe Machine check interrupt [Recovered]
>>> Feb 15 02:56:33 host kernel: Initiator: CPU
>>> Feb 15 02:56:33 host kernel: Error type: ERAT [Multihit]
>>> Feb 15 02:56:33 host kernel: Effective address: c00003eefc12f018
>>> Feb 15 03:04:19 host kernel: Severe Machine check interrupt [Recovered]
>>> Feb 15 03:04:19 host kernel: Initiator: CPU
>>> Feb 15 03:04:19 host kernel: Error type: ERAT [Multihit]
>>> Feb 15 03:04:19 host kernel: Effective address: c00003eefc12f018
>>>
>>>
>>>
>>> * [282d5fee5c4f](https://github.com/open-power/skiboot/commit/282d5fee5c4f)
>>>   core/hmi: Use pr_fmt macro for tagging log messages
>>> * [c531ff957669](https://github.com/open-power/skiboot/commit/c531ff957669)
>>>   opal/hmi: HMI logging with location code info.
>>> * [b33ed1e6b6b0](https://github.com/open-power/skiboot/commit/b33ed1e6b6b0)
>>>   core/hmi: Do not display FIR details if none of the bits are set.
>>> * [45a961515be6](https://github.com/open-power/skiboot/commit/45a961515be6)
>>>   core/hmi: Display chip location code while displaying core FIR.
>>>
>>>
>>> thanks,
>>>
>>> regards,
>>> Sergey
>>> YADRO
>>>
>>>
>>>
>>> _______________________________________________
>>> OpenPower-Firmware mailing list
>>> OpenPower-Firmware at lists.ozlabs.org
>>> https://lists.ozlabs.org/listinfo/openpower-firmware
>>>
>>
>>
>
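
Until the kernel prints the CPU PIR with each event, recurring events in the
log quoted above can at least be grouped by error type and effective address.
Below is a minimal sketch in Python, assuming syslog-style lines in the same
format as the quoted excerpt. The function names parse_mce_events and
count_by_type_and_address are hypothetical helpers for illustration only;
they are not part of the kernel or skiboot.

```python
import re
from collections import Counter

# Sample copied from the log excerpt quoted in this thread.
SAMPLE_LOG = """\
Feb 15 02:56:33 host kernel: Severe Machine check interrupt [Recovered]
Feb 15 02:56:33 host kernel: Initiator: CPU
Feb 15 02:56:33 host kernel: Error type: ERAT [Multihit]
Feb 15 02:56:33 host kernel: Effective address: c00003eefc12f018
Feb 15 03:04:19 host kernel: Severe Machine check interrupt [Recovered]
Feb 15 03:04:19 host kernel: Initiator: CPU
Feb 15 03:04:19 host kernel: Error type: ERAT [Multihit]
Feb 15 03:04:19 host kernel: Effective address: c00003eefc12f018
"""

# Assumed line layout: "<Mon> <day> <hh:mm:ss> <host> kernel: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ [\d:]{8}) \S+ kernel:\s*(?P<msg>.*)$"
)


def parse_mce_events(text):
    """Group kernel log lines into one dict per machine check event."""
    events = []
    current = None
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        msg = m.group("msg").strip()
        if "Machine check interrupt" in msg:
            # A summary line starts a new event.
            current = {"time": m.group("ts"), "summary": msg}
            events.append(current)
        elif current is not None and ":" in msg:
            # Detail lines look like "Key: value".
            key, _, value = msg.partition(":")
            current[key.strip()] = value.strip()
    return events


def count_by_type_and_address(events):
    """Count events per (error type, effective address) pair."""
    return Counter(
        (e.get("Error type"), e.get("Effective address")) for e in events
    )


if __name__ == "__main__":
    for key, n in count_by_type_and_address(parse_mce_events(SAMPLE_LOG)).items():
        print(key, n)
```

With the excerpt above, both events fall into the same (error type, effective
address) group, which is the kind of repetition that a CPU PIR or location
code in the log would help narrow down further.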