[Skiboot] [PATCH v2] opal: Do not overwrite same HMI event for multiple HMI errors.
Mahesh Jagannath Salgaonkar
mahesh at linux.vnet.ibm.com
Thu Feb 12 16:07:33 AEDT 2015
On 02/12/2015 10:18 AM, Stewart Smith wrote:
> Mahesh J Salgaonkar <mahesh at linux.vnet.ibm.com> writes:
>> The current implementation overwrites the same HMI event if there are
>> multiple HMI errors reported through a single HMI interrupt. This patch
>> fixes that issue by sending separate HMI event per error.
>
> Fundamentally looks like a good idea, couple of small notes that'd be
> good to fix before merging:
>
>> @@ -247,8 +277,10 @@ int handle_hmi_exception(uint64_t hmer, struct OpalHMIEvent *hmi_evt)
>> if (hmer & SPR_HMER_MALFUNCTION_ALERT) {
>> hmer &= ~SPR_HMER_MALFUNCTION_ALERT;
>>
>> - if (hmi_evt)
>> + if (hmi_evt) {
>> recover = decode_malfunction(hmi_evt);
>> + queue_hmi_event(hmi_evt, recover);
>> + }
>> }
>>
>> /* Assert if we see Hypervisor resource error, we can not continue. */
>> @@ -257,6 +289,7 @@ int handle_hmi_exception(uint64_t hmer, struct OpalHMIEvent *hmi_evt)
>> if (hmi_evt) {
>> hmi_evt->severity = OpalHMI_SEV_FATAL;
>> hmi_evt->type = OpalHMI_ERROR_HYP_RESOURCE;
>> + queue_hmi_event(hmi_evt, recover);
>> }
>> recover = 0;
>
> (and actually pasting in the code so it's a bit easier to see)
>
> /* Assert if we see malfunction alert, we can not continue. */
> if (hmer & SPR_HMER_MALFUNCTION_ALERT) {
> hmer &= ~SPR_HMER_MALFUNCTION_ALERT;
>
> if (hmi_evt) {
> recover = decode_malfunction(hmi_evt);
> queue_hmi_event(hmi_evt, recover);
> }
> }
>
> /* Assert if we see Hypervisor resource error, we can not continue. */
> if (hmer & SPR_HMER_HYP_RESOURCE_ERR) {
> hmer &= ~SPR_HMER_HYP_RESOURCE_ERR;
> if (hmi_evt) {
> hmi_evt->severity = OpalHMI_SEV_FATAL;
> hmi_evt->type = OpalHMI_ERROR_HYP_RESOURCE;
> queue_hmi_event(hmi_evt, recover);
> }
> recover = 0;
> }
>
> It looks as though if there's both malfunction alert and hyp resource
> err in one, recover for the second will be set based on the first?
Nice catch. I think I should move 'recover = 0' before "if (hmi_evt)"
statement.
>
> Perhaps better to explicitly pass OpalHMI_DISPOSITION_(NOT_)RECOVERED to
> queue_hmi_event rather than rely on 1/0 ?
>
The recover variable can carry three values -1, 0 and 1 and
queue_hmi_event() makes decision based on these three values.
Will send out v3 with the fix.
Thanks,
-Mahesh.
More information about the Skiboot
mailing list