[Skiboot] [PATCH] opal/hmi: Fix a TOD HMI failure during a race condition.

Mahesh Jagannath Salgaonkar mahesh at linux.vnet.ibm.com
Tue Aug 16 20:19:36 AEST 2016


On 08/16/2016 01:18 PM, Ananth N Mavinakayanahalli wrote:
> On Sat, Aug 13, 2016 at 06:41:11PM +0530, Mahesh J Salgaonkar wrote:
>> From: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>
>>
>> Another interrupt can occasionally wake a CPU at the 0x100 vector just
>> when an HMI for a TOD error is also pending. In this rare race
>> condition, if the CPU has woken up from a power saving mode with
>> tb_loss, it will invoke an OPAL call to resync the TB. Since the TOD is
>> already in an error state, the TB resync will time out, leaving TFMR
>> bit 18 set to '1'. (TFMR[18]=1 means the TB is prepared to receive a new
>> value from the TOD. Once the new value is received this bit is reset to
>> '0'; otherwise the TB stays in the waiting state.) When the HMI is
>> delivered, it may find that all TFMR errors are already cleared, but it
>> would fail to restore the TB since TFMR bit 18 is already set. This
>> leads to an HMI recovery failure, causing a kernel crash.
>>
>> This patch fixes this by clearing TB errors if TFMR[18] is set to 1.
>> This makes sure that the TB is in a clean state before the TB restore
>> process starts.
> 
> Does this need to go into older firmware release updates if there are
> any?

Yes, it should go in as an update for FW840 and above.
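
For anyone following along, the ordering described in the quoted patch
description can be modelled roughly as below. This is only a sketch under
stated assumptions, not the actual skiboot change: the TFMR is simulated
with a plain variable, and every name here (IBM_BIT, TFMR_TB_WAITING_FOR_TOD,
sim_*, recover_tb) is hypothetical and chosen for illustration. It assumes
IBM bit numbering, where bit 0 is the most significant bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IBM bit numbering (assumed): bit 0 is the MSB of a 64-bit register. */
#define IBM_BIT(n) (1ULL << (63 - (n)))

/* Hypothetical name for TFMR[18]: TB is waiting for a new value from TOD. */
#define TFMR_TB_WAITING_FOR_TOD IBM_BIT(18)

/* Simulated TFMR; real firmware would access the SPR instead. */
static uint64_t sim_tfmr;

/* Model of a TB resync against a failed TOD: it times out, bit 18 stays set. */
static void sim_resync_timeout(void)
{
	sim_tfmr |= TFMR_TB_WAITING_FOR_TOD;
}

/* Model of clearing TB errors, which puts the TB back in a clean state. */
static void sim_clear_tb_errors(void)
{
	sim_tfmr &= ~TFMR_TB_WAITING_FOR_TOD;
}

/* Model of TB restore: it cannot succeed if bit 18 is already set. */
static bool sim_restore_tb(void)
{
	return (sim_tfmr & TFMR_TB_WAITING_FOR_TOD) == 0;
}

/* HMI recovery path; apply_fix selects the behaviour with the extra check. */
static bool recover_tb(bool apply_fix)
{
	if (apply_fix && (sim_tfmr & TFMR_TB_WAITING_FOR_TOD))
		sim_clear_tb_errors();
	return sim_restore_tb();
}

/* Replays the race: the resync times out first, then the HMI is handled. */
static void replay_race(void)
{
	sim_tfmr = 0;
	sim_resync_timeout();
	assert(!recover_tb(false));	/* stale bit 18 blocks the restore */
	assert(recover_tb(true));	/* cleanup first, then restore works */
}
```

Calling replay_race() walks through both cases: without the check, the
stale bit 18 makes the restore fail; with it, the TB is cleaned up first
and the restore goes through.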

Thanks,
-Mahesh.


