[Skiboot] [PATCH] opal/hmi: Fix a TOD HMI failure during a race condition.

Ananth N Mavinakayanahalli ananth at linux.vnet.ibm.com
Tue Aug 16 17:48:00 AEST 2016


On Sat, Aug 13, 2016 at 06:41:11PM +0530, Mahesh J Salgaonkar wrote:
> From: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>
> 
> Occasionally, another interrupt can wake a CPU at the 0x100 vector
> just when an HMI for a TOD error is also pending. In this rare race
> condition, if the CPU has woken up from a tb_loss power-saving mode, it
> will invoke the OPAL call to resync the TB. Since the TOD is already in
> an error state, the TB resync will time out, leaving TFMR bit 18 set to
> '1'. (TFMR[18]=1 means the TB is prepared to receive a new value from
> the TOD. Once the new value is received, this bit is reset to '0';
> otherwise the TB stays in the waiting state.) When the HMI is
> delivered, it may find that all TFMR errors are already cleared, yet it
> fails to restore the TB because TFMR bit 18 is still set. This leads to
> an HMI recovery failure, causing a kernel crash.
> 
> This patch fixes this by clearing the TB errors if TFMR[18] is set to 1.
> This makes sure that the TB is in a clean state before the TB restore
> process starts.
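
For readers following the thread, the check described above can be sketched
in C roughly as follows. This is a minimal standalone model, not the skiboot
code: the register is simulated with a plain variable, and every name here
(IBM_BIT, TFMR_WAITING_FOR_TOD, fake_tfmr, read_tfmr, clear_tb_errors,
restore_tb, recover_tb) is hypothetical. It assumes IBM bit numbering, where
bit 0 is the most significant bit of the 64-bit register, and it assumes that
clearing TB errors takes the TB out of the waiting state, as the patch
description implies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: IBM bit numbering, bit 0 is the MSB of a 64-bit register. */
#define IBM_BIT(n) (UINT64_C(1) << (63 - (n)))

/* TFMR[18] = 1: TB is waiting to receive a new value from the TOD. */
#define TFMR_WAITING_FOR_TOD IBM_BIT(18)

/* Simulated TFMR; real firmware would read and write the SPR instead. */
static uint64_t fake_tfmr;

static uint64_t read_tfmr(void)
{
	return fake_tfmr;
}

/* Hypothetical model: clearing TB errors returns the TB to a clean
 * state, which in this sketch means dropping the "waiting" bit. */
static void clear_tb_errors(void)
{
	fake_tfmr &= ~TFMR_WAITING_FOR_TOD;
}

/* Hypothetical model of TB restore: it cannot start while the TB is
 * still stuck waiting for a TOD value from the earlier failed resync. */
static bool restore_tb(void)
{
	return (read_tfmr() & TFMR_WAITING_FOR_TOD) == 0;
}

/* Recovery path with the extra check: clean up a stale TFMR[18]
 * before attempting the restore. */
bool recover_tb(void)
{
	if (read_tfmr() & TFMR_WAITING_FOR_TOD)
		clear_tb_errors();
	return restore_tb();
}
```

In this model, calling restore_tb() directly with the stale bit set fails,
while recover_tb() succeeds because it clears the bit first.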

Does this need to go into older firmware release updates if there are
any?

Ananth


