[Skiboot] [PATCH] opal/hmi: Fix a TOD HMI failure during a race condition.

Stewart Smith stewart at linux.vnet.ibm.com
Thu Aug 25 19:03:48 AEST 2016


Mahesh J Salgaonkar <mahesh at linux.vnet.ibm.com> writes:
> From: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>
>
> There is a chance that another interrupt wakes a CPU at the 0x100
> vector just when an HMI for a TOD error is also pending. In this rare
> race, if the CPU has woken up from a tb_loss power saving mode, it will
> invoke the OPAL call to resync the TB. Since the TOD is already in an
> error state, the TB resync will time out, leaving TFMR bit 18 set to '1'.
> (TFMR[18]=1 means the TB is prepared to receive a new value from the TOD.
> Once the new value is received, this bit is reset to '0'; otherwise the
> TB stays in the waiting state.) When the HMI is delivered, the handler
> may find that all TFMR errors are already cleared, but it would fail to
> restore the TB because TFMR bit 18 is still set. This leads to an HMI
> recovery failure, causing a kernel crash.
>
> This patch fixes the problem by clearing TB errors if TFMR[18] is set
> to 1. This ensures that the TB is in a clean state before the TB restore
> process starts.
>
> Signed-off-by: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>
> ---
>  hw/chiptod.c |    7 +++++++
>  1 file changed, 7 insertions(+)
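
For anyone reading this in the archive, the check described in the commit
message can be sketched roughly as below. This is a stand-alone simulation
and not the hw/chiptod.c change itself: sim_tfmr, sim_clear_tb_errors(),
sim_restore_tb() and recover_tb() are hypothetical names, and the
assumption that clearing TB errors also takes the TB out of the waiting
state is taken from the commit message rather than from the hardware
documentation. Only the meaning of TFMR[18] comes from the patch text.

```c
#include <stdbool.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit value. */
#define PPC_BIT(n)		(1ULL << (63 - (n)))

/* TFMR[18] = 1: TB is waiting to receive a new value from the TOD. */
#define TFMR_TB_WAITING		PPC_BIT(18)

/* Hypothetical stand-in for the real TFMR register. */
static uint64_t sim_tfmr;

/*
 * Hypothetical model of "clear TB errors". Assumption: this also drops
 * TFMR[18], so the TB is no longer stuck waiting for a TOD value.
 */
static void sim_clear_tb_errors(void)
{
	sim_tfmr &= ~TFMR_TB_WAITING;
}

/*
 * Hypothetical model of the TB restore step: it cannot succeed while the
 * TB is still waiting from an earlier resync that timed out.
 */
static bool sim_restore_tb(void)
{
	if (sim_tfmr & TFMR_TB_WAITING)
		return false;
	/* Real code would request a TOD to TB move here and wait for it. */
	return true;
}

/*
 * Recovery path. With apply_fix set, a stale TFMR[18] is cleaned up
 * before the restore starts; without it, the restore is attempted as-is.
 */
static bool recover_tb(bool apply_fix)
{
	if (apply_fix && (sim_tfmr & TFMR_TB_WAITING))
		sim_clear_tb_errors();

	return sim_restore_tb();
}
```

Setting sim_tfmr = TFMR_TB_WAITING models the state left behind by the
timed-out resync. In this model recover_tb(false) then fails, while
recover_tb(true) succeeds and leaves TFMR[18] clear.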

Thanks,

merged to:
026b9a1  master
bb18811  skiboot-5.1.x
0abc875  skiboot-5.3.x


-- 
Stewart Smith
OPAL Architect, IBM.


