[Skiboot] [PATCH] opal/hmi: Fix a TOD HMI failure during a race condition.
Stewart Smith
stewart at linux.vnet.ibm.com
Thu Aug 25 19:03:48 AEST 2016
Mahesh J Salgaonkar <mahesh at linux.vnet.ibm.com> writes:
> From: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>
>
> In rare cases another interrupt can wake a CPU at the 0x100 vector
> while an HMI for a TOD error is still pending. If the CPU has woken
> from a power-saving mode with timebase loss (tb_loss), it will make an
> OPAL call to resync the TB. Since the TOD is already in an error state,
> the TB resync will time out, leaving TFMR bit 18 set to '1'. (TFMR[18]=1
> means the TB is prepared to receive a new value from the TOD. Once the
> new value is received this bit is reset to '0'; otherwise the TB stays
> in the waiting state.) When the HMI is delivered, it may find that all
> TFMR errors are already cleared, but it will fail to restore the TB
> because TFMR bit 18 is still set. This leads to an HMI recovery
> failure, causing a kernel crash.
>
> This patch fixes the problem by clearing TB errors if TFMR[18] is set
> to 1. This ensures that the TB is in a clean state before the TB
> restore process starts.
>
> Signed-off-by: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>
> ---
> hw/chiptod.c | 7 +++++++
> 1 file changed, 7 insertions(+)
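
For anyone reading the archive without the diff to hand, here is a minimal sketch of the logic the description above walks through. It is a toy model, not the actual hw/chiptod.c change: the register is a plain variable, and every name in it (sim_tfmr, sim_clear_tb_errors, sim_restore_tb, sim_recover_tb, TFMR_TB_WAITING_FOR_TOD) is hypothetical. The only detail taken from the patch description is that TFMR bit 18 (IBM bit numbering) marks the TB as waiting for a value from the TOD. The assumption that clearing TB errors takes the TB out of that waiting state is what the description implies, modelled here in the simplest way possible.

```c
#include <stdbool.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit value. */
#define PPC_BIT(n) (1ULL << (63 - (n)))

/* TFMR[18]: TB is waiting to receive a new value from the TOD
 * (per the patch description; the macro name is hypothetical). */
#define TFMR_TB_WAITING_FOR_TOD PPC_BIT(18)

/* Toy stand-in for the real TFMR register, for illustration only. */
static uint64_t sim_tfmr;

/* Hypothetical helper: models "clear TB errors". In this toy model it
 * simply takes the TB out of the waiting state. */
static void sim_clear_tb_errors(void)
{
    sim_tfmr &= ~TFMR_TB_WAITING_FOR_TOD;
}

/* Hypothetical helper: models the TB restore step, which cannot succeed
 * while the TB is still waiting on an earlier, timed-out resync. */
static bool sim_restore_tb(void)
{
    return (sim_tfmr & TFMR_TB_WAITING_FOR_TOD) == 0;
}

/* Recovery path. With apply_fix set, the TB is brought to a clean state
 * before the restore is attempted, as the patch description proposes. */
static bool sim_recover_tb(bool apply_fix)
{
    if (apply_fix && (sim_tfmr & TFMR_TB_WAITING_FOR_TOD))
        sim_clear_tb_errors();
    return sim_restore_tb();
}
```

Starting from sim_tfmr = TFMR_TB_WAITING_FOR_TOD (the state left behind by the timed-out resync), sim_recover_tb(false) fails while sim_recover_tb(true) succeeds, which is the difference the patch is meant to make.
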
Thanks,
merged to:
026b9a1 master
bb18811 skiboot-5.1.x
0abc875 skiboot-5.3.x
--
Stewart Smith
OPAL Architect, IBM.