[Skiboot] [PATCH v2] opal/hmi: Fix the soft lockup issue on HMI for certain TB errors.

Stewart Smith stewart at linux.vnet.ibm.com
Wed Oct 14 18:10:15 AEDT 2015


Mahesh J Salgaonkar <mahesh at linux.vnet.ibm.com> writes:
> diff --git a/core/hmi.c b/core/hmi.c
> index ee556fc..aeeabe8 100644
> --- a/core/hmi.c
> +++ b/core/hmi.c
> @@ -164,6 +164,9 @@
>   */
>  #define NX_HMI_ACTIVE		PPC_BIT(54)
>  
> +/* Number of iterations for the various timeouts */
> +#define TIMEOUT_LOOPS		20000000
> +
>  static const struct core_xstop_bit_info {
>  	uint8_t bit;		/* CORE FIR bit number */
>  	enum OpalHMI_CoreXstopReason reason;
> @@ -448,8 +451,21 @@ static int decode_malfunction(struct OpalHMIEvent *hmi_evt)
>  
>  static void wait_for_subcore_threads(void)
>  {
> -	while (!(*(this_cpu()->core_hmi_state_ptr) & HMI_STATE_CLEANUP_DONE))
> +	uint64_t timeout = 0;
> +
> +	while (!(*(this_cpu()->core_hmi_state_ptr) & HMI_STATE_CLEANUP_DONE)) {
> +		if (++timeout >= (TIMEOUT_LOOPS*3)) {

Summarising discussion on IRC:
- we can't use timebase here to do a wait as TB may not actually work
  anymore
- so a loop is best
- I'll add a comment when merging so I don't ask the same question in 8
  months' time when I wonder why we have a number of loops rather than
  looking at timebase.
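
For anyone reading the archive later, the points above can be sketched
as follows. This is only an illustration of an iteration-bounded wait,
not the code that was merged: `wait_for_flag`, `FLAG_DONE` and the
`max_loops` parameter are hypothetical names standing in for the real
HMI state pointer, HMI_STATE_CLEANUP_DONE and TIMEOUT_LOOPS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit, standing in for HMI_STATE_CLEANUP_DONE. */
#define FLAG_DONE	0x1u

/*
 * Spin until FLAG_DONE is set in *state, or until max_loops iterations
 * have passed. The bound is an iteration count rather than a timebase
 * delta, so the loop still terminates if the timebase has stopped.
 * Returns true if the flag was seen, false on timeout.
 */
static bool wait_for_flag(const volatile uint32_t *state, uint64_t max_loops)
{
	uint64_t loops = 0;

	while (!(*state & FLAG_DONE)) {
		if (++loops >= max_loops)
			return false;
	}
	return true;
}
```

The trade-off is that the wall-clock length of the timeout depends on
how fast the core spins, which is presumably acceptable when the
alternative is a clock that may no longer tick.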

As discussed on IRC, this should also head to stable.

Merged into stable as of a47b98e, which will head towards what becomes
skiboot-5.1.8.

Merged into master as of a0e385b.


