[Skiboot] [PATCH V2] FSP/ELOG: elog_enable flag should be false by default

Vasant Hegde hegdevasant at linux.vnet.ibm.com
Sat Aug 6 00:40:28 AEST 2016


On 08/05/2016 04:27 PM, Mukesh Ojha wrote:
> This issue is one of the corner case, which is related to recent change
> went upstream and only observed in the petitboot prompt, where we see
> only one error log instead of getting all error log in
> /sys/firmware/opal/elog.
>
> Below is snippet of the code, where elog module in the kernel
> initialised.
>
> {
> 	..
> 	...
> 	rc = request_threaded_irq(irq, NULL, elog_event,		=<=======
> 			IRQF_TRIGGER_HIGH | IRQF_ONESHOT, "opal-elog", NULL);	|
> 	if (rc) {								|	
> 		pr_err("%s: Can't request OPAL event irq (%d)\n",		|
> 			__func__, rc);						|
> 		return rc;							|
> 	}									|
> 	/* We are now ready to pull error logs from opal. */			|
> 	if (opal_check_token(OPAL_ELOG_RESEND))					|
> 		opal_resend_pending_logs();				=<=======
> }
> 	
> Scenario:
> While elog_enabled is true, OPAL_EVENT_ERROR_LOG_AVAIL will be set from
> OPAL, whenever it has error logs that are waiting to be fetched from the
> kernel.
>
> Race occurs between the code arrowed above, as soon as kernel registers
> error log handler, it sees OPAL_EVENT_ERROR_LOG_AVAIL is set, so it
> schedule the handler. Which makes 'opal_get_elog_size'(kernel) call on
> the error log set the state from ELOG_STATE_FETCHED_DATA to
> ELOG_STATE_FETCHED_INFO and clears OPAL_EVENT_ERROR_LOG_AVAIL. During
> the same time 'opal_resend_pending_logs'(kernel) call which will set the
> state machine from ELOG_STATE_FETCHED_INFO to ELOG_STATE_NONE in OPAL.
> Because of that, read call from the kernel, which was to be made after
> the 'opal_get_elog_size' ends up failing. But, the elog kobject was
> created for the particular error log.
>
> Further in the resend routine in the OPAL, we make opal_commit_elog_in_host()
> call that sets OPAL_EVENT_ERROR_LOG_AVAIL. So, Kernel again makes
> 'opal_get_elog_size' which results in getting the error log info of the
> same error log which was fetched earlier. It also changes the state
> machine to ELOG_STATE_FETCHED_INFO and clears OPAL_EVENT_ERROR_LOG_AVAIL.
>
>
> Below is the snippet from the elog_event registered handler call
> {
> 	...
> 	...
>
> 	/* we may get notified twice, let's handle
> 	 * that gracefully and not create two conflicting
> 	 * entries.
> 	 */
> 	if (kset_find_obj(elog_kset, name))
> 		return IRQ_HANDLED;
> 	...
> 	...
> }
> 	
> In the kernel, we search kobject for the error log whether it already
> exist. So kobject is found and it returns without reading error log
> data.
>
> So, this patch makes the flag which was true during initialisation
> to false. And that solves the race.
>
> Signed-off-by: Mukesh Ojha <mukesh02 at linux.vnet.ibm.com>

Patch looks good.

Reviewed-by: Vasant Hegde <hegdevasant at linux.vnet.ibm.com>

@Stewart,
   With this patch we make sure OPAL doesn't enable notification until host is 
ready to accept error logs.


-Vasant

> ---
> Changes in V2:
>   - As per the Vasant's comment offline, it moves the 'elog_enabled = false' to
>     while defining elog_enabled.
>
>   hw/fsp/fsp-elog-read.c | 4 +---
>   1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/hw/fsp/fsp-elog-read.c b/hw/fsp/fsp-elog-read.c
> index 4f0bf16..a980281 100644
> --- a/hw/fsp/fsp-elog-read.c
> +++ b/hw/fsp/fsp-elog-read.c
> @@ -88,7 +88,7 @@ static uint32_t elog_read_retries;	/* bad response status count */
>   /* Initialize the state of the log */
>   static enum elog_head_state elog_read_from_fsp_head_state = ELOG_STATE_NONE;
>
> -static bool elog_enabled;
> +static bool elog_enabled = false;
>
>   /* Need forward declaration because of Circular dependency */
>   static void fsp_elog_queue_fetch(void);
> @@ -612,8 +612,6 @@ void fsp_elog_read_init(void)
>   	if (val != 0)
>   		return;
>
> -	elog_enabled = true;
> -
>   	/* register Eror log Class D2 */
>   	fsp_register_client(&fsp_get_elog_notify, FSP_MCLASS_ERR_LOG);
>



More information about the Skiboot mailing list