[Skiboot] [PATCH V2] FSP/ELOG: elog_enable flag should be false by default
Vasant Hegde
hegdevasant at linux.vnet.ibm.com
Sat Aug 6 00:40:28 AEST 2016
On 08/05/2016 04:27 PM, Mukesh Ojha wrote:
> This issue is one of the corner case, which is related to recent change
> went upstream and only observed in the petitboot prompt, where we see
> only one error log instead of getting all error log in
> /sys/firmware/opal/elog.
>
> Below is snippet of the code, where elog module in the kernel
> initialised.
>
> {
> ..
> ...
> rc = request_threaded_irq(irq, NULL, elog_event, =<=======
> IRQF_TRIGGER_HIGH | IRQF_ONESHOT, "opal-elog", NULL); |
> if (rc) { |
> pr_err("%s: Can't request OPAL event irq (%d)\n", |
> __func__, rc); |
> return rc; |
> } |
> /* We are now ready to pull error logs from opal. */ |
> if (opal_check_token(OPAL_ELOG_RESEND)) |
> opal_resend_pending_logs(); =<=======
> }
>
> Scenario:
> While elog_enabled is true, OPAL_EVENT_ERROR_LOG_AVAIL will be set from
> OPAL, whenever it has error logs that are waiting to be fetched from the
> kernel.
>
> Race occurs between the code arrowed above, as soon as kernel registers
> error log handler, it sees OPAL_EVENT_ERROR_LOG_AVAIL is set, so it
> schedule the handler. Which makes 'opal_get_elog_size'(kernel) call on
> the error log set the state from ELOG_STATE_FETCHED_DATA to
> ELOG_STATE_FETCHED_INFO and clears OPAL_EVENT_ERROR_LOG_AVAIL. During
> the same time 'opal_resend_pending_logs'(kernel) call which will set the
> state machine from ELOG_STATE_FETCHED_INFO to ELOG_STATE_NONE in OPAL.
> Because of that, read call from the kernel, which was to be made after
> the 'opal_get_elog_size' ends up failing. But, the elog kobject was
> created for the particular error log.
>
> Further in the resend routine in the OPAL, we make opal_commit_elog_in_host()
> call that sets OPAL_EVENT_ERROR_LOG_AVAIL. So, Kernel again makes
> 'opal_get_elog_size' which results in getting the error log info of the
> same error log which was fetched earlier. It also changes the state
> machine to ELOG_STATE_FETCHED_INFO and clears OPAL_EVENT_ERROR_LOG_AVAIL.
>
>
> Below is the snippet from the elog_event registered handler call
> {
> ...
> ...
>
> /* we may get notified twice, let's handle
> * that gracefully and not create two conflicting
> * entries.
> */
> if (kset_find_obj(elog_kset, name))
> return IRQ_HANDLED;
> ...
> ...
> }
>
> In the kernel, we search kobject for the error log whether it already
> exist. So kobject is found and it returns without reading error log
> data.
>
> So, this patch makes the flag which was true during initialisation
> to false. And that solves the race.
>
> Signed-off-by: Mukesh Ojha <mukesh02 at linux.vnet.ibm.com>
Patch looks good.
Reviewed-by: Vasant Hegde <hegdevasant at linux.vnet.ibm.com>
@Stewart,
With this patch we make sure OPAL doesn't enable notification until host is
ready to accept error logs.
-Vasant
> ---
> Changes in V2:
> - As per the Vasant's comment offline, it moves the 'elog_enabled = false' to
> while defining elog_enabled.
>
> hw/fsp/fsp-elog-read.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/hw/fsp/fsp-elog-read.c b/hw/fsp/fsp-elog-read.c
> index 4f0bf16..a980281 100644
> --- a/hw/fsp/fsp-elog-read.c
> +++ b/hw/fsp/fsp-elog-read.c
> @@ -88,7 +88,7 @@ static uint32_t elog_read_retries; /* bad response status count */
> /* Initialize the state of the log */
> static enum elog_head_state elog_read_from_fsp_head_state = ELOG_STATE_NONE;
>
> -static bool elog_enabled;
> +static bool elog_enabled = false;
>
> /* Need forward declaration because of Circular dependency */
> static void fsp_elog_queue_fetch(void);
> @@ -612,8 +612,6 @@ void fsp_elog_read_init(void)
> if (val != 0)
> return;
>
> - elog_enabled = true;
> -
> /* register Eror log Class D2 */
> fsp_register_client(&fsp_get_elog_notify, FSP_MCLASS_ERR_LOG);
>
More information about the Skiboot
mailing list