[Skiboot] [PATCH] FSP/ELOG: elog_enable flag should be false by default
Mukesh Ojha
mukesh02 at linux.vnet.ibm.com
Fri Aug 5 20:39:44 AEST 2016
Lets move this
elog_enabled = false;
while defining elog_enabled variable. (As per Vasant's comment)
This changes also need to go to all stable branches.
Changes will be in V2.
Regards,
Mukesh
On Tuesday 02 August 2016 02:27 PM, Mukesh Ojha wrote:
> This issue is one of the corner case, which is related to recent change
> went upstream and only observed in the petitboot prompt, where we see
> only one error log instead of getting all error log in
> /sys/firmware/opal/elog.
>
> Below is snippet of the code, where elog module in the kernel
> initialised.
>
> {
> ..
> ...
> rc = request_threaded_irq(irq, NULL, elog_event, =<=======
> IRQF_TRIGGER_HIGH | IRQF_ONESHOT, "opal-elog", NULL); |
> if (rc) { |
> pr_err("%s: Can't request OPAL event irq (%d)\n", |
> __func__, rc); |
> return rc; |
> } |
> /* We are now ready to pull error logs from opal. */ |
> if (opal_check_token(OPAL_ELOG_RESEND)) |
> opal_resend_pending_logs(); =<=======
> }
>
> Scenario:
> While elog_enabled is true, OPAL_EVENT_ERROR_LOG_AVAIL will be set from
> OPAL, whenever it has error logs that are waiting to be fetched from the
> kernel.
>
> Race occurs between the code arrowed above, as soon as kernel registers
> error log handler, it sees OPAL_EVENT_ERROR_LOG_AVAIL is set, so it
> schedule the handler. Which makes 'opal_get_elog_size'(kernel) call on
> the error log set the state from ELOG_STATE_FETCHED_DATA to
> ELOG_STATE_FETCHED_INFO and clears OPAL_EVENT_ERROR_LOG_AVAIL. During
> the same time 'opal_resend_pending_logs'(kernel) call which will set the
> state machine from ELOG_STATE_FETCHED_INFO to ELOG_STATE_NONE in OPAL.
> Because of that, read call from the kernel, which was to be made after
> the 'opal_get_elog_size' ends up failing. But, the elog kobject was
> created for the particular error log.
>
> Further in the resend routine in the OPAL, we make opal_commit_elog_in_host()
> call that sets OPAL_EVENT_ERROR_LOG_AVAIL. So, Kernel again makes
> 'opal_get_elog_size' which results in getting the error log info of the
> same error log which was fetched earlier. It also changes the state
> machine to ELOG_STATE_FETCHED_INFO and clears OPAL_EVENT_ERROR_LOG_AVAIL.
>
>
> Below is the snippet from the elog_event registered handler call
> {
> ...
> ...
>
> /* we may get notified twice, let's handle
> * that gracefully and not create two conflicting
> * entries.
> */
> if (kset_find_obj(elog_kset, name))
> return IRQ_HANDLED;
> ...
> ...
> }
>
> In the kernel, we search kobject for the error log whether it already
> exist. So kobject is found and it returns without reading error log
> data.
>
> So, this patch makes the flag which was true during initialisation
> to false. And that solves the race.
>
> Signed-off-by: Mukesh Ojha <mukesh02 at linux.vnet.ibm.com>
> ---
> hw/fsp/fsp-elog-read.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/fsp/fsp-elog-read.c b/hw/fsp/fsp-elog-read.c
> index 4f0bf16..c133438 100644
> --- a/hw/fsp/fsp-elog-read.c
> +++ b/hw/fsp/fsp-elog-read.c
> @@ -612,7 +612,7 @@ void fsp_elog_read_init(void)
> if (val != 0)
> return;
>
> - elog_enabled = true;
> + elog_enabled = false;
>
> /* register Eror log Class D2 */
> fsp_register_client(&fsp_get_elog_notify, FSP_MCLASS_ERR_LOG);
More information about the Skiboot
mailing list