[Skiboot] [PATCH V2] FSP/ELOG: elog_enable flag should be false by default
Mukesh Ojha
mukesh02 at linux.vnet.ibm.com
Fri Aug 5 20:57:02 AEST 2016
This issue is one of the corner case, which is related to recent change
went upstream and only observed in the petitboot prompt, where we see
only one error log instead of getting all error log in
/sys/firmware/opal/elog.
Below is snippet of the code, where elog module in the kernel
initialised.
{
..
...
rc = request_threaded_irq(irq, NULL, elog_event, =<=======
IRQF_TRIGGER_HIGH | IRQF_ONESHOT, "opal-elog", NULL); |
if (rc) { |
pr_err("%s: Can't request OPAL event irq (%d)\n", |
__func__, rc); |
return rc; |
} |
/* We are now ready to pull error logs from opal. */ |
if (opal_check_token(OPAL_ELOG_RESEND)) |
opal_resend_pending_logs(); =<=======
}
Scenario:
While elog_enabled is true, OPAL_EVENT_ERROR_LOG_AVAIL will be set from
OPAL, whenever it has error logs that are waiting to be fetched from the
kernel.
Race occurs between the code arrowed above, as soon as kernel registers
error log handler, it sees OPAL_EVENT_ERROR_LOG_AVAIL is set, so it
schedule the handler. Which makes 'opal_get_elog_size'(kernel) call on
the error log set the state from ELOG_STATE_FETCHED_DATA to
ELOG_STATE_FETCHED_INFO and clears OPAL_EVENT_ERROR_LOG_AVAIL. During
the same time 'opal_resend_pending_logs'(kernel) call which will set the
state machine from ELOG_STATE_FETCHED_INFO to ELOG_STATE_NONE in OPAL.
Because of that, read call from the kernel, which was to be made after
the 'opal_get_elog_size' ends up failing. But, the elog kobject was
created for the particular error log.
Further in the resend routine in the OPAL, we make opal_commit_elog_in_host()
call that sets OPAL_EVENT_ERROR_LOG_AVAIL. So, Kernel again makes
'opal_get_elog_size' which results in getting the error log info of the
same error log which was fetched earlier. It also changes the state
machine to ELOG_STATE_FETCHED_INFO and clears OPAL_EVENT_ERROR_LOG_AVAIL.
Below is the snippet from the elog_event registered handler call
{
...
...
/* we may get notified twice, let's handle
* that gracefully and not create two conflicting
* entries.
*/
if (kset_find_obj(elog_kset, name))
return IRQ_HANDLED;
...
...
}
In the kernel, we search kobject for the error log whether it already
exist. So kobject is found and it returns without reading error log
data.
So, this patch makes the flag which was true during initialisation
to false. And that solves the race.
Signed-off-by: Mukesh Ojha <mukesh02 at linux.vnet.ibm.com>
---
Changes in V2:
- As per the Vasant's comment offline, it moves the 'elog_enabled = false' to
while defining elog_enabled.
hw/fsp/fsp-elog-read.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/hw/fsp/fsp-elog-read.c b/hw/fsp/fsp-elog-read.c
index 4f0bf16..a980281 100644
--- a/hw/fsp/fsp-elog-read.c
+++ b/hw/fsp/fsp-elog-read.c
@@ -88,7 +88,7 @@ static uint32_t elog_read_retries; /* bad response status count */
/* Initialize the state of the log */
static enum elog_head_state elog_read_from_fsp_head_state = ELOG_STATE_NONE;
-static bool elog_enabled;
+static bool elog_enabled = false;
/* Need forward declaration because of Circular dependency */
static void fsp_elog_queue_fetch(void);
@@ -612,8 +612,6 @@ void fsp_elog_read_init(void)
if (val != 0)
return;
- elog_enabled = true;
-
/* register Eror log Class D2 */
fsp_register_client(&fsp_get_elog_notify, FSP_MCLASS_ERR_LOG);
--
2.7.4
More information about the Skiboot
mailing list