[Skiboot] [PATCH] OPAL:Handle mbox response with bad status:0x24 during FSP termination
Stewart Smith
stewart at linux.vnet.ibm.com
Tue Feb 23 14:02:00 AEDT 2016
Mamatha Inamdar <mamatha4 at linux.vnet.ibm.com> writes:
> Problem Description:
> During FSP termination/reset, FSP received mbox command from OPAL for
> "Fetching platform management function data". As FSP is in termination
> state DMAE operation failed to write memory data to hypervisor,
> so FSP sent mbox command with response status as 0x24 to OPAL and
> OPAL committed a predictive log with SRC BB822411 and sent back
> response status as 0xFE, which FSP IPMI will not understand the
> failure at the Host and IPMI will log the error.
>
> Fix:This patch is to fix when OPAL receives a bad response from FSP 0x24
> due to DMAE error, commit informational log and return response status
> as SUCCESS and for all other bad status response commit predictive
> log.
Hi!
So I was trying to reproduce this on an FW840 machine by running "smgr
resetReload" on the FSP side. While I get a bunch of hidden error logs
from the FSP (and, mysteriously, on reset/reload the FSP seems to
re-inform us of the error logs that have previously been acknowledged),
I don't seem to get that specific SRC... can you share how you managed
to reproduce/test this?
> diff --git a/hw/fsp/fsp-ipmi.c b/hw/fsp/fsp-ipmi.c
> index 750d144..f803f17 100644
> --- a/hw/fsp/fsp-ipmi.c
> +++ b/hw/fsp/fsp-ipmi.c
> @@ -50,6 +50,10 @@ DEFINE_LOG_ENTRY(OPAL_RC_IPMI_RESP, OPAL_PLATFORM_ERR_EVT, OPAL_IPMI,
> OPAL_PLATFORM_FIRMWARE, OPAL_PREDICTIVE_ERR_GENERAL,
> OPAL_NA);
>
> +DEFINE_LOG_ENTRY(OPAL_RC_IPMI_DMA_ERROR_RESP, OPAL_PLATFORM_ERR_EVT, OPAL_IPMI,
> + OPAL_PLATFORM_FIRMWARE, OPAL_INFO,
> + OPAL_NA);
> +
> struct fsp_ipmi_msg {
> struct list_node link;
> struct ipmi_msg ipmi_msg;
> @@ -281,13 +285,19 @@ static bool fsp_ipmi_read_response(struct fsp_msg *msg)
> assert(msg->data.words[1] == PSI_DMA_PLAT_RESP_BUF);
>
> if (status != FSP_STATUS_SUCCESS) {
> - log_simple_error(&e_info(OPAL_RC_IPMI_RESP), "IPMI: Response "
> - "with bad status:0x%02x\n", status);
> + if(status == FSP_STATUS_DMA_ERROR)
> + log_simple_error(&e_info(OPAL_RC_IPMI_DMA_ERROR_RESP), "IPMI: Received "
> + "DMA ERROR response from FSP, this may be due to FSP "
> + "is in termination state:0x%02x\n", status);
> + else
> + log_simple_error(&e_info(OPAL_RC_IPMI_RESP), "IPMI: FSP response "
> + "received with bad status:0x%02x\n", status);
> +
> fsp_ipmi_cmd_done(ipmi_msg->cmd,
> IPMI_NETFN_RETURN_CODE(ipmi_msg->netfn),
> IPMI_ERR_UNSPECIFIED);
> return fsp_ipmi_send_response(FSP_RSP_PLAT_DATA |
> - FSP_STATUS_GENERIC_ERROR);
> + FSP_STATUS_SUCCESS);
So... responding with success here seems really counterintuitive for
the error case.
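To make the concern concrete: in the hunk above, the reply status is
changed for every non-success status, not only for 0x24. One possible
alternative, sketched below purely as an illustration, is to downgrade
only the log severity for the DMA error case and keep replying with the
generic error. This is standalone C, not skiboot code. The 0x24 and 0xFE
values come from this thread; the 0x00 value for success and every
sketch_* name are assumptions made up for the example.

```c
#include <stdint.h>

/* 0x24 and 0xfe are quoted in this thread; 0x00 for success is an assumption. */
#define FSP_STATUS_SUCCESS        0x00
#define FSP_STATUS_DMA_ERROR      0x24
#define FSP_STATUS_GENERIC_ERROR  0xfe

/* Hypothetical stand-ins for the OPAL_INFO / OPAL_PREDICTIVE_ERR_GENERAL severities. */
enum sketch_severity {
	SKETCH_SEV_NONE,
	SKETCH_SEV_INFO,
	SKETCH_SEV_PREDICTIVE,
};

/* Hypothetical result type: which log to commit and what to send back. */
struct sketch_decision {
	enum sketch_severity severity;
	uint8_t reply_status;
};

/*
 * Pick the log severity per status, but keep telling the FSP that the
 * request failed whenever the status is not success.
 */
static struct sketch_decision sketch_classify(uint8_t status)
{
	struct sketch_decision d = { SKETCH_SEV_NONE, FSP_STATUS_SUCCESS };

	if (status == FSP_STATUS_SUCCESS)
		return d;

	/* A DMA error during FSP termination is expected: informational only. */
	if (status == FSP_STATUS_DMA_ERROR)
		d.severity = SKETCH_SEV_INFO;
	else
		d.severity = SKETCH_SEV_PREDICTIVE;

	/* The reply still reports the failure in both error cases. */
	d.reply_status = FSP_STATUS_GENERIC_ERROR;
	return d;
}
```

Whether the FSP side copes with the generic error in the termination
case is exactly the open question here, so treat this as a discussion
aid rather than a proposed fix.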
--
Stewart Smith
OPAL Architect, IBM.