<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Hi Stewart,<br>
<br>
Sorry for the late response.<br>
<br>
<br>
<div class="moz-cite-prefix">On 02/23/2016 08:32 AM, Stewart Smith
wrote:<br>
</div>
<blockquote cite="mid:87wppwceon.fsf@linux.vnet.ibm.com" type="cite">
<pre wrap="">Mamatha Inamdar <a class="moz-txt-link-rfc2396E" href="mailto:mamatha4@linux.vnet.ibm.com">&lt;mamatha4@linux.vnet.ibm.com&gt;</a> writes:
</pre>
<blockquote type="cite">
<pre wrap="">Problem Description:
During FSP termination/reset, FSP received mbox command from OPAL for
"Fetching platform management function data". As FSP is in termination
state DMAE operation failed to write memory data to hypervisor,
so FSP sent mbox command with response status as 0x24 to OPAL and
OPAL committed a predictive log with SRC BB822411 and sent back
response status as 0xFE, which FSP IPMI will not understand the
failure at the Host and IPMI will log the error.
Fix:This patch is to fix when OPAL receives a bad response from FSP 0x24
due to DMAE error, commit informational log and return response status
as SUCCESS and for all other bad status response commit predictive
log.
</pre>
</blockquote>
<pre wrap="">
Hi!
So I was trying to reproduce this on a FW840 machine doing "smgr
resetReload" on the FSP side. While I get a bunch of hidden error logs
from the FSP (and, mysteriously, on reset/reload the FSP seems to
re-inform us of the error logs that have previously been acknowledged),
I don't seem to get that specific SRC... can you share how you managed
to reproduce/test this?</pre>
</blockquote>
<span dir="ltr" style="text-align: left;" id=":3z.co" class="tL8wMe
EMoHub"><br>
It is difficult to recreate the issue. Based on the traces, we
observed that during termination the FSP received an mbox command from OPAL.
</span>Because the FSP was in the termination state, the DMAE operation failed, so
the FSP sent an mbox command with response status 0x24 to OPAL, and OPAL
committed a predictive log with SRC BB822411.<br>
<br>
Mahesh and I discussed this issue with Gajendra (the FSP mbox
component owner) and came up with a fix: when OPAL receives the bad
response status 0x24 from the FSP due to a DMAE error, it commits an
informational log and returns response status SUCCESS; for all other
bad status responses, it commits a predictive log.<br>
<br>
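To make the intended behaviour concrete, here is a minimal standalone
sketch of the decision described above. It is not the skiboot code
itself: every name prefixed with ex_ or EX_ is hypothetical, 0x24 and
0xFE are the status values mentioned in this thread, and 0x00 for
SUCCESS is an assumption.<br>
<pre wrap="">
```c
/* Status bytes as discussed in this thread (names are hypothetical). */
enum {
	EX_STATUS_SUCCESS       = 0x00, /* assumed value for SUCCESS */
	EX_STATUS_DMA_ERROR     = 0x24, /* DMAE failure reported by the FSP */
	EX_STATUS_GENERIC_ERROR = 0xfe  /* reply OPAL should no longer send */
};

/* Severity of the log entry OPAL commits locally. */
enum ex_log_severity { EX_LOG_NONE, EX_LOG_INFO, EX_LOG_PREDICTIVE };

struct ex_action {
	enum ex_log_severity severity; /* log committed on the OPAL side */
	unsigned char reply_status;    /* status sent back to the FSP */
};

/*
 * Decide what to log and what to reply for a given FSP response status.
 * The reply is always SUCCESS; only the log severity differs.
 */
static struct ex_action ex_classify_response(unsigned char status)
{
	struct ex_action act = { EX_LOG_NONE, EX_STATUS_SUCCESS };

	if (status == EX_STATUS_SUCCESS)
		return act;

	if (status == EX_STATUS_DMA_ERROR)
		act.severity = EX_LOG_INFO;       /* FSP is likely terminating */
	else
		act.severity = EX_LOG_PREDICTIVE; /* any other bad status */

	return act;
}
```
</pre>
With this split, a 0x24 seen while the FSP is terminating produces only
an informational entry, while any other bad status still produces a
predictive log. In both cases the reply to the FSP is SUCCESS.<br>
<br>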
<blockquote cite="mid:87wppwceon.fsf@linux.vnet.ibm.com" type="cite">
<blockquote type="cite">
<pre wrap="">diff --git a/hw/fsp/fsp-ipmi.c b/hw/fsp/fsp-ipmi.c
index 750d144..f803f17 100644
--- a/hw/fsp/fsp-ipmi.c
+++ b/hw/fsp/fsp-ipmi.c
@@ -50,6 +50,10 @@ DEFINE_LOG_ENTRY(OPAL_RC_IPMI_RESP, OPAL_PLATFORM_ERR_EVT, OPAL_IPMI,
OPAL_PLATFORM_FIRMWARE, OPAL_PREDICTIVE_ERR_GENERAL,
OPAL_NA);
+DEFINE_LOG_ENTRY(OPAL_RC_IPMI_DMA_ERROR_RESP, OPAL_PLATFORM_ERR_EVT, OPAL_IPMI,
+ OPAL_PLATFORM_FIRMWARE, OPAL_INFO,
+ OPAL_NA);
+
struct fsp_ipmi_msg {
struct list_node link;
struct ipmi_msg ipmi_msg;
@@ -281,13 +285,19 @@ static bool fsp_ipmi_read_response(struct fsp_msg *msg)
assert(msg->data.words[1] == PSI_DMA_PLAT_RESP_BUF);
if (status != FSP_STATUS_SUCCESS) {
- log_simple_error(&e_info(OPAL_RC_IPMI_RESP), "IPMI: Response "
- "with bad status:0x%02x\n", status);
+ if(status == FSP_STATUS_DMA_ERROR)
+ log_simple_error(&e_info(OPAL_RC_IPMI_DMA_ERROR_RESP), "IPMI: Received "
+ "DMA ERROR response from FSP, this may be due to FSP "
+ "is in termination state:0x%02x\n", status);
+ else
+ log_simple_error(&e_info(OPAL_RC_IPMI_RESP), "IPMI: FSP response "
+ "received with bad status:0x%02x\n", status);
+
fsp_ipmi_cmd_done(ipmi_msg->cmd,
IPMI_NETFN_RETURN_CODE(ipmi_msg->netfn),
IPMI_ERR_UNSPECIFIED);
return fsp_ipmi_send_response(FSP_RSP_PLAT_DATA |
- FSP_STATUS_GENERIC_ERROR);
+ FSP_STATUS_SUCCESS);
</pre>
</blockquote>
<pre wrap="">
So... responding with success here seems really counter intuitive for
the error case.
</pre>
</blockquote>
<br>
<span dir="ltr" style="text-align: left;" id=":3l.co" class="tL8wMe
EMoHub">According to the FSP team, OPAL should not send back response
status 0xFE for any bad IPMI response. FSP IPMI then determines that
there was a failure at the host, and IPMI logs the error. Hence we are
sending SUCCESS as the response and committing the error log on the
OPAL side (informational for the 0x24 DMA error, predictive otherwise).<br>
</span><br>
<br>
<br>
</body>
</html>