[Skiboot] [PATCH V3 3/3] doc/errorlogging : Updates detail about error log retrieval from FSP

Mon Jul 18 23:10:26 AEST 2016

Describe how FSP error logs are fetched by OPAL and then pulled by POWERNV,
and document the design constraints.

Signed-off-by: Mukesh Ojha <mukesh02 at linux.vnet.ibm.com>
---
Changes in V3:
 - Some corrections to letter case and sentence structure.

Changes in V2:
 - Adds separate section for error log retrieval.
 - Adds design constraints as per Vasant's comment.

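Illustration only, not part of the patch: the retrieval flow documented in the
diff below can be sketched in C roughly as follows. This is a minimal sketch,
not the skiboot implementation. The node layout, the elog_ctx structure and
every function name (elog_init, elog_notify, elog_fetch_done, elog_host_read,
elog_host_ack) are hypothetical; only the enum and ELOG_READ_MAX_RECORD come
from the documented text. Mailbox traffic is omitted and the lists are kept
as simple LIFO chains.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ELOG_READ_MAX_RECORD 128

enum elog_head_state {
	ELOG_STATE_FETCHING,
	ELOG_STATE_FETCHED_INFO,
	ELOG_STATE_FETCHED_DATA,
	ELOG_STATE_NONE,
	ELOG_STATE_REJECTED,
};

/* Hypothetical node: the fields named in the text plus the link. */
struct elog_node {
	uint32_t log_id;
	uint32_t log_size;
	struct elog_node *next;
};

/* Hypothetical container for the three lists and the state machine. */
struct elog_ctx {
	struct elog_node pool[ELOG_READ_MAX_RECORD];
	struct elog_node *free_list, *pending, *processed;
	enum elog_head_state state;
};

static void push(struct elog_node **head, struct elog_node *n)
{
	assert(n != NULL);
	n->next = *head;
	*head = n;
}

static struct elog_node *pop(struct elog_node **head)
{
	struct elog_node *n = *head;

	if (n)
		*head = n->next;
	return n;
}

/* Unlink and return the node with the given id, or NULL if absent. */
static struct elog_node *take(struct elog_node **head, uint32_t id)
{
	for (; *head; head = &(*head)->next) {
		if ((*head)->log_id == id) {
			struct elog_node *n = *head;

			*head = n->next;
			return n;
		}
	}
	return NULL;
}

void elog_init(struct elog_ctx *c)
{
	c->free_list = c->pending = c->processed = NULL;
	for (size_t i = 0; i < ELOG_READ_MAX_RECORD; i++)
		push(&c->free_list, &c->pool[i]);
	c->state = ELOG_STATE_NONE;
}

/* FSP notification: free -> pending. Returns -1 when no node is free,
 * which is where 'Discarded by OPAL' would be reported to the FSP. */
int elog_notify(struct elog_ctx *c, uint32_t id, uint32_t size)
{
	struct elog_node *n = pop(&c->free_list);

	if (!n)
		return -1;
	n->log_id = id;
	n->log_size = size;
	push(&c->pending, n);
	c->state = ELOG_STATE_FETCHING;
	return 0;
}

/* The log data has been read into the OPAL buffer. */
void elog_fetch_done(struct elog_ctx *c)
{
	c->state = ELOG_STATE_FETCHED_DATA;
}

/* Host read: pending -> processed, state machine back to NONE. */
int elog_host_read(struct elog_ctx *c, uint32_t id)
{
	struct elog_node *n = take(&c->pending, id);

	if (!n)
		return -1;
	push(&c->processed, n);
	c->state = ELOG_STATE_NONE;
	return 0;
}

/* Host acknowledgement: processed -> free. */
int elog_host_ack(struct elog_ctx *c, uint32_t id)
{
	struct elog_node *n = take(&c->processed, id);

	if (!n)
		return -1;
	push(&c->free_list, n);
	return 0;
}
```

A caller would run elog_init() once, then elog_notify(), elog_fetch_done(),
elog_host_read() and elog_host_ack() in that order for each log. With all 128
nodes in use, the next elog_notify() returns -1.
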
 doc/error-logging.txt | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/doc/error-logging.txt b/doc/error-logging.txt
index 8a3134e..57fc05c 100644
--- a/doc/error-logging.txt
+++ b/doc/error-logging.txt
@@ -252,6 +252,69 @@ Using the reason code, an error log is generated with the information derived
 from the look-up table, populated and committed to FSP. All of it
 is done with just one call.
 
+
+Error log retrieval from FSP:
+============================
+
+The FSP notifies OPAL of its error logs through the mailbox protocol.
+
+OPAL maintains three lists:
+
+Free list      : List of free nodes.
+Pending list   : List of nodes that are yet to be read by POWERNV.
+Processed list : List of nodes that have been read but are still waiting for
+                 acknowledgement.
+
+Each node has the fields log_id and log_size, plus a link that points to the
+next node.
+
+OPAL has a state machine with the following states:
+
+enum elog_head_state {
+        ELOG_STATE_FETCHING,    /* In the process of reading log from FSP */
+        ELOG_STATE_FETCHED_INFO,/* Indicates reading log info is completed */
+        ELOG_STATE_FETCHED_DATA,/* Indicates reading log is completed */
+        ELOG_STATE_NONE,        /* Indicates to fetch next log */
+        ELOG_STATE_REJECTED,    /* Resend all pending logs to Linux */
+};
+
+Initially, the state machine is in ELOG_STATE_NONE.
+
+When OPAL is notified about an error log, it takes a node from the free list,
+puts it on the pending list and moves the state machine to the fetching state
+(ELOG_STATE_FETCHING). It also sends a response back to the FSP for the
+received error log notification.
+
+OPAL also queues a mailbox message to read the error log data into the OPAL
+error log buffer. Once that completes, the state machine moves to the fetched
+state (ELOG_STATE_FETCHED_DATA). After that, OPAL notifies POWERNV to fetch the
+error log.
+
+POWERNV first uses an OPAL interface call to get the error log info (elogid,
+elog_size, elog_type), then reads the error log data into its buffer, which
+moves the pending error log to the processed list. After the read, the state
+machine returns to ELOG_STATE_NONE.
+
+After reading the error log data, POWERNV acknowledges the error log id with a
+call to OPAL, which in turn sends the ack mbox message to the FSP and moves the
+error log id from the processed list back to the free list. This process
+repeats for every FSP error log.
+
+Design constraints:
+==================
+* #define ELOG_READ_MAX_RECORD            128
+
+  Currently, the number of FSP error logs that OPAL can hold is limited to 128.
+  If OPAL runs out of free nodes for a new error log, it sends a 'Discarded by
+  OPAL' message to the FSP. It is then up to the FSP to decide when to notify
+  OPAL again about the discarded error log.
+
+* #define ELOG_WRITE_MAX_RECORD           64
+
+  The number of OPAL error logs that OPAL can hold is likewise limited to 64.
+  If OPAL runs out of buffers in the pool, it logs the message
+  'Failed to get the buffer'.
+
 Note:
 ====
 * For more information regarding error logging and PEL format
-- 
2.7.4