[Skiboot] [PATCH V2 2/3] doc/errorlogging : Update details about error logging on FSP and BMC

Vasant Hegde hegdevasant at linux.vnet.ibm.com
Fri Jul 15 18:01:50 AEST 2016


On 07/14/2016 12:12 PM, Mukesh Ojha wrote:
> This patch adds more description and an example of how error logging is
> independent of the platform, and also describes how error logs are committed on BMC systems.
>
> Signed-off-by: Mukesh Ojha <mukesh02 at linux.vnet.ibm.com>
> ---
> Changes in V2:
>   - Corrects typo mistake.
>   - Adds more detail about eSEL format.
>   - Changes talk about generic service processors.
>
>   doc/error-logging.txt | 68 +++++++++++++++++++++++++++++++++++++++++++++------
>   1 file changed, 60 insertions(+), 8 deletions(-)
>
> diff --git a/doc/error-logging.txt b/doc/error-logging.txt
> index 7c62520..a9e5993 100644
> --- a/doc/error-logging.txt
> +++ b/doc/error-logging.txt
> @@ -178,16 +178,68 @@ Step 2: Data can be appended to the user data section using the either of
>   	uint32_t tag: Unique value to identify the data.
>                          Ideal to have ASCII value for 4-byte string.
>
> -Step 3: Once all the data for an error is logged in, the error needs to be
> -	committed in FSP.
> +Step 3: There is a platform hook for committing the OPAL error log to any
> +	service processor (currently used for FSP- and BMC-based machines).
>
> -	rc = elog_fsp_commit(buf);
> +	FSP:
> +		.elog_commit            = elog_fsp_commit
> +
> +	Once all the data for an error is logged in, the error needs to
> +	be committed in FSP.
> +
> +	rc = platform.elog_commit(elog);
>   	Value of 0 is returned on success.
>
> -In the process of committing an error to FSP, log info is first internally
> -converted to PEL format and then pushed to the FSP. All the errors logged
> -in Sapphire are again pushed up to POWERNV platform by the FSP and all the errors
> -reported by Sapphire and POWERNV are logged in FSP.
> +	In the process of committing an error to FSP, log info is first
> +	internally converted to PEL format and then pushed to the FSP. All the
> +	errors logged in Sapphire are again pushed up to POWERNV platform by
> +	the FSP and all the errors reported by Sapphire and POWERNV are logged

Please use consistent terminology. Here you refer to PowerNV, and below to
the host kernel.

Also, the host kernel will not log any errors to the service processor.


-Vasant

> +	in FSP. Sapphire maintains a timeout field for every error log it
> +	sends to the FSP. If a log is not committed within the allotted time
> +	(e.g. if the FSP is down), OPAL sends that log to the host kernel.
> +
> +	BMC:
> +		.elog_commit            = ipmi_elog_commit
> +
> +	rc = platform.elog_commit(elog);
> +	Value of 0 is returned on success.
> +
> +	On BMC machines, error logs are first converted to eSEL format,
> +	i.e.:
> +		eSEL = SEL header + PEL data
> +
> +	The SEL header contains the fields below:
> +	struct sel_header {
> +		uint16_t id;
> +		uint8_t record_type;
> +		uint32_t timestamp;
> +		uint16_t genid;
> +		uint8_t evmrev;
> +		uint8_t sensor_type;
> +		uint8_t sensor_num;
> +		uint8_t dir_type;
> +		uint8_t signature;
> +		uint8_t reserved[2];
> +	};
> +
> +	After filling in the SEL header fields, Sapphire copies the error log's
> +	PEL data after the header section. The eSEL log is then logged in the
> +	BMC via the IPMI interface.
> +
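
For illustration, the "eSEL = SEL header + PEL data" layout described above could be sketched as follows. This is a minimal sketch under stated assumptions, not the skiboot implementation: esel_build() is a hypothetical helper, the packed attribute is a GCC/Clang extension assumed here so that the header comes out at exactly 16 bytes, and byte order on the wire is not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* SEL header fields as listed in the patch; packed (assumption) so
 * that no padding is inserted and the header is 16 bytes long. */
struct sel_header {
	uint16_t id;
	uint8_t record_type;
	uint32_t timestamp;
	uint16_t genid;
	uint8_t evmrev;
	uint8_t sensor_type;
	uint8_t sensor_num;
	uint8_t dir_type;
	uint8_t signature;
	uint8_t reserved[2];
} __attribute__((packed));

/*
 * Hypothetical helper: write "SEL header + PEL data" into out.
 * Returns the total eSEL length, or 0 if out is too small.
 */
static size_t esel_build(uint8_t *out, size_t out_len,
			 const struct sel_header *hdr,
			 const uint8_t *pel, size_t pel_len)
{
	assert(out && hdr && pel);

	/* Check the header fits first so the subtraction cannot wrap. */
	if (out_len < sizeof(*hdr) || pel_len > out_len - sizeof(*hdr))
		return 0;

	memcpy(out, hdr, sizeof(*hdr));			/* header first */
	memcpy(out + sizeof(*hdr), pel, pel_len);	/* PEL data follows */
	return sizeof(*hdr) + pel_len;
}
```

The caller would then hand the resulting buffer to whatever transport sends it to the BMC; that part is outside this sketch.
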
> +e.g:
> +	void log_commit(struct errorlog *elog)
> +	{
> +		....
> +		....
> +		if (platform.elog_commit) {
> +			rc = platform.elog_commit(elog);
> +			if (rc)
> +				prerror("ELOG: Platform commit error %d\n", rc);
> +			return;
> +		}
> +		....
> +		....
> +	}
>
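
The hook check in the example above could be exercised with a small stand-alone sketch like the one below. Everything other than the .elog_commit field name and the error message is an assumption made for illustration: the trimmed struct errorlog, the stub commit functions, commit_via_platform() and the NO_PLATFORM_HOOK value are hypothetical. What skiboot does when no hook is set is elided in the patch, so it is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Trimmed stand-in for skiboot's struct errorlog (assumption). */
struct errorlog {
	unsigned int reason_code;
};

/* Only the hook discussed in the patch is modelled. */
struct platform {
	const char *name;
	int (*elog_commit)(struct errorlog *elog);
};

/* Hypothetical return value meaning "no hook registered". */
#define NO_PLATFORM_HOOK 1

/* Stubs standing in for a service processor commit path. */
static int stub_commit_ok(struct errorlog *elog)
{
	(void)elog;
	return 0;
}

static int stub_commit_fail(struct errorlog *elog)
{
	(void)elog;
	return -1;
}

/* Call the platform hook if one is set and report a non-zero rc. */
static int commit_via_platform(const struct platform *plat,
			       struct errorlog *elog)
{
	int rc;

	assert(plat != NULL);

	if (!plat->elog_commit)
		return NO_PLATFORM_HOOK;

	rc = plat->elog_commit(elog);
	if (rc)
		fprintf(stderr, "ELOG: Platform commit error %d\n", rc);
	return rc;
}
```

The point of the design is that the caller never names FSP or BMC; only the function pointer stored in the platform structure differs between the two.
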
>   If the user does not intend to dump various user data sections, but just
>   log the error with some amount of description around that error, they can do
> @@ -196,7 +248,7 @@ so using just the simple error logging interface
>   log_simple_error(uint32_t reason_code, char *fmt, ...);
>
>   Eg: log_simple_error(OPAL_RC_SURVE_STATUS,
> -			"SURV: Error retreiving surveillance status: %d\n",
> +			"SURV: Error retrieving surveillance status: %d\n",
>                          						err_len);
>
>   Using the reason code, an error log is generated with the information derived
>


