[Skiboot] Request to OPAL Contributors - New OLOG Feature in the Firmware Test Suite
Patrick Williams
patrick at stwcx.xyz
Sat Apr 2 02:53:20 AEDT 2016
Adding a few Hostboot developers to the list. We should at least get
the PRD errors added to the "critical list".
On Thu, Mar 31, 2016 at 05:01:50PM -0500, Deb McLemore wrote:
> To All OPAL participants,
>
> We are introducing a new feature to the Firmware Test Suite (FWTS)
> called "OLOG" ("Other log analysis" or "OPAL log analysis");
> see https://wiki.ubuntu.com/FirmwareTestSuite .
>
>
> When run on an OPAL PPC system, FWTS dumps
> /sys/firmware/opal/msglog and compares it against a set of
> strings and/or regular-expression patterns defined in a default
> JSON file (see the olog.json sample below).
>
> We ask each of you to contribute, from your area of
> expertise, log entries that are suspicious or worth
> highlighting when something goes wrong on a PPC OPAL
> server.
>
> So far the most interesting entries to flag are those in the
> sample JSON file provided to the FWTS project. The sample is
> not meant to be technically meaningful; it illustrates the
> severity levels an event can have and the kind of advice text
> that tells users of the tool how to correct the problem,
> report a bug, or carry out deeper analysis and debug.
>
> Included below are some outputs from an "OLOG" scan based
> on the FWTS OLOG JSON default sample to represent the type
> of output that may be produced.
>
> We ask for technical contributions for the olog.json pattern matcher
> to be sent to debmc at linux.vnet.ibm.com for review and contribution.
>
> 1 - JSON entries can use either a string compare or a
> regular expression.
>
> 2 - The advice should be specific and easy for the consumer
> to understand, and should summarize all the needed steps
> and actions.
>
> 3 - The labels are custom and each component owner can use
> them as they see fit. The most likely use is to group
> expected sequences of patterns, either to review what was or
> was not logged when something goes wrong or to check that
> begin/end sequences are properly matched. If any label
> submissions overlap, we will work with the submitters to
> eliminate the duplication.
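
A contributor may want to sanity-check an entry against the guidelines above before sending it in. The following is only a rough sketch and is not part of FWTS: check_entry() is a hypothetical helper, the required keys and accepted values are taken solely from the sample olog.json in this mail (FWTS may accept others), and Python's regex dialect may differ from whatever FWTS uses internally.

```python
import re

# Keys and values as they appear in the sample olog.json only;
# this is an assumption, FWTS itself may accept more.
REQUIRED_KEYS = {"compare_mode", "log_level", "pattern", "advice", "label"}
KNOWN_MODES = {"string", "regex"}
KNOWN_LEVELS = {
    "LOG_LEVEL_CRITICAL",
    "LOG_LEVEL_HIGH",
    "LOG_LEVEL_MEDIUM",
    "LOG_LEVEL_LOW",
}


def check_entry(entry):
    """Hypothetical helper: return a list of problems found in one entry."""
    problems = []

    missing = REQUIRED_KEYS - entry.keys()
    if missing:
        problems.append("missing keys: " + ", ".join(sorted(missing)))

    if "compare_mode" in entry and entry["compare_mode"] not in KNOWN_MODES:
        problems.append("unknown compare_mode: %r" % entry["compare_mode"])

    if "log_level" in entry and entry["log_level"] not in KNOWN_LEVELS:
        problems.append("unknown log_level: %r" % entry["log_level"])

    # A regex entry should at least compile (in Python's dialect).
    if entry.get("compare_mode") == "regex" and "pattern" in entry:
        try:
            re.compile(entry["pattern"])
        except re.error as exc:
            problems.append("pattern does not compile: %s" % exc)

    return problems


good = {
    "compare_mode": "regex",
    "log_level": "LOG_LEVEL_LOW",
    "pattern": "Trying.*",
    "advice": "SAMPLE TEXT",
    "label": "OLOG_Filter_GROUPC",
}
# Deliberately broken: no label, and an unbalanced parenthesis.
bad = {
    "compare_mode": "regex",
    "log_level": "LOG_LEVEL_LOW",
    "pattern": "(",
    "advice": "SAMPLE TEXT",
}

good_problems = check_entry(good)
bad_problems = check_entry(bad)
print(good_problems)
print(bad_problems)
```

An empty list means the entry has the same shape as the sample; it says nothing about whether the pattern or the advice is actually useful.
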
>
>
> SAMPLE olog.json:
>
> $ cat olog.json
> {
> "olog_error_warning_patterns":
> [
> {
> "compare_mode": "string",
> "log_level": "LOG_LEVEL_CRITICAL",
> "pattern": "CRITICAL",
> "advice": "SAMPLE TEXT -> Check your log for this condition and give some specific investigative and action steps.",
> "label": "OLOG_Filter_GROUP1"
> },
> {
> "compare_mode": "string",
> "log_level": "LOG_LEVEL_HIGH",
> "pattern": "STOP",
> "advice": "SAMPLE TEXT -> Check your log for this condition and give some specific investigative and action steps.",
> "label": "OLOG_Filter_GROUPA"
> },
> {
> "compare_mode": "string",
> "log_level": "LOG_LEVEL_MEDIUM",
> "pattern": "STOP",
> "advice": "SAMPLE TEXT -> Check your log for this condition and give some specific investigative and action steps.",
> "label": "OLOG_Filter_GROUPB"
> },
> {
> "compare_mode": "regex",
> "log_level": "LOG_LEVEL_LOW",
> "pattern": "Trying.*",
> "advice": "SAMPLE TEXT -> This needs further investigation and review, please take xyz corrective action.",
> "label": "OLOG_Filter_GROUPC"
> }
> ]
> }
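
To make the matching behaviour concrete, here is a minimal sketch of how entries like these could be applied to msglog lines. It is not the FWTS implementation: scan_log() is a hypothetical helper, "string" is assumed to mean a plain substring match, and the first matching entry is assumed to win. The first two log lines are taken from the sample results further down; the third is made up to show a non-match.

```python
import json
import re


def scan_log(log_lines, patterns):
    """Hypothetical helper: return (log_level, label, line) per matching line."""
    findings = []
    for line in log_lines:
        for entry in patterns:
            if entry["compare_mode"] == "regex":
                hit = re.search(entry["pattern"], line) is not None
            else:
                # "string" mode: assumed to be a plain substring match.
                hit = entry["pattern"] in line
            if hit:
                findings.append((entry["log_level"], entry["label"], line))
                break  # assumption: the first matching entry wins
    return findings


# Two entries from the sample, with the advice text shortened.
SAMPLE = json.loads("""
{"olog_error_warning_patterns": [
  {"compare_mode": "string", "log_level": "LOG_LEVEL_HIGH",
   "pattern": "STOP", "advice": "SAMPLE TEXT",
   "label": "OLOG_Filter_GROUPA"},
  {"compare_mode": "regex", "log_level": "LOG_LEVEL_LOW",
   "pattern": "Trying.*", "advice": "SAMPLE TEXT",
   "label": "OLOG_Filter_GROUPC"}
]}
""")

LOG = [
    "[3001360713,5] Trying to load LID 81e08430 from FSP",
    "[10029546437,3] OPAL: Trying a CPU re-init with flags: 0x2",
    "[12345,5] some unrelated line",  # made-up line, should not match
]

findings = scan_log(LOG, SAMPLE["olog_error_warning_patterns"])
for level, label, line in findings:
    print(level, label, line)
```

With these inputs, only the two "Trying" lines are reported, both under OLOG_Filter_GROUPC at LOG_LEVEL_LOW. That mirrors the sample results: a broad pattern such as "Trying.*" flags routine LID loads as well as CPU re-inits, so contributed patterns should be as specific as possible.
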
>
> # cat results.log
> Results generated by fwts: Version V16.03.00 (2016-03-14 09:10:20).
>
> Some of this work - Copyright (c) 1999 - 2016, Intel Corp. All rights
> reserved.
> Some of this work - Copyright (c) 2010 - 2016, Canonical.
> Some of this work - Copyright (c) 2016 IBM.
>
> This test run on 30/03/16 at 17:58:31 on host Linux mega-mon.austin.ibm.com
> 3.18.22-359.el7_1.pkvm3_1_0.4000.1.ppc64le #1
> SMP Tue Nov 10 11:07:22 CST 2015 ppc64le.
>
> Command: "fwts olog --json-data-file=olog.json ".
> Running tests: olog.
>
> olog: Run OLOG scan and analysis checks.
>
> Test Failure Summary
> =====================================================
> Critical failures: NONE
>
> High failures: NONE
>
> Medium failures: NONE
>
> Low failures: 8
> olog: LOW Kernel message: [3001360713,5] Trying to load LID 81e08430
> from FSP
> olog: LOW Kernel message: [3704532490,5] VPD: Trying to load VPD LID
> 0x80e08042...
> olog: LOW Kernel message: [3704535287,5] Trying to load LID 80e08042
> from FSP
> olog: LOW Kernel message: [3704627893,5] Trying to load OPAL LID
> 80a02001...
> olog: LOW Kernel message: [3704640862,5] Trying to load OPAL LID
> 80f00101...
> olog: LOW Kernel message: [3704653746,5] Trying to load OPAL LID
> 80f00102...
> olog: LOW Kernel message: [10029546437,3] OPAL: Trying a CPU re-init
> with flags: 0x2
> olog: LOW Kernel message: [184758561647,3] OPAL: Trying a CPU re-init
> with flags: 0x1
> --------------------------------------------------------------------------------
> Test 1 of 1: OLOG scan and analysis checks results.
>
> FAILED [LOW] OLOG_Filter_GROUPC: Test 1, LOW Kernel message:
> [3001360713,5] Trying to load LID 81e08430 from FSP
>
> ADVICE: SAMPLE TEXT -> This needs further investigation and review,
> please take xyz corrective action.
>
> FAILED [LOW] OLOG_Filter_GROUPC: Test 1, LOW Kernel message:
> [3704532490,5] VPD: Trying to load VPD LID 0x80e08042...
>
> ADVICE: SAMPLE TEXT -> This needs further investigation and review,
> please take xyz corrective action.
>
> FAILED [LOW] OLOG_Filter_GROUPC: Test 1, LOW Kernel message:
> [3704535287,5] Trying to load LID 80e08042 from FSP
>
> ADVICE: SAMPLE TEXT -> This needs further investigation and review,
> please take xyz corrective action.
>
> FAILED [LOW] OLOG_Filter_GROUPC: Test 1, LOW Kernel message:
> [3704627893,5] Trying to load OPAL LID 80a02001...
>
> ADVICE: SAMPLE TEXT -> This needs further investigation and review,
> please take xyz corrective action.
>
> FAILED [LOW] OLOG_Filter_GROUPC: Test 1, LOW Kernel message:
> [3704640862,5] Trying to load OPAL LID 80f00101...
>
> ADVICE: SAMPLE TEXT -> This needs further investigation and review,
> please take xyz corrective action.
>
> FAILED [LOW] OLOG_Filter_GROUPC: Test 1, LOW Kernel message:
> [3704653746,5] Trying to load OPAL LID 80f00102...
>
> ADVICE: SAMPLE TEXT -> This needs further investigation and review,
> please take xyz corrective action.
>
> FAILED [LOW] OLOG_Filter_GROUPC: Test 1, LOW Kernel message:
> [10029546437,3] OPAL: Trying a CPU re-init with flags: 0x2
> Message repeated 1 times.
>
> ADVICE: SAMPLE TEXT -> This needs further investigation and review,
> please take xyz corrective action.
>
> FAILED [LOW] OLOG_Filter_GROUPC: Test 1, LOW Kernel message:
> [184758561647,3] OPAL: Trying a CPU re-init with flags: 0x1
>
> ADVICE: SAMPLE TEXT -> This needs further investigation and review,
> please take xyz corrective action.
>
> OLOG scan and analysis found 8 unique issue(s).
>
> =====================================================
> 0 passed, 8 failed, 0 warning, 0 aborted, 0 skipped, 0 info only.
> =====================================================
>
> --
> ==========================================
> Deb McLemore
> IBM OpenPower - IBM Systems
> (512) 286 9980
>
> debmc at us.ibm.com
> debmc at linux.vnet.ibm.com - (plain text)
> ==========================================
>
> _______________________________________________
> Skiboot mailing list
> Skiboot at lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/skiboot
--
Patrick Williams