[Skiboot] Request to OPAL Contributors - New OLOG Feature in the Firmware Test Suite

Deb McLemore debmc at linux.vnet.ibm.com
Fri Apr 1 09:01:50 AEDT 2016


To All OPAL participants,

We are introducing a new feature to the Firmware Test Suite (FWTS,
https://wiki.ubuntu.com/FirmwareTestSuite ) called "OLOG", short for
"Other log analysis" or "OPAL log analysis".


When run on an OPAL PPC system, FWTS dumps
/sys/firmware/opal/msglog and compares its contents against a
set of patterns and/or strings defined in a default
JSON file (see the olog.json sample below).

We are asking each of you to contribute, from your area of
expertise, any log entries that are suspicious or worth
highlighting when something goes wrong on a PPC OPAL server.

The most interesting entries flagged so far are included in the
sample JSON file provided to the FWTS project.  This sample is
not meant to be technically meaningful.  It only illustrates the
severity levels an event can be given and the kind of advice text
that tells the consumer of these tools how to correct the problem,
report a bug, or perform deeper analysis/debug.

Included below are some outputs from an "OLOG" scan based
on the FWTS OLOG JSON default sample to represent the type
of output that may be produced.

Please send technical contributions for the olog.json pattern matcher
to debmc at linux.vnet.ibm.com for review and inclusion.

1 - JSON entries can use either a string compare or a
regular expression.

2 - The advice should be specific and useful: it should help
the consumer understand the problem and summarize all the
needed steps and actions.

3 - The labels are custom and each component owner can use
them as they see fit.  A typical use is to group expected
sequences of patterns, either to help review what was or was
not logged when something goes wrong, or to make sure
begin/end sequences are properly matched.  If any label
submissions overlap, we will work with the submitter to
eliminate the duplication.
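
As a rough illustration of points 1 to 3, a contributor could sanity-check
entries before sending them in.  The sketch below is not part of FWTS.  The
function names are hypothetical, the required keys are simply the ones that
appear in the sample olog.json, and Python's "re" module may not accept
exactly the same regular expression syntax as the engine FWTS uses.

```python
import re

# Keys taken from the sample olog.json entries; assumed to all be required.
REQUIRED_KEYS = {"compare_mode", "log_level", "pattern", "advice", "label"}
# The two compare modes mentioned in point 1.
COMPARE_MODES = {"string", "regex"}


def check_entry(entry):
    """Return a list of problems found in one pattern entry (empty if none)."""
    problems = []
    missing = REQUIRED_KEYS - entry.keys()
    if missing:
        problems.append("missing keys: " + ", ".join(sorted(missing)))
    mode = entry.get("compare_mode")
    if mode is not None and mode not in COMPARE_MODES:
        problems.append("unknown compare_mode: %r" % mode)
    if mode == "regex" and "pattern" in entry:
        try:
            # Only checks that Python can compile it (an approximation).
            re.compile(entry["pattern"])
        except re.error as exc:
            problems.append("regex does not compile: %s" % exc)
    return problems


def duplicate_labels(entries):
    """Return the sorted labels that are used by more than one entry."""
    seen, dups = set(), set()
    for entry in entries:
        label = entry.get("label")
        if label is None:
            continue
        if label in seen:
            dups.add(label)
        seen.add(label)
    return sorted(dups)
```

A repeated label inside one submission can be intentional (grouping), so
duplicate_labels is only meant as a hint when comparing against labels
already used by another component.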


SAMPLE olog.json:

$ cat olog.json
{
  "olog_error_warning_patterns":
  [
   {
    "compare_mode": "string",
    "log_level": "LOG_LEVEL_CRITICAL",
    "pattern": "CRITICAL",
    "advice": "SAMPLE TEXT -> Check your log for this condition and give some specific investigative and action steps.",
    "label": "OLOG_Filter_GROUP1"
   },
   {
    "compare_mode": "string",
    "log_level": "LOG_LEVEL_HIGH",
    "pattern": "STOP",
    "advice": "SAMPLE TEXT -> Check your log for this condition and give some specific investigative and action steps.",
    "label": "OLOG_Filter_GROUPA"
   },
   {
    "compare_mode": "string",
    "log_level": "LOG_LEVEL_MEDIUM",
    "pattern": "STOP",
    "advice": "SAMPLE TEXT -> Check your log for this condition and give some specific investigative and action steps.",
    "label": "OLOG_Filter_GROUPB"
   },
   {
    "compare_mode": "regex",
    "log_level": "LOG_LEVEL_LOW",
    "pattern": "Trying.*",
    "advice": "SAMPLE TEXT -> This needs further investigation and review, please take xyz corrective action.",
    "label": "OLOG_Filter_GROUPC"
   }
  ]
}
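
To make the matching idea concrete, here is a minimal sketch of how entries
like the ones above could be applied to msglog lines.  It is not the FWTS
implementation.  It assumes that "string" means plain substring containment,
that "regex" may match anywhere in the line, and that the first matching
entry wins for a given line.  The names matches and scan are hypothetical.

```python
import re

# Two entries copied from the sample above (advice text omitted for brevity).
PATTERNS = [
    {"compare_mode": "string", "log_level": "LOG_LEVEL_CRITICAL",
     "pattern": "CRITICAL", "label": "OLOG_Filter_GROUP1"},
    {"compare_mode": "regex", "log_level": "LOG_LEVEL_LOW",
     "pattern": "Trying.*", "label": "OLOG_Filter_GROUPC"},
]


def matches(entry, line):
    """True if one pattern entry matches one log line."""
    if entry["compare_mode"] == "regex":
        # Assumption: the regex may match anywhere in the line.
        return re.search(entry["pattern"], line) is not None
    # Assumption: "string" is a plain substring test.
    return entry["pattern"] in line


def scan(lines, patterns):
    """Return (log_level, label, line) for each line that matches an entry."""
    hits = []
    for line in lines:
        for entry in patterns:
            if matches(entry, line):
                hits.append((entry["log_level"], entry["label"], line))
                break  # assumption: the first matching entry wins
    return hits
```

Under these assumptions, scanning a line such as
"[3001360713,5] Trying to load LID 81e08430 from FSP" yields one
LOG_LEVEL_LOW hit labelled OLOG_Filter_GROUPC.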

# cat results.log
Results generated by fwts: Version V16.03.00 (2016-03-14 09:10:20).

Some of this work - Copyright (c) 1999 - 2016, Intel Corp. All rights reserved.
Some of this work - Copyright (c) 2010 - 2016, Canonical.
Some of this work - Copyright (c) 2016 IBM.

This test run on 30/03/16 at 17:58:31 on host Linux mega-mon.austin.ibm.com
3.18.22-359.el7_1.pkvm3_1_0.4000.1.ppc64le #1
SMP Tue Nov 10 11:07:22 CST 2015 ppc64le.

Command: "fwts olog  --json-data-file=olog.json ".
Running tests: olog.

olog: Run OLOG scan and analysis checks.

Test Failure Summary
=====================================================
Critical failures: NONE

High failures: NONE

Medium failures: NONE

Low failures: 8
  olog: LOW Kernel message: [3001360713,5] Trying to load LID 81e08430 from FSP
  olog: LOW Kernel message: [3704532490,5] VPD: Trying to load VPD LID 0x80e08042...
  olog: LOW Kernel message: [3704535287,5] Trying to load LID 80e08042 from FSP
  olog: LOW Kernel message: [3704627893,5] Trying to load OPAL LID 80a02001...
  olog: LOW Kernel message: [3704640862,5] Trying to load OPAL LID 80f00101...
  olog: LOW Kernel message: [3704653746,5] Trying to load OPAL LID 80f00102...
  olog: LOW Kernel message: [10029546437,3] OPAL: Trying a CPU re-init with flags: 0x2
  olog: LOW Kernel message: [184758561647,3] OPAL: Trying a CPU re-init with flags: 0x1
--------------------------------------------------------------------------------
Test 1 of 1: OLOG scan and analysis checks results.

FAILED [LOW] OLOG_Filter_GROUPC: Test 1, LOW Kernel message:
[3001360713,5] Trying to load LID 81e08430 from FSP

ADVICE: SAMPLE TEXT -> This needs further investigation and review,
please take xyz corrective action.

FAILED [LOW] OLOG_Filter_GROUPC: Test 1, LOW Kernel message:
[3704532490,5] VPD: Trying to load VPD LID 0x80e08042...

ADVICE: SAMPLE TEXT -> This needs further investigation and review,
please take xyz corrective action.

FAILED [LOW] OLOG_Filter_GROUPC: Test 1, LOW Kernel message:
[3704535287,5] Trying to load LID 80e08042 from FSP

ADVICE: SAMPLE TEXT -> This needs further investigation and review,
please take xyz corrective action.

FAILED [LOW] OLOG_Filter_GROUPC: Test 1, LOW Kernel message:
[3704627893,5] Trying to load OPAL LID 80a02001...

ADVICE: SAMPLE TEXT -> This needs further investigation and review,
please take xyz corrective action.

FAILED [LOW] OLOG_Filter_GROUPC: Test 1, LOW Kernel message:
[3704640862,5] Trying to load OPAL LID 80f00101...

ADVICE: SAMPLE TEXT -> This needs further investigation and review,
please take xyz corrective action.

FAILED [LOW] OLOG_Filter_GROUPC: Test 1, LOW Kernel message:
[3704653746,5] Trying to load OPAL LID 80f00102...

ADVICE: SAMPLE TEXT -> This needs further investigation and review,
please take xyz corrective action.

FAILED [LOW] OLOG_Filter_GROUPC: Test 1, LOW Kernel message:
[10029546437,3] OPAL: Trying a CPU re-init with flags: 0x2
Message repeated 1 times.

ADVICE: SAMPLE TEXT -> This needs further investigation and review,
please take xyz corrective action.

FAILED [LOW] OLOG_Filter_GROUPC: Test 1, LOW Kernel message:
[184758561647,3] OPAL: Trying a CPU re-init with flags: 0x1

ADVICE: SAMPLE TEXT -> This needs further investigation and review,
please take xyz corrective action.

OLOG scan and analysis found 8 unique issue(s).

=====================================================
0 passed, 8 failed, 0 warning, 0 aborted, 0 skipped, 0 info only.
=====================================================

-- 
==========================================
Deb McLemore
IBM OpenPower - IBM Systems
(512) 286 9980

debmc at us.ibm.com
debmc at linux.vnet.ibm.com  - (plain text)
==========================================


