[Skiboot] [PATCH v3 0/6] IMC Instrumentation Support
Madhavan Srinivasan
maddy at linux.vnet.ibm.com
Tue Jan 24 12:15:52 AEDT 2017
Hi Ben/Stewart,
Any comments on this patchset?
Maddy
On Monday 05 December 2016 11:40 PM, Hemant Kumar wrote:
> This patchset adds support for In-Memory Collection (IMC) instrumentation
> services in OPAL for Power9. The IMC infrastructure consists of
> two kinds of Performance Monitoring Units (PMUs): nest IMC PMUs (chip
> level) and core IMC PMUs (core level).
>
> Nest IMC PMUs are off-core but on-chip, and can be accessed via
> in-band SCOMs. Programming these counters and accumulating the counter
> data to memory is done by microcode running in one of the OCC engines.
>
> This patchset adds nest IMC instrumentation support on the OPAL
> side.
>
> The "IMA_CATALOG" partition in PNOR contains multiple device tree binaries
> (DTBs) in compressed form, each with a PVR tag. So, when loading the
> IMA_CATALOG partition, OPAL passes the system PVR as a "subid" to the
> load_resource API. If a catalog DTB is found for the given PVR, it is
> decompressed and linked to the main device tree.
>
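To illustrate the PVR-based selection described above, here is a minimal sketch in C. It is not the skiboot code: struct catalog_entry, pvr_version() and catalog_find() are hypothetical names, and comparing only the upper 16 bits of the PVR (the processor version field) is an assumption about how the tag might be matched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical catalog entry: one compressed DTB tagged with a PVR. */
struct catalog_entry {
	uint32_t pvr_tag;	/* PVR this blob was built for */
	const void *blob;	/* compressed DTB, not decoded in this sketch */
	size_t len;		/* length of the blob in bytes */
};

/* The upper 16 bits of the PVR identify the processor version. */
static uint32_t pvr_version(uint32_t pvr)
{
	return pvr >> 16;
}

/*
 * Return the first entry whose version field matches that of `pvr`,
 * or NULL if the catalog has no blob for this processor.
 * Matching on the version field only is an assumption.
 */
static const struct catalog_entry *
catalog_find(const struct catalog_entry *tab, size_t n, uint32_t pvr)
{
	assert(tab != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (pvr_version(tab[i].pvr_tag) == pvr_version(pvr))
			return &tab[i];
	return NULL;
}
```

A caller would treat a NULL result as "no catalog for this processor" and skip the decompress-and-link step.
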
> The commit which adds the "IMA_CATALOG" partition to PNOR is:
> https://github.com/open-power/pnor/commit/c940142c6dc64dd176096dc648f433c889919e84
>
> Each event node in the device tree contains "event-name" and "offset".
> Some of the PMUs may also contain properties such as "scale" and "unit",
> which means that all the events inside that PMU share the
> same "scale" and "unit" values.
>
> https://github.com/open-power/ima-catalog/commit/99b73ee691fb5273f502e879e07603c510db5f7a
> describes the DTS file for Power8. For Power9, it has a similar
> format and adds more units for nest, core and thread level PMUs.
>
> An excerpt from the dtb showing "mcs" pmu node and two of its event nodes:
>
> /dts-v1/;
>
> / {
> name = "";
> compatible = "ibm,opal-in-memory-counters";
> #address-cells = <0x1>;
> #size-cells = <0x1>;
> ima-nest-offset = <0x320000>;
> ima-nest-size = <0x30000>;
> version-id = "";
>
> mcs0 {
> compatible = "ibm,ima-counters-nest";
> ranges;
> #address-cells = <0x1>;
> #size-cells = <0x1>;
> unit = "MiB" ;
> scale = "1.2207e-4" ;
>
> event@118 {
> event-name = "PM_MCS_UP_128B_DATA_XFER_MC0" ;
> reg = <0x118 0x8>;
> desc = "Total Read Bandwidth seen on both MCS of MC0" ;
> };
> [SNIP]
>
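As a rough illustration of how a consumer might use the "reg", "scale" and "unit" properties from the excerpt above, here is a minimal sketch. It assumes that the first cell of "reg" is a byte offset into the counter region and the second is the counter size, and that counters are stored big-endian; neither is stated in the cover letter. read_be64() and scaled_value() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read an 8-byte counter stored at byte offset `off` in `buf`.
 * Big-endian storage is an assumption of this sketch.
 */
static uint64_t read_be64(const uint8_t *buf, size_t len, size_t off)
{
	uint64_t v = 0;

	assert(len >= 8 && off <= len - 8);
	for (size_t i = 0; i < 8; i++)
		v = (v << 8) | buf[off + i];
	return v;
}

/* Apply the PMU-wide "scale" property to a raw counter value. */
static double scaled_value(uint64_t raw, double scale)
{
	return (double)raw * scale;
}
```

With scale = 1.2207e-4 (roughly 128 / 2^20, which would be consistent with counting 128-byte transfers and reporting them in MiB), a raw count of 8192 scales to about 1 MiB.
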
> Why this design for the IMC DTS files?
> For now, these DTS files contain PMUs only for nest (i.e., chip). But,
> going forward, the DTS files for Power9 will contain the IMC PMUs for
> core and thread as well. One could argue for designing the device tree
> so that of_translate_address() can be used directly on
> the event nodes to get the CPU address for each event. However,
> there are some issues with that:
> - For nest IMC, we would need to attach the device tree to each per-chip
> HOMER region node. With multiple chips, this increases replication.
> - For core IMC, we allocate the memory in the kernel for each core and the
> base location for core IMC is not fixed. Hence, we can't use
> of_translate_address() on the core events.
> - For thread IMC, we allocate memory for each Linux process which needs to
> be monitored. This will be particularly difficult to handle in
> the device tree, since the allocation is dynamic.
>
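One way to read the points above: an event's location is only known once a base address has been chosen at run time, so the catalog can carry only offsets. A minimal sketch, assuming (the cover letter does not confirm this) that a nest event's address is the per-chip HOMER base plus "ima-nest-offset" plus the event's "reg" offset. event_addr() and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute the address of an event counter from a base chosen at run time.
 * `region_off` and `region_size` stand for "ima-nest-offset" and
 * "ima-nest-size"; `ev_off` and `ev_size` stand for the two "reg" cells.
 * Returns 0 if the event would fall outside the counter region.
 */
static uint64_t event_addr(uint64_t base, uint32_t region_off,
			   uint32_t region_size, uint32_t ev_off,
			   uint32_t ev_size)
{
	assert(ev_size > 0);

	if (ev_size > region_size || ev_off > region_size - ev_size)
		return 0;
	return base + region_off + ev_off;
}
```

The same helper would work for a core or thread base that the kernel allocates dynamically, which is the case a static address translation in the device tree cannot express.
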
> So, on the OPAL side, we need to:
> - Find out the current processor's PVR.
> - Fetch the "IMA_CATALOG" partition.
> - Fetch the correct subpartition based on the current processor's PVR.
> - Decompress the blob taken from the subpartition.
> - Expand the (now uncompressed) device tree binary and attach it to the
> system's device tree, so that it can be discovered by the
> kernel.
> - Look at the IMC availability vector which denotes which of the nest
> PMUs are available and remove the unavailable PMU nodes from the
> device tree.
>
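The last step in the list above (dropping PMU nodes that the availability vector marks as absent) could be sketched as follows. The struct, the function name and the bit numbering (bit 0 = least significant) are all assumptions made for illustration; the real vector layout is not described in this cover letter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PMU descriptor tied to one bit of the availability vector. */
struct nest_pmu {
	const char *name;	/* node name, e.g. "mcs0" */
	unsigned int avl_bit;	/* assumed bit index, 0 = least significant */
	int present;		/* set by prune_pmus() */
};

/*
 * Mark each PMU as present or absent according to `avl_vec` and
 * return how many PMUs were found to be unavailable. A real
 * implementation would remove the matching device tree nodes instead
 * of setting a flag.
 */
static size_t prune_pmus(struct nest_pmu *pmus, size_t n, uint64_t avl_vec)
{
	size_t removed = 0;

	for (size_t i = 0; i < n; i++) {
		assert(pmus[i].avl_bit < 64);
		pmus[i].present = (int)((avl_vec >> pmus[i].avl_bit) & 1);
		if (!pmus[i].present)
			removed++;
	}
	return removed;
}
```
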
> Note that :
> - The Catalog team is working on upstreaming the DTS files.
> - The commit which adds the IMA_CATALOG partition to PNOR is mentioned
> above.
> - Since OPAL lacks an xz decompression library, one has been
> reused from the hostboot repo (the link is
> mentioned in patch 2/6).
> - This patchset is the base enablement for IMC and hence contains only
> the nest IMC support.
> - The last patch in the series adds "chip-id" to the reserved HOMER
> region node in the device tree. This gives the kernel the HOMER region's
> associated chip (which is needed to fetch the
> counter values from the required chip).
>
> This patchset does two things:
>
> 1) At the time of boot, it detects the IMA_CATALOG resource. Based on
> the current processor's PVR value, it fetches the appropriate
> subpartition. The blob in this subpartition is then uncompressed and the
> flattened device tree is obtained. This dtb is then expanded and then
> linked to the system's device tree under
> "/proc/device-tree/ima-counters". The node "ima-counters" is a new node
> created in this patchset. The kernel can then discover this node based
> on its compatibility field.
>
> 2) It implements an OPAL call that lets the kernel control the microcode
> running in one of the OCC engines (responsible for nest IMC data
> collection), to start/stop nest PMU counter data collection.
>
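The shape of such a start/stop call can be sketched in a few lines. Everything here is hypothetical: the operation codes, return codes, struct imc_engine and imc_control() are stand-ins for illustration and are not the interface or values added by this series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation and return codes; not the values used by OPAL. */
enum imc_op { IMC_OP_START = 1, IMC_OP_STOP = 2 };
enum imc_rc { IMC_RC_OK = 0, IMC_RC_BAD_PARAM = -1 };

/* Stand-in for the run state of the collection microcode. */
struct imc_engine {
	int running;
};

/*
 * Dispatch a start/stop request. Unknown operations are rejected
 * without touching the engine state.
 */
static int imc_control(struct imc_engine *eng, int op)
{
	assert(eng != NULL);

	switch (op) {
	case IMC_OP_START:
		eng->running = 1;
		return IMC_RC_OK;
	case IMC_OP_STOP:
		eng->running = 0;
		return IMC_RC_OK;
	default:
		return IMC_RC_BAD_PARAM;
	}
}
```
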
> This patchset is based on the initial work for Nest Instrumentation done
> by Madhavan Srinivasan, which can be found here :
> (https://lists.ozlabs.org/pipermail/skiboot/2016-March/002999.html).
>
> TODOs:
> - Add support for Core IMC.
>
> Changelog :
> v2 -> v3 :
> Major changes include
> - Addressed review comments from Oliver O'Halloran.
> - Renamed this infrastructure from IMA (In-Memory Accumulation) to IMC
> (In-Memory Collection), since the name IMA conflicts with the existing
> IMA (Integrity Measurement Architecture) in the Linux kernel.
> - Patches 2 and 4 have been merged together (3/6).
> - Patch 3 (xz library) has been moved to Patch 2/6.
>
> Changes since v1 have been mentioned in the individual patches.
>
> Hemant Kumar (5):
> skiboot: Nest IMC macro definitions
> skiboot: Add a library for xz
> skiboot: Find the IMC DTB
> skiboot: Add opal call to enable/disable Nest IMC microcode
> skiboot: Add documentation for nest IMC opal call
>
> Vasant Hegde (1):
> skiboot: Add chip id to HOMER reserved region
>
> Makefile.main | 5 +-
> core/flash.c | 1 +
> core/init.c | 7 +
> doc/opal-api/opal-nest-ima-counters.rst | 49 ++
> hw/Makefile.inc | 2 +-
> hw/homer.c | 30 +
> hw/imc.c | 243 +++++++
> include/imc.h | 117 +++
> include/nest_imc.h | 85 +++
> include/opal-api.h | 9 +-
> include/platform.h | 1 +
> include/skiboot.h | 1 +
> libxz/Makefile.inc | 7 +
> libxz/xz.h | 312 ++++++++
> libxz/xz_config.h | 133 ++++
> libxz/xz_crc32.c | 67 ++
> libxz/xz_dec_lzma2.c | 1183 +++++++++++++++++++++++++++++++
> libxz/xz_dec_stream.c | 855 ++++++++++++++++++++++
> libxz/xz_lzma2.h | 212 ++++++
> libxz/xz_private.h | 164 +++++
> libxz/xz_stream.h | 70 ++
> 21 files changed, 3549 insertions(+), 4 deletions(-)
> create mode 100644 doc/opal-api/opal-nest-ima-counters.rst
> create mode 100644 hw/imc.c
> create mode 100644 include/imc.h
> create mode 100644 include/nest_imc.h
> create mode 100644 libxz/Makefile.inc
> create mode 100644 libxz/xz.h
> create mode 100644 libxz/xz_config.h
> create mode 100644 libxz/xz_crc32.c
> create mode 100644 libxz/xz_dec_lzma2.c
> create mode 100644 libxz/xz_dec_stream.c
> create mode 100644 libxz/xz_lzma2.h
> create mode 100644 libxz/xz_private.h
> create mode 100644 libxz/xz_stream.h
>