[Skiboot] [PATCH v5 0/8] skiboot: OPAL support for IMC instrumentation

Hemant Kumar hemant at linux.vnet.ibm.com
Wed Feb 15 23:09:59 AEDT 2017

On 02/14/2017 09:23 AM, Oliver O'Halloran wrote:
> On Wed, 2017-02-08 at 02:41 +0530, Hemant Kumar wrote:
>> Patchset adds support for In Memory Collection instrumentation (IMC)
>> services in OPAL for Power9. The entire IMC infrastructure consists
>> of
>> two kinds of Performance Monitoring Units (PMUs) : nest imc pmus
>> (chip
>> level) and core imc pmus (core level).
>> Nest IMC PMUs are off core but on chip. And these can be accessed via
>> in-band scoms. Programming these counters and accumulating the
>> counter
>> data to memory is done via microcode running in one of the OCC
>> Engines.
>> Core IMC PMUs handle the per-core counters. These are initialized
>> with
>> per-core PDBARs, HTM_MODE and EVENT_MASK scoms.
>> This patchset is to add nest and core IMC instrumentation support in
>> the
>> OPAL side.
>> "IMA_CATALOG" partition in PNOR contains multiple device tree
>> binaries
>> (DTB) in a compressed form with PVR tag. So, when loading the
>> partition, OPAL passes the system PVR as a "subid" to the
>> load_resource
>> API. If a catalog dtb is found for a given pvr, it is decompressed
>> and
>> linked to the main device tree.
>> Commit which adds the "IMA_CATALOG" partition to PNOR is :
>> https://github.com/open-power/pnor/commit/c940142c6dc64dd176096dc648f
>> 433c889919e84
>> The root node of a IMC catalog device tree contains nodes for the IMC
>> PMUs and the common events across the PMUs. Here is an excerpt from
>> the device tree :
>> /dts-v1/;
>> / {
>>          name = "";
>>          compatible = "ibm,opal-in-memory-counters";
>>          #address-cells = <0x1>;
>>          #size-cells = <0x1>;
>>          imc-nest-offset = <0x320000>;
>>          imc-nest-size = <0x30000>;
>>          version-id = "";
> What's version-id for? Versioning is usually handled by changing the
> compatible so I'm not sure why a seperate version property is needed
> here.

This version-id refers to version of the catalog from which this
dts was generated. But, yeah, this shouldn't be empty. Will
work with the catalog team and fix this field.

> The other thing that bothers me about this is the imc-nest-offset
> property. We could just remove it entirely and bake the offset into the
> reg property of the ibm,imc-counters-nest nodes. For example the reg
> property for mcs0 would could be <0x320180 0x0> instead of <0x118 0x0>
> I'm not too fussed about it though. If you think the current usage is
> fine then stick with that.

I think the current usage is fine, since in case of any base offset
change within HOMER, is will be lot easier to handle.

>>          NEST_MCS: nest-mcs-events {
>>                  #address-cells = <0x1>;
>>                  #size-cells = <0x1>;
>>                  event at 0 {
>>                          event-name = "RRTO_QFULL_NO_DISP" ;
>>                          reg = <0x0 0x8>;
>>                          desc = "RRTO not dispatched in MCS0 due to
>> capacity - pulses once for each time a valid RRTO op is not
>> dispatched due to a command list full condition" ;
>>                  };
>>                  event at 8 {
>>                          event-name = "WRTO_QFULL_NO_DISP" ;
>>                          reg = <0x8 0x8>;
>>                          desc = "WRTO not dispatched in MCS0 due to
>> capacity - pulses once for each time a valid WRTO op is not
>> dispatched due to a command list full condition" ;
>>                  };
>> 		[...]
>>          mcs0 {
>>                  compatible = "ibm,imc-counters-nest";
>>                  events-prefix = "PM_MCS0_";
>>                  unit = "";
>>                  scale = "";
> With the new structure I don't know if it makes sense to have unit and
> scale (or prefix?) as a part of these nodes rather in the event catalog
> or event node itself. The current usage is fine, but if we ever have to
> deal with a set of events with different units this structure would be
> a little awkward.

The idea with the unit and the scale fields is that the local
properties will supersede the global unit and scale (global meaning
the PMU nodes' unit and scale). So, individual events may have
unit and scale properties in case, they don't agree with the PMU's
event and scale.
events-prefix has to be part of the PMU node (and not part of
the events node), since, the events node is for the common
events across different PMUs. These common events have
different prefix for different PMUs.

>>                  reg = <0x118 0x8>;
>>                  events = < &NEST_MCS >;
> What exactly does the size in the reg property mean here? It might be
> worth setting the #size-cells to zero and only supplying the offset.

AFAIU the size field is the size of the offset itself. But need to check 

>>   - The last patch in the series is to add "chip-id" to reserved homer
>>     region node in the device tree. This will give us the homer
>> region's
>>     associated chip in the kernel (which will be needed to fetch the
>>     counter values from the required chip).
> IMO move this patch to be earlier in the series. In it's current
> position all the IMC DT nodes are present, but without the chip-ids for
> the HOMER regions it will always fail. It's not a big deal, but it
> might save someone a bit of work when doing a bisect.

Can you please elaborate more on this? The last patch to add
"chip-id" is not used in any of the other patches in this patchset.
This will be used by the kernel IMC patchset to discover a chip's
homer region.

> Overall, looks OK. I would prefer if we had an explicit DT node for
> finding the nest-imc region of each chip rather than digging through
> the reserved memory nodes looking for the HOMER. Unfortunately, I can't
> see any way of making the DT bindings for that work out nicer than what
> you've done.
> Oliver

More information about the Skiboot mailing list