Initial MCTP design proposal

Supreeth Venkatesh supreeth.venkatesh at arm.com
Thu Dec 13 09:50:49 AEDT 2018


On Tue, 2018-12-11 at 13:08 +0530, Deepak Kodihalli wrote:
> On 10/12/18 11:10 PM, Supreeth Venkatesh wrote:
> > On Mon, 2018-12-10 at 11:44 +0530, Deepak Kodihalli wrote:
> > > On 07/12/18 10:39 PM, Supreeth Venkatesh wrote:
> > > > On Fri, 2018-12-07 at 10:43 +0530, Deepak Kodihalli wrote:
> > > > > On 07/12/18 8:11 AM, Jeremy Kerr wrote:
> > > > > > Hi OpenBMCers!
> > > > > > 
> > > > > > In an earlier thread, I promised to sketch out a design for
> > > > > > a
> > > > > > MCTP
> > > > > > implementation in OpenBMC, and I've included it below.
> > > > > 
> > > > > 
> > > > > Thanks Jeremy for sending this out. This looks good (have
> > > > > just
> > > > > one
> > > > > comment below).
> > > > > 
> > > > > Question for everyone : do you have plans to employ PLDM over
> > > > > MCTP?
> > > > 
> > > > Yes, Deepak, we do eventually.
> > > 
> > > 
> > > Thanks for letting me know Supreeth!
> > 
> > My pleasure.
> > 
> > > 
> > > > > 
> > > > > We are interested in PLDM for various "inside the box"
> > > > > communications
> > > > > (at the moment for the Host <-> BMC communication). I'd like
> > > > > to
> > > > > propose
> > > > > a design for a PLDM stack on OpenBMC, and would send a design
> > > > > template
> > > > > for review on the mailing list in some amount of time (I've
> > > > > just
> > > > > started
> > > > > with some initial sketches). I'd like to also know if others
> > > > > have
> > > > > embarked on a similar activity, so that we can collaborate
> > > > > earlier
> > > > > and
> > > > > avoid duplicate work.
> > > > 
> > > > Yes, we're interested in collaborating.
> > > > Which portion of PLDM are you working on, other than base?
> > > > Platform Monitoring and Control?
> > > > Firmware Update?
> > > > BIOS Control and Configuration?
> > > > SMBIOS Transfer?
> > > > FRU Data?
> > > > Redfish Device Enablement?
> > > > 
> > > > We are currently interested in Platform Monitoring and Control.
> > > 
> > > 
> > > We're interested in each of these profiles for the BMC host
> > > communications. Are you interested in Platform monitoring and
> > > control
> > > for communications involving the BMC and the host firmware, or
> > > the
> > > BMC
> > > and other devices?
> > 
> > BMC and the host firmware initially.
> > 
> > > 
> > > Also, I have been thinking about the usefulness/feasibility of a
> > > common
> > > PLDM library (just the protocol piece - encoding and decoding
> > > PLDM
> > > messages), so as to be able to share code between BMC and host
> > > firmware.
> > > This of course sets expectations on the library based on OpenBMC
> > > and
> > > various host firmware stacks. Do you have an opinion on this?
> > 
> > Glad that we are on the same page.
> > My thinking at this point is to come up with a generic standalone
> > "C"
> > library which processes PLDM messages without regard to whether
> > this
> > message contains payload for Sensors, firmware update, etc., so
> > that it
> > can be ported to Host firmware if needed.
> 
> I was thinking of a C lib as well (given the lack of or limited C++ 
> stdlib support on some host firmware stacks). Although, when you say
> a 
> lib that processes the PLDM messages, do you mean just the parsing
> part?
> 
The example you gave below aptly sums up what I had in mind.
 
> I think the processing/handling of a PLDM message would be platform 
> specific, because that involves mapping PLDM concepts to platform 
> concepts (for eg to D-Bus on OpenBMC). What I believe can get to the 
> common lib is the marshalling and unmarshalling of PLDM messages. So
> for 
> eg if the platform has all the necessary information to make a PLDM 
> message, it can rely on this lib to actually prepare the message for
> it. 
> Plus the reverse flow - decode an incoming PLDM message into C-style 
> data types. We'd have to work on what these APIs look like. Consumers
> of 
> this lib would be the PLDM app(s)/daemon(s).
Yes, exactly.
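
To make that concrete, here is a rough sketch of the sort of
encode/decode API I have in mind for the common C library. All of the
names and the exact layout are placeholders for discussion, not a
proposal for the final API (the real PLDM header also packs
request/datagram bits and a header version per DSP0240, omitted here
for brevity):

    #include <stddef.h>
    #include <stdint.h>

    /* Simplified view of the common PLDM message header. */
    struct pldm_msg_hdr {
        uint8_t instance_id; /* correlates a response to its request */
        uint8_t pldm_type;   /* base, platform monitoring, FRU, ... */
        uint8_t command;
    };

    struct pldm_msg {
        struct pldm_msg_hdr hdr;
        uint8_t payload[];   /* command-specific payload */
    };

    /* Encode a (hypothetical) GetSensorReading request into a buffer
     * owned by the caller - BMC daemon or host firmware. */
    int encode_get_sensor_reading_req(uint8_t instance_id,
                                      uint16_t sensor_id,
                                      struct pldm_msg *msg);

    /* Decode the matching response into plain C types; the library
     * has no knowledge of D-Bus or any other platform concepts. */
    int decode_get_sensor_reading_resp(const struct pldm_msg *msg,
                                       size_t payload_len,
                                       uint8_t *completion_code,
                                       uint32_t *reading);

The consumers (the PLDM daemons on the BMC, or the host firmware
stack) would sit on top of such functions and do the platform-specific
mapping.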

> 
> > > 
> > > > > 
> > > > > > This is roughly in the OpenBMC design document format
> > > > > > (thanks
> > > > > > for
> > > > > > the
> > > > > > reminder Andrew), but I've sent it to the list for initial
> > > > > > review
> > > > > > before
> > > > > > proposing to gerrit - mainly because there were a lot of
> > > > > > folks
> > > > > > who
> > > > > > expressed interest on the list. I suggest we move to gerrit
> > > > > > once we
> > > > > > get
> > > > > > specific feedback coming in. Let me know if you have
> > > > > > general
> > > > > > comments
> > > > > > whenever you like though.
> > > > > > 
> > > > > > In parallel, I've been developing a prototype for the MCTP
> > > > > > library
> > > > > > mentioned below, including a serial transport binding. I'll
> > > > > > push to
> > > > > > github soon and post a link, once I have it in a
> > > > > > slightly-more-consumable form.
> > > > > > 
> > > > > > Cheers,
> > > > > > 
> > > > > > 
> > > > > > Jeremy
> > > > > > 
> > > > > > --------------------------------------------------------
> > > > > > 
> > > > > > # Host/BMC communication channel: MCTP & PLDM
> > > > > > 
> > > > > > Author: Jeremy Kerr <jk at ozlabs.org>
> > > > > > 
> > > > > > ## Problem Description
> > > > > > 
> > > > > > Currently, we have a few different methods of communication
> > > > > > between
> > > > > > host
> > > > > > and BMC. This is primarily IPMI-based, but also includes a
> > > > > > few
> > > > > > hardware-specific side-channels, like hiomap. On OpenPOWER
> > > > > > hardware
> > > > > > at
> > > > > > least, we've definitely started to hit some of the
> > > > > > limitations
> > > > > > of
> > > > > > IPMI
> > > > > > (for example, we have need for >255 sensors), as well as
> > > > > > the
> > > > > > hardware
> > > > > > channels that IPMI typically uses.
> > > > > > 
> > > > > > This design aims to use the Management Component Transport
> > > > > > Protocol
> > > > > > (MCTP) to provide a common transport layer over the
> > > > > > multiple
> > > > > > channels
> > > > > > that OpenBMC platforms provide. Then, on top of MCTP, we
> > > > > > have
> > > > > > the
> > > > > > opportunity to move to newer host/BMC messaging protocols
> > > > > > to
> > > > > > overcome
> > > > > > some of the limitations we've encountered with IPMI.
> > > > > > 
> > > > > > ## Background and References
> > > > > > 
> > > > > > Separating the "transport" and "messaging protocol" parts
> > > > > > of
> > > > > > the
> > > > > > current
> > > > > > stack allows us to design these parts separately.
> > > > > > Currently,
> > > > > > IPMI
> > > > > > defines both of these; we currently have BT and KCS (both
> > > > > > defined
> > > > > > as
> > > > > > part of the IPMI 2.0 standard) as the transports, and IPMI
> > > > > > itself
> > > > > > as the
> > > > > > messaging protocol.
> > > > > > 
> > > > > > Some efforts of improving the hardware transport mechanism
> > > > > > of
> > > > > > IPMI
> > > > > > have
> > > > > > been attempted, but not in a cross-implementation manner so
> > > > > > far.
> > > > > > This
> > > > > > does not address some of the limitations of the IPMI data
> > > > > > model.
> > > > > > 
> > > > > > MCTP defines a standard transport protocol, plus a number
> > > > > > of
> > > > > > separate
> > > > > > hardware bindings for the actual transport of MCTP packets.
> > > > > > These
> > > > > > are
> > > > > > defined by the DMTF's Platform Management Working group;
> > > > > > standards
> > > > > > are
> > > > > > available at:
> > > > > > 
> > > > > >      https://www.dmtf.org/standards/pmci
> > > > > > 
> > > > > > I have included a small diagram of how these standards may
> > > > > > fit
> > > > > > together
> > > > > > in an OpenBMC system. The DSP numbers there are references
> > > > > > to
> > > > > > DMTF
> > > > > > standards.
> > > > > > 
> > > > > > One of the key concepts here is the separation of
> > > > > > transport
> > > > > > protocol
> > > > > > from the hardware bindings; this means that an MCTP "stack"
> > > > > > may
> > > > > > be
> > > > > > using
> > > > > > either an I2C, PCI, serial or custom hardware channel,
> > > > > > without
> > > > > > the
> > > > > > higher
> > > > > > layers of that stack needing to be aware of the hardware
> > > > > > implementation.
> > > > > > These higher levels only need to be aware that they are
> > > > > > communicating
> > > > > > with a certain entity, defined by an Entity ID (MCTP EID).
> > > > > > 
> > > > > > I've mainly focussed on the "transport" part of the design
> > > > > > here.
> > > > > > While
> > > > > > this does enable new messaging protocols (mainly PLDM), I
> > > > > > haven't
> > > > > > covered that much; we will propose those details for a
> > > > > > separate
> > > > > > design
> > > > > > effort.
> > > > > > 
> > > > > > As part of the design, I have referred to MCTP "messages"
> > > > > > and
> > > > > > "packets";
> > > > > > this is intentional, to match the definitions in the MCTP
> > > > > > standard.
> > > > > > MCTP
> > > > > > messages are the higher-level data transferred between MCTP
> > > > > > endpoints,
> > > > > > while packets are typically smaller, and are what is sent
> > > > > > over
> > > > > > the
> > > > > > hardware. Messages that are larger than the hardware MTU
> > > > > > are
> > > > > > split
> > > > > > into
> > > > > > individual packets by the transmit implementation, and
> > > > > > reassembled
> > > > > > at
> > > > > > the receive implementation.
> > > > > > 
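[Interjecting with an illustration: the per-packet transport header
that makes this fragmentation/reassembly possible looks roughly like
the sketch below. The field layout is paraphrased from DSP0236 and the
names are mine - treat it as illustrative, not normative.]

    #include <stdint.h>

    struct mctp_hdr {
        uint8_t ver;            /* header version (low nibble)      */
        uint8_t dest;           /* destination endpoint ID (EID)    */
        uint8_t src;            /* source endpoint ID (EID)         */
        uint8_t flags_seq_tag;  /* SOM | EOM | pkt seq | TO | tag   */
    };

    #define MCTP_HDR_FLAG_SOM   (1 << 7)  /* start of message       */
    #define MCTP_HDR_FLAG_EOM   (1 << 6)  /* end of message         */
    #define MCTP_HDR_SEQ_SHIFT  4         /* 2-bit packet sequence  */

    /* Transmit: split a message into MTU-sized packets, set SOM on
     * the first and EOM on the last, incrementing the sequence field.
     * Receive: append payloads in sequence order until EOM is seen,
     * then hand the reassembled message to the upper layer. */
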
> > > > > > A final important point is that this design is for the
> > > > > > host <--> BMC
> > > > > > channel *only*. Even if we do replace IPMI for the host
> > > > > > interface,
> > > > > > we
> > > > > > will certainly need an IPMI interface available for
> > > > > > external
> > > > > > system
> > > > > > management.
> > > > > > 
> > > > > > ## Requirements
> > > > > > 
> > > > > > Any channel between host and BMC should:
> > > > > > 
> > > > > >     - Have a simple serialisation and deserialisation
> > > > > > format, to
> > > > > > enable
> > > > > >       implementations in host firmware, which have widely
> > > > > > varying
> > > > > > runtime
> > > > > >       capabilities
> > > > > > 
> > > > > >     - Allow different hardware channels, as we have a wide
> > > > > > variety of
> > > > > >       target platforms for OpenBMC
> > > > > > 
> > > > > >     - Be usable over simple hardware implementations, but
> > > > > > have a
> > > > > > facility
> > > > > >       for higher bandwidth messaging on platforms that
> > > > > > require
> > > > > > it.
> > > > > > 
> > > > > >     - Ideally, integrate with newer messaging protocols
> > > > > > 
> > > > > > ## Proposed Design
> > > > > > 
> > > > > > The MCTP core specification just provides the
> > > > > > packetisation,
> > > > > > routing and
> > > > > > addressing mechanisms. The actual transmit/receive of those
> > > > > > packets
> > > > > > is
> > > > > > up to the hardware binding of the MCTP transport.
> > > > > > 
> > > > > > For OpenBMC, we would introduce an MCTP daemon, which
> > > > > > implements
> > > > > > the
> > > > > > transport over a configurable hardware channel (eg., Serial
> > > > > > UART,
> > > > > > I2C or
> > > > > > PCI). This daemon is responsible for the packetisation and
> > > > > > routing
> > > > > > of
> > > > > > MCTP messages to and from host firmware.
> > > > > > 
> > > > > > I see two options for the "inbound" or "application"
> > > > > > interface
> > > > > > of
> > > > > > the
> > > > > > MCTP daemon:
> > > > > > 
> > > > > >     - it could handle upper parts of the stack (eg PLDM)
> > > > > > directly,
> > > > > > through
> > > > > >       in-process handlers that register for certain MCTP
> > > > > > message
> > > > > > types; or
> > > > > 
> > > > > We'd like to somehow ensure (at least via documentation) that
> > > > > the
> > > > > handlers don't block the MCTP daemon from processing incoming
> > > > > traffic.
> > > > > The handlers might anyway end up making IPC calls (via D-Bus) 
> > > > > to
> > > > > other
> > > > > processes. The second approach below seems to alleviate this
> > > > > problem.
> > > > > 
> > > > > >     - it could channel raw MCTP messages (reassembled from
> > > > > > MCTP
> > > > > > packets) to
> > > > > >       DBUS messages (similar to the current IPMI host
> > > > > > daemons),
> > > > > > where
> > > > > > the
> > > > > >       upper layers receive and act on those DBUS events.
> > > > > > 
> > > > > > I have a preference for the former, but I would be
> > > > > > interested
> > > > > > to
> > > > > > hear
> > > > > > from the IPMI folks about how the latter structure has
> > > > > > worked
> > > > > > in
> > > > > > the
> > > > > > past.
> > > > > > 
> > > > > > The proposed implementation here is to produce an MCTP
> > > > > > "library"
> > > > > > which
> > > > > > provides the packetisation and routing functions, between:
> > > > > > 
> > > > > >     - an "upper" messaging transmit/receive interface, for
> > > > > > tx/rx
> > > > > > of a
> > > > > > full
> > > > > >       message to a specific endpoint
> > > > > > 
> > > > > >     - a "lower" hardware binding for transmit/receive of
> > > > > > individual
> > > > > >       packets, providing a method for the core to tx/rx
> > > > > > each
> > > > > > packet
> > > > > > to
> > > > > >       hardware
> > > > > > 
> > > > > > The lower interface would be plugged in to one of a number
> > > > > > of
> > > > > > hardware-specific binding implementations (most of which
> > > > > > would
> > > > > > be
> > > > > > included in the library source tree, but others can be
> > > > > > plugged-
> > > > > > in
> > > > > > too)
> > > > > > 
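[A rough sketch of what that upper/lower split could look like as a C
API - purely illustrative, every name below is made up for the purpose
of discussion:]

    #include <stddef.h>
    #include <stdint.h>

    struct mctp;          /* opaque core: packetisation + routing    */
    struct mctp_binding;  /* one per hardware channel (serial, LPC..) */

    /* "Upper" interface: whole messages to/from an endpoint (EID).  */
    typedef void (*mctp_rx_fn)(uint8_t src_eid, void *ctx,
                               const void *msg, size_t len);

    struct mctp *mctp_init(void);
    int mctp_set_rx_handler(struct mctp *m, mctp_rx_fn fn, void *ctx);
    int mctp_message_tx(struct mctp *m, uint8_t dest_eid,
                        const void *msg, size_t len);

    /* "Lower" interface: a binding supplies a per-packet tx function
     * and its MTU; the core performs packetisation on tx and
     * reassembly on rx before calling the registered handler.       */
    struct mctp_binding {
        const char *name;
        size_t pkt_mtu;
        int (*tx)(struct mctp_binding *b, const void *pkt, size_t len);
    };

    int mctp_register_binding(struct mctp *m, struct mctp_binding *b,
                              uint8_t local_eid);
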
> > > > > > The reason for a library is to allow the same MCTP
> > > > > > implementation
> > > > > > to be
> > > > > > used in both OpenBMC and host firmware; the library should
> > > > > > be
> > > > > > bidirectional. To allow this, the library would be written
> > > > > > in
> > > > > > portable C
> > > > > > (structured in a way that can be compiled as "extern C" in
> > > > > > C++
> > > > > > codebases), and be able to be configured to suit those
> > > > > > runtime
> > > > > > environments (for example, POSIX IO may not be available on
> > > > > > all
> > > > > > platforms; we should be able to compile the library to
> > > > > > suit).
> > > > > > The
> > > > > > licence for the library should also allow this re-use; I'd
> > > > > > suggest
> > > > > > a
> > > > > > dual Apache & GPL licence.
> > > > > > 
> > > > > > As for the hardware bindings, we would want to implement a
> > > > > > serial
> > > > > > transport binding first, to allow easy prototyping in
> > > > > > simulation.
> > > > > > For
> > > > > > OpenPOWER, we'd want to implement a "raw LPC" binding for
> > > > > > better
> > > > > > performance, and later PCIe for large transfers. I imagine
> > > > > > that
> > > > > > there is
> > > > > > a need for an I2C binding implementation for other hardware
> > > > > > platforms
> > > > > > too.
> > > > > > 
> > > > > > Lastly, I don't want to exclude any currently-used
> > > > > > interfaces
> > > > > > by
> > > > > > implementing MCTP - this should be an optional component of
> > > > > > OpenBMC, and
> > > > > > not require platforms to implement it.
> > > > > > 
> > > > > > ## Alternatives Considered
> > > > > > 
> > > > > > There have been two main alternatives to this approach:
> > > > > > 
> > > > > > Continue using IPMI, but start making more use of OEM
> > > > > > extensions to
> > > > > > suit the requirements of new platforms. However, given that
> > > > > > the
> > > > > > IPMI
> > > > > > standard is no longer under active development, we would
> > > > > > likely
> > > > > > end
> > > > > > up
> > > > > > with a large amount of platform-specific customisations.
> > > > > > This
> > > > > > also
> > > > > > does
> > > > > > not solve the hardware channel issues in a standard manner.
> > > > > > 
> > > > > > Redfish between host and BMC. This would mean that host
> > > > > > firmware
> > > > > > needs a
> > > > > > HTTP client, a TCP/IP stack, a JSON (de)serialiser, and
> > > > > > support
> > > > > > for
> > > > > > Redfish schema. This is not feasible for all host firmware
> > > > > > implementations; certainly not for OpenPOWER. It's possible
> > > > > > that we
> > > > > > could run a simplified Redfish stack - indeed, MCTP has a
> > > > > > proposal
> > > > > > for a
> > > > > > Redfish-over-MCTP protocol, which uses simplified
> > > > > > serialisation
> > > > > > and
> > > > > > no
> > > > > > requirement on HTTP. However, this still introduces a large
> > > > > > amount
> > > > > > of
> > > > > > complexity in host firmware.
> > > > > > 
> > > > > > ## Impacts
> > > > > > 
> > > > > > Development would be required to implement the MCTP
> > > > > > transport,
> > > > > > plus
> > > > > > any
> > > > > > new users of the MCTP messaging (eg, a PLDM
> > > > > > implementation).
> > > > > > These
> > > > > > would
> > > > > > somewhat duplicate the work we have in IPMI handlers.
> > > > > > 
> > > > > > We'd want to keep IPMI running in parallel, so the
> > > > > > "upgrade"
> > > > > > path
> > > > > > should
> > > > > > be fairly straightforward.
> > > > > > 
> > > > > > Design and development needs to involve potential host
> > > > > > firmware
> > > > > > implementations.
> > > > > > 
> > > > > > ## Testing
> > > > > > 
> > > > > > For the core MCTP library, we are able to run tests there
> > > > > > in
> > > > > > complete
> > > > > > isolation (I have already been able to run a prototype MCTP
> > > > > > stack
> > > > > > through the afl fuzzer) to ensure that the core transport
> > > > > > protocol
> > > > > > works.
> > > > > > 
> > > > > > For MCTP hardware bindings, we would develop channel-
> > > > > > specific
> > > > > > tests
> > > > > > that
> > > > > > would be run in CI on both host and BMC.
> > > > > > 
> > > > > > For the OpenBMC MCTP daemon implementation, testing models
> > > > > > would
> > > > > > depend
> > > > > > on the structure we adopt in the design section.
> > > > > > 
> > > > > 
> > > > > Regards,
> > > > > Deepak
> > > > > 
> > > 
> > > 
> 
> 


