MCTP Sockets related questions

Andrew Jeffery andrew at aj.id.au
Fri Apr 17 14:04:35 AEST 2020


+openbmc at lists.ozlabs.org

On Fri, 17 Apr 2020, at 13:18, Andrew Jeffery wrote:
> Hi Sumanth
> 
> On Fri, 17 Apr 2020, at 01:48, Bhat, Sumanth wrote:
> >  
> > Hi Jeremy, Andrew,
> > 
> >  I have tried to capture our questions and concerns related to MCTP 
> > Sockets in the PMCI WorkGroup page under the topic – MCTP Socket 
> > Interfaces; link here - 
> > https://github.com/openbmc/openbmc/wiki/OpenBMC-PMCI-WG. Hope you can 
> > have a look at it.
> 
> Thanks for getting these written down, they are all great questions. It's
> hard to have a conversation via a wiki, so I'm pasting the questions below:
> 
> > Here are a few questions for the socket-based implementation –
> > 
> > Bus Owner / Bridging / Endpoint roles:
> > The current demux-daemon supports only static EIDs. How do we extend ‘Bus Owner’,
> > ‘Endpoint’ and ‘Bridging Device’ concepts to demux-daemon?
> 
> I think it probably needs to be made clear that the role of the
> mctp-demux-daemon is nothing more than to transform the interface for MCTP
> messages from direct calls to libmctp to use of sockets, as this will make
> migration to the planned kernel interface easier. Applications wanting to talk
> over MCTP should connect to the mctp-demux-daemon socket and send messages this
> way. This includes the application that will handle MCTP control messages
> defined in the base specification.
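> 
> To make that concrete, a client of the mctp-demux-daemon looks roughly like
> the sketch below. The abstract socket name "mctp-mux" and the one-byte
> type-registration write reflect my reading of the current daemon, but treat
> the framing details as assumptions to verify against the libmctp source:
> 
>     #include <stdint.h>
>     #include <string.h>
>     #include <unistd.h>
>     #include <sys/socket.h>
>     #include <sys/un.h>
> 
>     int main(void)
>     {
>             /* Abstract socket: a leading NUL byte, then "mctp-mux" */
>             struct sockaddr_un addr = { .sun_family = AF_UNIX };
>             const char *name = "mctp-mux";
>             uint8_t type = 0x01; /* e.g. PLDM over MCTP */
>             int sd;
> 
>             sd = socket(AF_UNIX, SOCK_SEQPACKET, 0);
>             memcpy(addr.sun_path + 1, name, strlen(name));
>             connect(sd, (struct sockaddr *)&addr,
>                     sizeof(addr.sun_family) + 1 + strlen(name));
> 
>             /* Register interest in a message type with a one-byte write */
>             write(sd, &type, 1);
> 
>             /* From here, each packet written is [dst EID][MCTP message]
>              * and each packet read is [src EID][MCTP message]. Error
>              * handling omitted for brevity. */
>             return 0;
>     }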
> 
> Now, there is the issue that MCTP control commands affect the binding
> associated with an endpoint, and as above the mctp-demux-daemon doesn't handle
> any commands itself. What's missing from mctp-demux-daemon is an out-of-band
> interface to manipulate the binding in response to control messages. Elements
> of this out-of-band interface are being proposed in the
> phosphor-dbus-interfaces patch that is currently under review[1].
> 
> Regarding the planned kernel interface for MCTP, it will come in two parts:
> 
> 1. A common socket-based interface for exchanging messages between endpoints
> 2. A netlink interface to control configuration of MCTP networks and endpoints
>    connected to the system.
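> 
> To give part 1. above some shape, a message exchange might look like the
> sketch below. Everything here is hypothetical: AF_MCTP, struct sockaddr_mctp
> and its field names are my assumptions about an interface that doesn't exist
> yet:
> 
>     #include <stdint.h>
>     #include <sys/socket.h>
> 
>     #define AF_MCTP 45 /* placeholder value, nothing is allocated yet */
> 
>     struct sockaddr_mctp {          /* assumed layout */
>             sa_family_t smctp_family;
>             uint8_t     smctp_addr; /* endpoint ID */
>             uint8_t     smctp_type; /* MCTP message type */
>     };
> 
>     ssize_t mctp_request(uint8_t eid, const uint8_t *req, size_t len,
>                          uint8_t *rsp, size_t rsp_len)
>     {
>             struct sockaddr_mctp addr = {
>                     .smctp_family = AF_MCTP,
>                     .smctp_addr   = eid,
>                     .smctp_type   = 0x01, /* e.g. PLDM */
>             };
>             int sd = socket(AF_MCTP, SOCK_DGRAM, 0);
> 
>             sendto(sd, req, len, 0,
>                    (struct sockaddr *)&addr, sizeof(addr));
>             return recvfrom(sd, rsp, rsp_len, 0, NULL, NULL);
>     }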
> 
> Control messages in the kernel implementation will also be handled in userspace
> (possibly except for binding-defined messages). The daemon handling control
> messages responds by poking the netlink interface to reconfigure the kernel as
> appropriate. Note that we have an alignment between the kernel interface
> proposed here and the need for the out-of-band interfaces on the
> mctp-demux-daemon outlined above (netlink is also out-of-band). As part of the
> eventual transition away from the mctp-demux-daemon to the kernel-based socket
> implementation it's a possibility that we could wrap the netlink interface with
> the D-Bus interface, which should mean minimal changes for applications already
> using the D-Bus interface (though realistically this should just impact the
> daemon handling control messages).
> 
> Returning to your question about the three operational modes in light of the
> above, a few points:
> 
> 1. Endpoint-mode needs to respond to e.g. `Set EID` messages. A `Set EID`
> message would be received by the MCTP control daemon connected to the
> mctp-demux-daemon, and the MCTP control daemon would e.g. call a SetEID()
> method on the mctp-demux-daemon object's D-Bus interface to reconfigure the
> endpoint (a rough sd-bus sketch of such a call follows further below).
> 
> 2. I think Bus-owner mode is mainly a consideration of how the MCTP control
> daemon operates (i.e. sending messages rather than simply responding to them as
> in Endpoint-mode).
> 
> 3. Bridging is handled in two parts: The binding together of endpoints may
> occur in the mctp-demux-daemon if the design is such that the bridge has a
> single EID rather than an EID per binding instance. Alternatively, if an EID
> is provided per-endpoint, multiple mctp-demux-daemons could be run with a
> separate daemon subscribing to each mctp-demux-daemon socket participating in
> the bridge.
> 
> Point 3. requires some rework of the mctp-demux-daemon to provide a
> deterministic abstract-socket naming scheme to enable multiple concurrent
> mctp-demux-daemon instances to exist, and the work is similar to what I've
> recently done for the obmc-console package. The naming-scheme may be defined by
> the system configuration (as is the case with obmc-console).
> 
> [1] 
> https://gerrit.openbmc-project.xyz/c/openbmc/phosphor-dbus-interfaces/+/30139
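> 
> To illustrate the D-Bus call in point 1. above, the control daemon might do
> something like the following. The service name, object path, interface and
> method signature are all placeholders pending the review in [1]:
> 
>     #include <stdint.h>
>     #include <systemd/sd-bus.h>
> 
>     int set_local_eid(uint8_t eid)
>     {
>             sd_bus_error err = SD_BUS_ERROR_NULL;
>             sd_bus_message *reply = NULL;
>             sd_bus *bus = NULL;
>             int rc;
> 
>             rc = sd_bus_default_system(&bus);
>             if (rc < 0)
>                     return rc;
> 
>             /* All names here are assumed, not final */
>             rc = sd_bus_call_method(bus,
>                             "xyz.openbmc_project.MCTP",          /* service */
>                             "/xyz/openbmc_project/mctp",         /* object */
>                             "xyz.openbmc_project.MCTP.Endpoint", /* iface */
>                             "SetEID", &err, &reply, "y", eid);
> 
>             sd_bus_message_unref(reply);
>             sd_bus_error_free(&err);
>             sd_bus_unref(bus);
>             return rc;
>     }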
> 
> > Also, when the sockets
> > move to Kernel, what would be the way for a user to configure a certain physical
> > binding in “Bus Owner” mode and another binding in “Endpoint” mode?
> 
> As mentioned above, this isn't a property of the endpoint so much as how the
> endpoint is used by applications (such as the MCTP control daemon). The kernel
> will be agnostic to how an endpoint is used beyond configuring bindings and
> endpoints as directed by userspace via the netlink interface.
> 
> > We will have cases
> > where BMC will be Bus Owner on a certain bus and Endpoint on another bus simultaneously.
> 
> That's fine, it's a use-case I anticipated (and again it partly comes back to
> how endpoints are used by applications rather than a property of the endpoint
> itself).
> 
> > 
> > Multiple MCTP Daemon instantiations:
> > 
> > The rate of transmission and reception of MCTP messages will be limited by the
> > underlying physical binding. Having one instance of MCTP transport
> > interface per physical port would speed up the TX and RX. How can
> > this be achieved in the demux daemon?
> 
> This is enabled by deterministic naming of abstract sockets that I talked about
> above.
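> 
> As a hypothetical example of such a scheme (not something the daemon does
> today), each instance could derive its abstract socket name from the binding
> configuration:
> 
>     #include <stdio.h>
>     #include <string.h>
>     #include <sys/un.h>
> 
>     /* Hypothetical: per-instance names like "mctp-mux.smbus0" or
>      * "mctp-mux.pcie0", taken from the system configuration */
>     static void mux_socket_name(struct sockaddr_un *addr, const char *instance)
>     {
>             char name[sizeof(addr->sun_path) - 1];
> 
>             snprintf(name, sizeof(name), "mctp-mux.%s", instance);
>             memset(addr->sun_path, 0, sizeof(addr->sun_path));
>             memcpy(addr->sun_path + 1, name, strlen(name));
>     }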
> 
> > And how would this be addressed
> > in kernel-based sockets?
> 
> Messages are sent in process context, so concurrency is as wide as the number
> of threads available, assuming you're sending across different interfaces.
> 
> > How can a user specify the physical bindings
> > he/she is going to need and instances for the same?
> 
> Via the netlink interface.
> 
> > 
> > Support for upper layer protocols:
> > In Intel’s use cases, most of the upper-layer protocols like PLDM/Intel Vendor
> > Defined Messaging Type/SPDM etc. are going to be “Requesters”, i.e. they are going
> > to send out request packets to a connected device on the platform (ex: Add-In Cards)
> > using MCTP. These protocols will not have prior knowledge about
> > the EIDs and thus need a way to query the existing EIDs and the
> > message types supported by the EIDs from the MCTP layer in order
> > to start their communication. The D-Bus proposal handles this by
> > creating D-Bus objects for the EIDs. How would we achieve the
> > same in the demux daemon?
> 
> This is resolved by e.g. the mctp-demux-daemon implementing the D-Bus
> interfaces you're proposing (aside from the Rx/Tx methods) as the out-of-band
> information/event mechanism.
> 
> > How would kernel-based sockets handle this?
> 
> The netlink interface allows userspace to query the topology of the network,
> which will be set up by the MCTP control daemon.
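> 
> A topology query might then be a netlink dump in the style of RTM_GETROUTE,
> along the lines of the sketch below. None of the MCTP-specific constants
> exist; AF_MCTP and the reuse of NETLINK_ROUTE are assumptions:
> 
>     #include <sys/socket.h>
>     #include <linux/netlink.h>
>     #include <linux/rtnetlink.h>
> 
>     #define AF_MCTP 45 /* placeholder, as in the earlier sketch */
> 
>     int dump_mctp_network(void)
>     {
>             struct {
>                     struct nlmsghdr nh;
>                     struct rtgenmsg g;
>             } req = {
>                     .nh = {
>                             .nlmsg_len = NLMSG_LENGTH(sizeof(struct rtgenmsg)),
>                             /* RTM_GETROUTE stands in for an MCTP dump type */
>                             .nlmsg_type = RTM_GETROUTE,
>                             .nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP,
>                     },
>                     .g = { .rtgen_family = AF_MCTP },
>             };
>             int nl = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
> 
>             send(nl, &req, req.nh.nlmsg_len, 0);
>             /* ...read and parse the NLMSG_DONE-terminated dump to
>              * enumerate (EID, bus, physical address) entries... */
>             return nl;
>     }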
> 
> As to what message types are supported by endpoints, I didn't have any plan to
> cache this information on the BMC. I figured the application wanting to talk to
> the endpoint would query the endpoint for this information directly. These
> queries operate at multiple levels, e.g:
> 
> * What MCTP control commands are supported?
> * What PLDM constructs are supported?
> * What SPDM concepts are supported?
> 
> None of this information belongs in the kernel. Whether userspace should expose
> it in some generic fashion is up for debate but as mentioned I feel the answer
> is probably not, just leave it to specific applications.
> 
> > 
> > Discovery of MCTP capable devices:
> > 
> > We would need to modify the demux-daemon to cater for the discovery mechanisms.
> > When BMC acts as a Bus Owner, it would have to go ahead and discover other
> > endpoints on the bus, and this discovery mechanism varies according to the bus
> > and the role: for example, how a PCIe device discovers other endpoints is totally
> > different from how an SMBus device would.
> 
> But this is binding-implementation specific. The logic should live in the
> binding, no? Device hotplug notifications are binding-specific but there is
> the Discovery Notify message that bindings can propagate up the stack to notify
> e.g. the MCTP control daemon that a device has appeared, and this is generic.
> 
> > Similarly, how BMC as PCIe bus owner
> > would discover other endpoints (Endpoint Discovery control commands) is different
> > from how BMC as PCIe endpoint would discover other endpoints (Get Routing Table update).
> > And discovered endpoints need to have a representation (ex: D-Bus objects) so that upper
> > layer protocols can discover them. How would this be handled in the demux-daemon/kernel approach?
> 
> Userspace interacting with the endpoint at an MCTP-control level will know
> which mode it's operating in, and so will know what method it needs to use to
> construct the routing table (as a means to know the other endpoints in the
> network).
> 
> It sounds like what you're after is an abstraction that presents the network to
> applications that do not care about the mode in which the endpoints are operating?
> If so, this is something I brought up on the phosphor-dbus-interfaces patch: We
> should come up with an abstract representation of the network for applications
> to consume.
> 
> > 
> > Control commands:
> > 
> > Most control commands couple tightly with the binding and the MCTP layer itself; for example,
> > when Set EID is used by the BMC to allocate EID to another device, it needs to use
> > Special EID 0 + physical address of the device.
> 
> Okay, so I had a bit of a bag of tricks planned here that means we don't need to
> embed physical addresses into e.g. `Set EID` packets. There's no allowance for
> this in the spec anyway.
> 
> The main insight for e.g. `Set EID` is that commands like this are only sent by
> bus-owners who must be controlling their own route table. The MCTP route table
> is effectively the combination of the ARP table and route table concepts from
> IP networks, and so the intent with the kernel-based MCTP implementation is to
> expose the route table to userspace just like the ARP table, including the
> ability to inject entries into the table (like the ARP table).
> 
> From there we maintain state for each entry that describes whether or not the
> EID has been assigned by userspace, akin to the `discovered` flag that we
> maintain for the endpoint itself: This is set when the endpoint ID has been
> successfully assigned (i.e. we see a `SUCCESS` completion code for a `Set
> EID` message).
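> 
> Sketched as a data structure, purely to pin down the shape of an entry (all
> field names assumed):
> 
>     #include <stdbool.h>
>     #include <stdint.h>
> 
>     #define MCTP_MAX_PHYS_ADDR 8 /* assumed upper bound */
> 
>     /* One route-table entry: EID -> (bus, physical address), plus the
>      * `discovered` state managed through the netlink interface */
>     struct mctp_route_entry {
>             uint8_t      eid;
>             unsigned int bus; /* binding instance */
>             uint8_t      phys_addr[MCTP_MAX_PHYS_ADDR];
>             uint8_t      phys_addr_len;
>             bool         discovered; /* set on SUCCESS for `Set EID` */
>     };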
> 
> Further, EIDs must be unique in the network, so the route table must not
> contain the same EID assigned to multiple devices. This means that the EID is
> unambiguous in identifying the device.
> 
> The trick is that an EID unambiguously identifies its device regardless of
> the state of the `discovered` flag associated with the entry in the route
> table. So the plan is that in order to send a `Set EID` to a discovered
> endpoint, we take the following steps:
> 
> For static networks:
> 
> 1. The MCTP control daemon injects an entry into the route table, setting the
> *proposed* EID, the bus and the physical address. The `discovered` bit
> associated with this entry remains clear.
> 
> 2. The MCTP control daemon constructs a `Set EID` message with an MCTP header
> containing the destination EID set to the *proposed* EID (setting the
> destination EID to the *proposed* EID is purely for routing purposes; the
> message does not go onto the wire in this state).
> 
> 3. The MCTP control daemon sends the `Set EID` message via the socket interface.
> 
> 4. The kernel receives the message and parses the MCTP header to resolve the
> route.
> 
> 5. The kernel discovers from the routing table that the `discovered` flag is
> _not_ set for the destination EID and introspects the packet for the `Set EID`
> MCTP command.
> 
> 6. The kernel _modifies_ the packet, replacing destination EID with Special EID
> 0 for the `Set EID` packet.
> 
> 7. The kernel passes the modified message onto the binding implementation
> (resolved via the route table) for transmission to the target endpoint.
> 
> 8. The target endpoint responds to the `Set EID` message.
> 
> 9. The kernel passes the response back to the userspace process associated with
> the sending socket.
> 
> 10. The MCTP control daemon receives the response to the `Set EID` command. If
> the command is successful the MCTP control daemon sets the `discovered` flag in
> the route table and no further EID replacement is performed for packets routed
> to that device. If the command failed then the `discovered` flag remains clear.
> Further, the response may indicate the device has already received a
> _different_ endpoint ID from a previous `Set EID` command, in which case the
> route table is updated with the returned EID and the `discovered` flag is set.
> 
> For dynamic networks the process is largely the same, though the route table is
> updated to contain the device bus address when we receive the binding-specific
> `Discovery Notify` signal. This signal is translated to a Discovery Notify
> message to trigger userspace probing of the bus for new devices and to perform
> address assignment. Userspace can inspect the route table for devices with the
> `discovered` flag cleared to determine what devices need address assignment.
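> 
> From the control daemon's side, steps 1. through 3. reduce to something like
> the sketch below. mctp_route_add() is a placeholder for the yet-to-be-defined
> netlink operation, the sockaddr_mctp layout is the assumption from the
> earlier sketch, and the request bytes follow DSP0236's Set Endpoint ID
> request (the framing over the socket is also assumed):
> 
>     #include <stddef.h>
>     #include <stdint.h>
>     #include <sys/socket.h>
> 
>     /* Placeholder for the yet-to-be-defined netlink route injection */
>     int mctp_route_add(int nl, uint8_t eid, unsigned int bus,
>                        const uint8_t *phys_addr, size_t phys_addr_len);
> 
>     int assign_eid(int nl, int sd, unsigned int bus,
>                    const uint8_t *phys_addr, size_t phys_addr_len,
>                    uint8_t proposed_eid)
>     {
>             /* 1. Inject the route with the `discovered` flag left clear */
>             mctp_route_add(nl, proposed_eid, bus, phys_addr, phys_addr_len);
> 
>             /* 2. Build a Set Endpoint ID request (MCTP control message) */
>             uint8_t req[] = {
>                     0x00,         /* message type: MCTP control */
>                     0x80,         /* Rq=1, D=0, instance ID 0 */
>                     0x01,         /* command code: Set Endpoint ID */
>                     0x00,         /* operation: Set EID */
>                     proposed_eid, /* EID to assign */
>             };
> 
>             /* 3. Address it to the *proposed* EID; the kernel, seeing
>              * `discovered` clear for this route, rewrites the
>              * destination to the special EID 0 before transmission
>              * (steps 4. through 7.) */
>             struct sockaddr_mctp addr = {
>                     .smctp_family = AF_MCTP,
>                     .smctp_addr   = proposed_eid,
>             };
>             return sendto(sd, req, sizeof(req), 0,
>                           (struct sockaddr *)&addr, sizeof(addr));
>     }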
> 
> > Get EID command needs to return
> > binding-specific information as a part of its response.
> 
> Rather, `Get EID` returns the EID for the device at a particular physical
> address. This is subject to the same sequence outlined above.
> 
> > Get UUID command needs to
> > return same UUID across all physical bindings.
> 
> This is tied to how bridging will be implemented. Again, bridging is handled by
> commands through the netlink interface in the case of the proposed kernel
> implementation, and we just need to associate the one UUID with each of the
> endpoints participating in the bridge.
> 
> > And so on. Thus how would control
> > commands be handled in the demux daemon? What would it look like when
> > kernel-based sockets are introduced?
> 
> We may need to translate some of these concepts to designs that we could
> implement on the mctp-demux-daemon, but otherwise I think your two questions
> here are largely answered by the descriptions above.
> 
> Hope that helps!
> 
> Andrew

