MCTP Sockets related questions

Bhat, Sumanth sumanth.bhat at intel.com
Wed Apr 22 18:33:41 AEST 2020


Hi Andrew,
    Thanks a lot for taking the time to answer our questions on MCTP sockets in detail. Do you have plans to push a kernel-based MCTP implementation proposal to Gerrit?

Thanks,
Sumanth

-----Original Message-----
From: Andrew Jeffery <andrew at aj.id.au> 
Sent: Friday, April 17, 2020 9:35 AM
To: Bhat, Sumanth <sumanth.bhat at intel.com>; Jeremy Kerr <jk at ozlabs.org>; openbmc at lists.ozlabs.org
Cc: Thomaiyar, Richard Marian <richard.marian.thomaiyar at intel.com>; Winiarska, Iwona <iwona.winiarska at intel.com>
Subject: Re: MCTP Sockets related questions

+openbmc at lists.ozlabs.org

On Fri, 17 Apr 2020, at 13:18, Andrew Jeffery wrote:
> Hi Sumanth
> 
> On Fri, 17 Apr 2020, at 01:48, Bhat, Sumanth wrote:
> >  
> > Hi Jeremy, Andrew,
> > 
> >  I have tried to capture our questions and concerns related to MCTP 
> > Sockets in the PMCI WorkGroup page under the topic – MCTP Socket 
> > Interfaces; link here - 
> > https://github.com/openbmc/openbmc/wiki/OpenBMC-PMCI-WG. Hope you 
> > can have a look at it.
> 
> Thanks for getting these written down, they are all great questions. 
> It's hard to have a conversation via a wiki, so I'm pasting the questions below:
> 
> > Here are few questions for socket based implementation –
> > 
> > Bus Owner / Bridging / Endpoint roles:
> > The current demux-daemon supports only static EIDs. How do we extend 
> > ‘Bus Owner’, ‘Endpoint’ and ‘Bridging Device’ concepts to the demux-daemon?
> 
> I think it probably needs to be made clear that the role of the 
> mctp-demux-daemon is nothing more than to transform the interface for 
> MCTP messages from direct calls to libmctp to use of sockets, as this 
> will make migration to the planned kernel interface easier. 
> Applications wanting to talk over MCTP should connect to the 
> mctp-demux-daemon socket and send messages this way. This includes the 
> application that will handle MCTP control messages defined in the base specification.
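As a concrete sketch of that application-side flow, the snippet below frames messages the way the current mctp-demux-daemon expects: a single message-type byte on connect to register interest, then datagrams of destination EID followed by the MCTP message body. The socket name, socket type and framing details here are assumptions drawn from the libmctp tree, not a stable API:

```python
import socket

# Abstract-namespace Unix socket published by mctp-demux-daemon; the
# leading NUL selects the abstract namespace. The name is an assumption.
DEMUX_SOCKET = "\0mctp-mux"

MCTP_TYPE_PLDM = 0x01  # MCTP message type for PLDM (DSP0239)

def registration_frame(msg_type):
    # First send after connect(): one byte naming the MCTP message
    # type this client wants to exchange.
    return bytes([msg_type])

def tx_frame(dest_eid, body):
    # Subsequent sends: destination EID, then the MCTP message body
    # (whose first byte is itself the message-type byte).
    return bytes([dest_eid]) + body

def open_demux(msg_type):
    # Connect and register; requires a running demux daemon.
    s = socket.socket(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    s.connect(DEMUX_SOCKET)
    s.send(registration_frame(msg_type))
    return s
```

An MCTP control daemon would open such a socket with message type 0x00 (MCTP control) and exchange control messages over it like any other client.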
> 
> Now, there is the issue that MCTP control commands affect the binding 
> associated with an endpoint, and as above the mctp-demux-daemon 
> doesn't handle any commands itself. What's missing from 
> mctp-demux-daemon is an out-of-band interface to manipulate the 
> binding in response to control messages. Elements of this out-of-band 
> interface are being proposed in the phosphor-dbus-interfaces patch that is currently under review[1].
> 
> Regarding the planned kernel interface for MCTP, it will come in two parts:
> 
> 1. A common socket-based interface for exchanging messages between 
> endpoints.
> 2. A netlink interface to control configuration of MCTP networks and 
> endpoints connected to the system.
> 
> Control messages in the kernel implementation will also be handled in 
> userspace (possibly except for binding-defined messages). The daemon 
> handling control messages responds by poking the netlink interface to 
> reconfigure the kernel as appropriate. Note that we have an alignment 
> between the kernel interface proposed here and the need for the 
> out-of-band interfaces on the mctp-demux-daemon outlined above 
> (netlink is also out-of-band). As part of the eventual transition away 
> from the mctp-demux-daemon to the kernel-based socket implementation 
> it's a possibility that we could wrap the netlink interface with the 
> D-Bus interface, which should mean minimal changes for applications 
> already using the D-Bus interface (though realistically this should just impact the daemon handling control messages).
> 
> Returning to your question about the three operational modes in light 
> of the above, a few points:
> 
> 1. Endpoint-mode needs to respond to e.g. `Set EID` messages. A `Set 
> EID` message would be received by the MCTP control daemon connected to 
> the mctp-demux-daemon, and the MCTP control daemon would e.g. call a 
> SetEID() method on the mctp-demux-daemon object's D-Bus interface to 
> reconfigure the endpoint.
> 
> 2. I think Bus-owner mode is mainly a consideration of how the MCTP 
> control daemon operates (i.e. sending messages rather than simply 
> responding to them as in Endpoint-mode).
> 
> 3. Bridging is handled in two parts: The binding together of endpoints 
> may occur in the mctp-demux-daemon if the design is such that the 
> bridge has a single EID rather than an EID per binding instance. 
> Alternatively, if an EID is provided per-endpoint, multiple 
> mctp-demux-daemons could be run with a separate daemon subscribing to 
> each mctp-demux-daemon socket participating in the bridge.
> 
> Point 3. requires some rework of the mctp-demux-daemon to provide a 
> deterministic abstract-socket naming scheme to enable multiple 
> concurrent mctp-demux-daemon instances to exist, and the work is 
> similar to what I've recently done for the obmc-console package. The 
> naming-scheme may be defined by the system configuration (as is the case with obmc-console).
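A deterministic scheme could be as simple as deriving each daemon's abstract socket name from the binding instance's configured name, in the style of obmc-console's per-instance sockets. Everything here (the prefix and the instance names) is hypothetical:

```python
# Hypothetical prefix mirroring the current single-instance socket name.
SOCKET_PREFIX = "mctp-mux"

def demux_socket_name(instance):
    # Leading NUL selects the Linux abstract socket namespace; each
    # binding instance (e.g. "smbus0", "astlpc0") gets its own socket.
    return "\0" + SOCKET_PREFIX + "." + instance
```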
> 
> [1] https://gerrit.openbmc-project.xyz/c/openbmc/phosphor-dbus-interfaces/+/30139
> 
> > Also, when the sockets
> > move to Kernel, what would be the way for a user to configure a 
> > certain physical binding in “Bus Owner” mode and another binding in “Endpoint” mode?
> 
> As mentioned above, this isn't a property of the endpoint so much as 
> how the endpoint is used by applications (such as the MCTP control 
> daemon). The kernel will be agnostic to how an endpoint is used beyond 
> configuring bindings and endpoints as directed by userspace via the netlink interface.
> 
> > We will have cases
> > where BMC will be Bus Owner on a certain bus and Endpoint on another bus simultaneously.
> 
> That's fine, it's a use-case I anticipated (and again it partly comes 
> back to how endpoints are used by applications rather than a property 
> of the endpoint itself).
> 
> > 
> > Multiple MCTP Daemon instantiations:
> > 
> > The rate of transmission and reception of MCTP messages will be 
> > limited by the underlying physical binding. Having one instance of 
> > MCTP transport interface per physical port would speed up the TX and 
> > RX. How can this be achieved in the demux daemon?
> 
> This is enabled by deterministic naming of abstract sockets that I 
> talked about above.
> 
> > And how would this be addressed
> > in kernel-based sockets?
> 
> Messages are sent in process context, so concurrency is as wide as the 
> number of threads available assuming you're sending across different interfaces.
> 
> > How can a user specify the physical bindings they are going to need, 
> > and the instances of the same?
> 
> Via the netlink interface.
> 
> > 
> > Support for upper layer protocols:
> > In Intel’s usecases, most of the upper layer protocols like 
> > PLDM/Intel Vendor Defined Messaging Type/SPDM etc. are going to be 
> > “Requesters” i.e. they are going to send out request packets to a 
> > connected device on the platform (ex: Add-In Cards) using MCTP. 
> > These protocols will not have prior knowledge about the EIDs and 
> > thus need a way to query the existing EIDs and the message types 
> > supported by the EIDs from the MCTP layer in order to start their 
> > communication. The D-Bus proposal handles this by creating D-Bus 
> > objects for the EIDs. How would we achieve the same in the demux daemon?
> 
> This is resolved by e.g. the mctp-demux-daemon implementing the D-Bus 
> interfaces you're proposing (aside from the Rx/Tx methods) as the 
> out-of-band information/event mechanism.
> 
> > How would kernel-based sockets handle this?
> 
> The netlink interface allows userspace to query the topology of the 
> network, which will be set up by the MCTP control daemon.
> 
> As to what message types are supported by endpoints, I didn't have any 
> plan to cache this information on the BMC. I figured the application 
> wanting to talk to the endpoint would query the endpoint for this 
> information directly. These queries operate at multiple levels, e.g:
> 
> * What MCTP control commands are supported?
> * What PLDM constructs are supported?
> * What SPDM concepts are supported?
> 
> None of this information belongs in the kernel. Whether userspace 
> should expose it in some generic fashion is up for debate but as 
> mentioned I feel the answer is probably not, just leave it to specific applications.
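For the first of those levels, an application can ask an endpoint directly with the Get Message Type Support control command (command code 0x05 in DSP0236). The sketch below builds the request and parses the response body; the framing follows the control-message header in the base specification, but verify the offsets against DSP0236 before relying on them:

```python
MCTP_MSG_TYPE_CONTROL = 0x00
MCTP_CTRL_RQ = 0x80              # request bit in the Rq/D/Instance-ID byte
GET_MESSAGE_TYPE_SUPPORT = 0x05  # command code per DSP0236

def control_request(cmd, instance_id, data=b""):
    # Control-message body: message-type byte, Rq/D/Instance-ID byte,
    # command code, then any command-specific request data.
    assert 0 <= instance_id <= 0x1F
    return bytes([MCTP_MSG_TYPE_CONTROL, MCTP_CTRL_RQ | instance_id, cmd]) + data

def parse_msg_type_support(data):
    # `data` is the response after the 3-byte control header:
    # completion code, message-type count, then the type list.
    completion_code, count = data[0], data[1]
    return completion_code, list(data[2:2 + count])
```

A PLDM requester, for example, would send this request to a candidate EID and check that type 0x01 appears in the returned list before starting PLDM traffic.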
> 
> > 
> > Discovery of MCTP capable devices:
> > 
> > We would need to modify the demux-daemon to cater for the discovery mechanisms.
> > When BMC acts as a Bus Owner, it would have to go ahead and discover 
> > other end points on the bus and this discovery mechanism varies 
> > according to the bus, and the role: For example: How a PCIe device 
> > discovers other endpoints is totally different from how an SMBus device would.
> 
> But this is binding-implementation specific. The logic should live in 
> the binding, no? Device hotplug notifications are binding-specific but 
> there is the Discovery Notify message that bindings can propagate up 
> the stack to notify e.g. the MCTP control daemon that a device has appeared, and this is generic.
> 
> > Similarly, how BMC as PCIe bus owner would discover other endpoints 
> > (Endpoint Discovery control commands) is different from how BMC as 
> > PCIe endpoint would discover other endpoints(Get Routing Table update).
> > And discovered endpoints need to have a representation (ex: D-Bus 
> > objects) so that upper layer protocols can discover them. How would this be handled in the demux-daemon/kernel approach?
> 
> Userspace interacting with the endpoint at an MCTP-control level will 
> know which mode it's operating in, and so will know what method it 
> needs to use to construct the routing table (as a means to know the 
> other endpoints in the network).
> 
> It sounds like what you're after is an abstraction that presents the 
> network to applications that do not care which mode the endpoints are operating in?
> If so, this is something I brought up on the phosphor-dbus-interfaces 
> patch: We should come up with an abstract representation of the 
> network for applications to consume.
> 
> > 
> > Control commands:
> > 
> > Most control commands couple tightly with the binding and mctp layer 
> > itself; for example, when Set EID is used by the BMC to allocate EID 
> > to another device, it needs to use Special EID 0 + physical address of the device.
> 
> Okay, so I had a bit of a bag of tricks planned here that mean we 
> don't need to embed physical addresses into e.g. `Set EID` packets. 
> There's no allowance for this in the spec anyway.
> 
> The main insight for e.g. `Set EID` is that commands like this are 
> only sent by bus-owners who must be controlling their own route table. 
> The MCTP route table is effectively the combination of the ARP table 
> and route table concepts from IP networks, and so the intent with the 
> kernel-based MCTP implementation is to expose the route table to 
> userspace just like the ARP table, including the ability to inject entries into the table (like the ARP table).
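The route-table model described here can be sketched as a small structure: userspace injects (EID, bus, physical address) entries much like static ARP entries, and each entry carries a `discovered` flag that starts clear. The class and method names are illustrative, not a proposed kernel API:

```python
class RouteEntry:
    # One route-table entry: an EID bound to a bus and physical
    # address, plus the `discovered` state the control daemon manages.
    def __init__(self, eid, bus, phys_addr):
        self.eid = eid
        self.bus = bus
        self.phys_addr = phys_addr
        self.discovered = False

class RouteTable:
    def __init__(self):
        self._by_eid = {}

    def inject(self, eid, bus, phys_addr):
        # EIDs must be unique in the network, so the EID alone
        # unambiguously identifies the device.
        if eid in self._by_eid:
            raise ValueError("EID %d already routed" % eid)
        entry = RouteEntry(eid, bus, phys_addr)
        self._by_eid[eid] = entry
        return entry

    def lookup(self, eid):
        return self._by_eid.get(eid)
```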
> 
> From there we maintain state for each entry that describes whether or 
> not the EID has been assigned by userspace, akin to the `discovered` 
> flag that we maintain for the endpoint itself: This is set when the 
> endpoint ID has been successfully assigned (i.e. we see a `SUCCESS` 
> completion code for a `Set EID` message).
> 
> Further, EIDs must be unique in the network, so the route table must 
> not contain the same EID assigned to multiple devices. This means that 
> the EID is unambiguous in identifying the device.
> 
> The trick is that EIDs unambiguously identifying devices is true 
> regardless of the state of the `discovered` flag associated with the 
> entry in the route table. So the plan is that in order to send a `Set 
> EID` to a discovered endpoint, we take the following steps:
> 
> For static networks:
> 
> 1. The MCTP control daemon injects an entry into the route table, 
> setting the
> *proposed* EID, the bus and the physical address. The `discovered` bit 
> associated with this entry remains clear.
> 
> 2. The MCTP control daemon constructs a `Set EID` message with an MCTP 
> header containing the destination EID set to the *proposed* EID 
> (setting the destination EID to the *proposed* EID is purely for 
> routing purposes, the message does not go onto the wire in this state).
> 
> 3. The MCTP control daemon sends the `Set EID` message via the socket interface.
> 
> 4. The kernel receives the message and parses the MCTP header to 
> resolve the route.
> 
> 5. The kernel discovers from the routing table that the `discovered` 
> flag is _not_ set for the destination EID and introspects the packet 
> for the `Set EID` MCTP command.
> 
> 6. The kernel _modifies_ the packet, replacing the destination EID with 
> Special EID 0 for the `Set EID` packet.
> 
> 7. The kernel passes the modified message onto the binding 
> implementation (resolved via the route table) for transmission to the target endpoint.
> 
> 8. The target endpoint responds to the `Set EID` message.
> 
> 9. The kernel passes the response back to the userspace process 
> associated with the sending socket.
> 
> 10. The MCTP control daemon receives the response to the `Set EID` 
> command. If the command is successful the MCTP control daemon sets the 
> `discovered` flag in the route table and no further EID replacement is 
> performed for packets routed to that device. If the command failed then the discovered flag remains clear.
> Further, the response may indicate the device has already received a 
> _different_ endpoint ID from a previous `Set EID` command, in which 
> case the route table is updated with the returned EID and the discovered flag is set.
> 
> For dynamic networks the process is largely the same, though the route 
> table is updated to contain the device bus address when we receive the 
> binding-specific `Discovery Notify` signal. This signal is translated 
> to a Discovery Notify message to trigger userspace probing of the bus 
> for new devices and to perform address assignment. Userspace can 
> inspect the route table for devices with the `discovered` flag cleared to determine what devices need address assignment.
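Steps 5 and 6 of the sequence above (the destination-EID rewrite for undiscovered routes) can be sketched as follows. The offsets follow the 4-byte MCTP transport header and the control-message header from DSP0236; the function signature itself is purely illustrative:

```python
MCTP_EID_NULL = 0x00
MCTP_MSG_TYPE_CONTROL = 0x00
MCTP_CTRL_SET_EID = 0x01  # command code per DSP0236

DEST_EID_OFFSET = 1   # within the 4-byte MCTP transport header
MSG_TYPE_OFFSET = 4   # first byte of the message body (IC | type)
CMD_CODE_OFFSET = 6   # control header: type, Rq/D/IID, command code

def rewrite_for_tx(pkt, discovered):
    # Return the packet to put on the wire for a route whose
    # `discovered` flag has the given state: an outbound Set EID to an
    # undiscovered destination has its destination EID nulled out.
    is_set_eid = (
        len(pkt) > CMD_CODE_OFFSET
        and pkt[MSG_TYPE_OFFSET] & 0x7F == MCTP_MSG_TYPE_CONTROL
        and pkt[CMD_CODE_OFFSET] == MCTP_CTRL_SET_EID
    )
    if not discovered and is_set_eid:
        pkt = bytearray(pkt)
        pkt[DEST_EID_OFFSET] = MCTP_EID_NULL
        return bytes(pkt)
    return pkt
```

Once step 10 sets the `discovered` flag for the route, the rewrite stops applying and packets go out with the assigned EID unmodified.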
> 
> > Get EID command needs to return
> > binding specific information as a part of its response.
> 
> Rather, `Get EID` returns the EID for the device at a particular 
> physical address. This is subject to the same sequence outlined above.
> 
> > Get UUID command needs to
> > return same UUID across all physical bindings.
> 
> This is tied to how bridging will be implemented. Again, bridging is 
> handled by commands through the netlink interface in the case of the 
> proposed kernel implementation, and we just need to associate the one 
> UUID with each of the endpoints participating in the bridge.
> 
> > And so on. Thus how would control
> > commands be handled in the demux daemon? What would it look like when 
> > kernel-based sockets are introduced?
> 
> We may need to translate some of these concepts to designs that we 
> could implement on the mctp-demux-daemon, but otherwise I think your 
> two questions here are largely answered by the descriptions above.
> 
> Hope that helps!
> 
> Andrew

