[RFC] SPDM attestation E2E findings from Renode testing
Gary Beihl
garybeihl at microsoft.com
Tue Apr 14 02:10:35 AEST 2026
(CC'ing the SPDM chain maintainers directly for visibility; posting to the list so the discussion stays searchable alongside the April 2 thread.)
Following up on the April 2 RFC [1] and Matt's reply [2]: I now have a group of six patches on a local branch that take the SPDM E2E suite from 0/13 to 17/17 passing on Renode evb-ast2600 (14 attestation scenarios + 1 boot + 2 new SPDMGetSignedMeasurements scenarios I added to cover the on-demand D-Bus path).
I'd like input from the SPDM chain maintainers on how you'd prefer to receive these patches before I start pushing to Gerrit.
Here are the six patches. All are small and independent. Each has a one-paragraph rationale in the commit message plus a "Tested:" trailer describing what the Renode E2E suite exercises.
1. spdmd: Handle missing MCTP endpoints gracefully - mctp_transport_discovery.cpp. Catches ResourceNotFound and per-endpoint query exceptions so spdmd doesn't terminate on a freshly booted system with no MCTP endpoints yet configured. Same pattern as Archana's fix in 88794 for the TCP discovery path.
2. spdmd: Search the full object tree in mapper lookups - utils/mapper.cpp. Changes get_sub_tree("/xyz/openbmc_project", ...) to get_sub_tree("/", ...) so the search finds mctpd-managed endpoints under /au/com/codeconstruct/mctp1/... - mctpd implements both xyz.openbmc_project.MCTP.Endpoint and au.com.codeconstruct.MCTP.Endpoint1 on the same object, but the xyz-rooted search never finds it.
3. spdmd: Remove unnecessary MCTP socket bind() - mctp_helper.hpp. This is the fix Matt recommended in [2]. Removes the wildcard bind(ANY, ANY, SPDM, OWNER) that prevented multi-device discovery from working - multiple MctpIoClass instances would collide on the same bound tuple. spdmd is a pure synchronous requester so no bind() is needed. Same pattern as the pldmtool fix Matt referenced (openbmc/pldm 83626).
4. spdmd: Set MCTP socket receive timeout in mctp_helper.hpp. Adds SO_RCVTIMEO=10s at socket creation. The existing MctpIoClass::read() marks its timeout parameter as [[maybe_unused]] and recvfrom() blocks forever on unreachable devices, so libspdm never returns LIBSPDM_STATUS_RECEIVE_FAIL and systemd's start-timeout kills spdmd before it becomes active.
5. spdmd: Implement eager SPDM attestation on discovery - spdm_dbus_responder.cpp/hpp, spdmd.cpp. Adds a performEagerAttestation() method that runs VCA + GET_DIGESTS + GET_CERTIFICATE + CHALLENGE on every discovered device and updates the ComponentIntegrity.ResponderVerificationStatus D-Bus property. Called from main() after ctx.request_name() so attestation timing can't trigger systemd's TimeoutStartSec.
CHALLENGE is the critical step for the security gap I mentioned in the original RFC - without it, a wrong or compromised private key still passes attestation because GET_CERTIFICATE alone does not prove key ownership.
This is the commit I'm least sure about architecturally.
Archana: I see 88873 adds an SPDMResponderManager with a connectSPDMDevice() stub. Would you rather I rebase the attestation logic into your manager class instead of adding it as a method on SPDMDBusResponder? Happy to do either.
6. spdmd: Surface SPDMGetSignedMeasurements errors - component_integrity_dbus.cpp. The catch block in method_call() currently swallows all std::exception subtypes and returns a tuple of empty strings as if the call had succeeded, so bmcweb returns HTTP 200 with "SignedMeasurements":"" on any libspdm failure. The fix is to split the catch: sdbusplus::exception::exception subtypes are rethrown (preserving InvalidArgument), and generic std::exception is converted to InternalFailure. bmcweb then maps the D-Bus error to HTTP 500 with a Redfish @Message.ExtendedInfo body.
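As an illustration of the catch split in fix 6, a minimal sketch follows. The exception types and callSurfacingErrors() are hypothetical stand-ins, not the sdbusplus classes or the actual patch; only the ordering of the two catch clauses is the point.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the sdbusplus exception hierarchy.
struct DBusError : std::exception
{
    const char* what() const noexcept override { return "DBusError"; }
};
struct InvalidArgument : DBusError
{
    const char* what() const noexcept override { return "InvalidArgument"; }
};
struct InternalFailure : DBusError
{
    const char* what() const noexcept override { return "InternalFailure"; }
};

// Runs a method body. D-Bus errors propagate unchanged; any other
// std::exception is converted to InternalFailure instead of being swallowed.
std::string callSurfacingErrors(const std::function<std::string()>& body)
{
    try
    {
        return body();
    }
    catch (const DBusError&)
    {
        throw; // rethrow with the original type, e.g. InvalidArgument
    }
    catch (const std::exception&)
    {
        throw InternalFailure{};
    }
}
```

The more specific catch has to come first; with the clauses reversed, the std::exception handler would also capture the D-Bus errors and flatten InvalidArgument into InternalFailure.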
TCP discovery exception handling - I had also fixed the TCP ResourceNotFound crash, but then noticed Archana's 88794 already does this. So that one's dropped from my stack.
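The step ordering in fix 5 can be sketched as below. This is an illustration only: Step, VerificationStatus, and attest() are hypothetical names, and the callback stands in for the libspdm requests. It is not the code in performEagerAttestation().

```cpp
#include <array>
#include <cassert>
#include <functional>

// Hypothetical names for illustration; not the identifiers used in spdmd.
enum class Step { Vca, GetDigests, GetCertificate, Challenge };
enum class VerificationStatus { Success, Failed };

// Runs the four steps in order and stops at the first failure.
// runStep stands in for the corresponding libspdm request.
VerificationStatus attest(const std::function<bool(Step)>& runStep)
{
    constexpr std::array<Step, 4> order{Step::Vca, Step::GetDigests,
                                        Step::GetCertificate, Step::Challenge};
    for (Step step : order)
    {
        if (!runStep(step))
        {
            return VerificationStatus::Failed;
        }
    }
    return VerificationStatus::Success;
}
```

A responder that presents a valid certificate chain but cannot answer CHALLENGE fails on the last step, so the status ends up Failed rather than Success. Stopping after GET_CERTIFICATE would have reported Success for that device.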
## Questions for the chain maintainers
The chain I'm targeting is NVIDIA's (80262, 80355, 80264, 80272, 80311, 80358, 84019, ...) since my fixes touch files that only exist there. I understand the IBM chain (88112, 88124, 88873, 88794, 89179) is a parallel track - fixes 1-4 and 6 are on files IBM's chain doesn't touch, and fix 5 will need to be reconciled once the two chains merge.
Before I start pushing, could you let me know:
a) Are you okay with fixes 1, 2, 3, 4, and 6 landing as a five-commit stack on Gerrit depending on the NVIDIA chain (specifically 80262 PS19, 80355, and 84019)? I'd submit them under my Microsoft email and reference the April 2 RFC in each commit message.
b) For fix 5 (eager attestation): does it make more sense to (i) post it as a stand-alone Gerrit change stacked on 84019 now, and reconcile with 88873's SPDMResponderManager later, (ii) post it as an amendment to 88873 that fills in the connectSPDMDevice() TODO, or (iii) hold until the NVIDIA and IBM chains converge upstream?
c) Is there a planned coordination point where the two chains merge? I see 88873 intends to drive NVIDIA's SpdmTransport eventually. Happy to help bridge them if that's a blocker for landing the attestation flow.
If any of the six descriptions look off to you, please push back before I push them to Gerrit - I'd rather have the discussion in email than after N patchsets of back-and-forth.
Full commit messages and diffs are on a local branch; I can either (a) upload to Gerrit as a draft and share links, (b) post patches inline to the list as a v1 series, or (c) share them privately if you'd prefer. Let me know which you'd like.
Thanks,
Gary
[1] https://lists.ozlabs.org/pipermail/openbmc/2026-April/038553.html
[2] https://lists.ozlabs.org/pipermail/openbmc/2026-April/038554.html
From: Gary Beihl <garybeihl at microsoft.com>
Sent: Tuesday, April 7, 2026 4:51 PM
To: Matt Johnston <matt at codeconstruct.com.au>; openbmc at lists.ozlabs.org
Cc: Thirupathaiah Annapureddy <thiruan at microsoft.com>; Sagar Dharia <Sagar.Dharia at microsoft.com>; Giri Mudusuru <girimudusuru at microsoft.com>
Subject: RE: [EXTERNAL] Re: [RFC] SPDM attestation E2E findings from Renode testing
Hi Matt,
Thank you for the recommendation - I followed your suggestion and dropped the call to bind() altogether. Since spdmd only does request/response (no async SPDM messages), regular sendto()/recvfrom() (with SO_RCVTIMEO) works just fine. Each endpoint gets its own socket with no EADDRINUSE conflict.
All 14 SPDM E2E Robot Framework tests continue to pass with this change, including the multi-endpoint scenarios that previously hit EADDRINUSE.
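The receive-timeout behaviour can be illustrated with a short sketch. It uses an AF_UNIX datagram socketpair as a stand-in for an AF_MCTP socket whose peer never answers, and a caller-supplied timeout rather than any particular production value. setReceiveTimeout() and silentPeerTimesOut() are hypothetical helper names, not functions from mctp_helper.hpp.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

// Applies a receive timeout to a socket; returns true on success.
bool setReceiveTimeout(int fd, std::chrono::milliseconds timeout)
{
    timeval tv{};
    tv.tv_sec = static_cast<time_t>(timeout.count() / 1000);
    tv.tv_usec = static_cast<suseconds_t>((timeout.count() % 1000) * 1000);
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0;
}

// Returns true if recv() gave up because of the timeout instead of
// blocking forever. The socketpair stands in for an unreachable device:
// the peer end stays open but never sends anything.
bool silentPeerTimesOut(std::chrono::milliseconds timeout)
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds) != 0)
    {
        return false;
    }
    bool timedOut = false;
    if (setReceiveTimeout(fds[0], timeout))
    {
        char buf[64];
        errno = 0;
        ssize_t n = recv(fds[0], buf, sizeof(buf), 0);
        timedOut = (n == -1) && (errno == EAGAIN || errno == EWOULDBLOCK);
    }
    close(fds[0]);
    close(fds[1]);
    return timedOut;
}
```

With the timeout set, a silent peer turns into an error return that the caller can report as a receive failure. Without it, the same recv() call would not return at all.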
Per-endpoint bind() from Linux 6.17 is good to know about for future work in case we need to handle async SPDM notifications (e.g., KEY_UPDATE).
Thanks,
Gary
From: Matt Johnston <matt at codeconstruct.com.au>
Sent: Wednesday, April 1, 2026 10:09 PM
To: Gary Beihl <garybeihl at microsoft.com>; openbmc at lists.ozlabs.org
Cc: Thirupathaiah Annapureddy <thiruan at microsoft.com>; Sagar Dharia <Sagar.Dharia at microsoft.com>; Giri Mudusuru <girimudusuru at microsoft.com>
Subject: [EXTERNAL] Re: [RFC] SPDM attestation E2E findings from Renode testing
Hi Gary,
On Wed, 2026-04-01 at 21:35 +0000, Gary Beihl wrote:
> (c) Shared AF_MCTP socket (affects 80311: mctp_helper.hpp)
> Only one socket can bind to (MCTP_ADDR_ANY, MCTP_TYPE_SPDM). When spdmd attests multiple endpoints sequentially, the second MctpIoClass::createSocket() fails with EADDRINUSE. The fix is a process-lifetime shared socket (singleton pattern), draining stale responses between endpoint attestations with recv(MSG_DONTWAIT).
Since Linux 6.17 it is possible to restrict a bind() to only receive from a single remote endpoint [1]. Call connect() with the remote address before the bind().
Is the SPDM implementation using asynchronous messages sent by the responder? (KEY_UPDATE, HEARTBEAT, END_SESSION)
If not, I think the bind() could be removed altogether.
bind() isn't needed in the case where the Linux host is performing a plain send() then receiving a response. A similar situation was fixed in pldmtool [2].
[1] https://lore.kernel.org/all/20250710-mctp-bind-v4-6-8ec2f6460c56@codeconstruct.com.au/
[2] https://gerrit.openbmc.org/c/openbmc/pldm/+/83626
Cheers,
Matt