OpenBMC - Support NVMe drive health monitoring

Jeremy Kerr jk at codeconstruct.com.au
Wed Apr 12 11:31:04 AEST 2023


Hi Lior,

> > If so, can you please guide me how to do that?
> 
> The prerequisite for nvme OOB is the mctp layer, including:
> 1. mctp linux kernel - starting from 5.15
> 2. libmctp
> 3. mctpd
> + Jeremy as the author of MCTP.

Thanks Hao!

Lior: the general method involves getting the MCTP network established
so that we can route messages to the NVMe drive, and then using libnvme
+ nvme-cli to perform NVMe messaging over that MCTP channel.

I've put together a bit of an introduction here:

https://codeconstruct.com.au/docs/nvme-mi-firmware-update/

The context for that document is NVMe firmware updates, but you can
perform any other (supported) nvme-cli commands too.
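
As a rough sketch of what addressing a drive over MCTP looks like: nvme-cli
accepts a device argument of the form mctp:<net>,<eid> in place of a
/dev/nvmeX node. The helper below only builds the command line; the function
name, and the network/EID values (1 and 9), are placeholders for
illustration and will differ on your system.

```python
import shlex


def nvme_mi_command(subcommand, net, eid, extra_args=()):
    """Build an nvme-cli argv that addresses a drive by MCTP network and EID.

    Hypothetical helper: it only formats the command, it does not run it.
    """
    # EIDs 0-7 are null/reserved and 255 is broadcast, so an assigned
    # endpoint ID falls in 8..254.
    if not 8 <= eid <= 254:
        raise ValueError(f"EID {eid} is outside the assignable range 8..254")
    device = f"mctp:{net},{eid}"
    return ["nvme", subcommand, device, *extra_args]


# Placeholder values: network 1, endpoint ID 9.
cmd = nvme_mi_command("id-ctrl", net=1, eid=9)
print(shlex.join(cmd))
```

Whether a given subcommand works over NVMe-MI depends on the drive and on
your nvme-cli/libnvme versions, so treat the subcommand as something to
verify rather than a given.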

> > Can this tool be tested on QEMU or RaspberryPi (running OpenBMC
> > image)?
> I am not sure if you want to emulate a nvme mctp device in QEMU.
> RaspberryPi is doable but you need to rework the riser for I2C from
> PCIe.

There were some discussions about adding an NVMe-MI (over i2c) interface
to QEMU; Klaus' patch set is here:

https://lore.kernel.org/qemu-devel/20221116084312.35808-1-its@irrelevant.dk/

For Raspberry Pi: the MCTP-over-i2c transport requires an i2c
controller that supports both controller and device modes; I think the
rpi hardware only supports controller mode.

And, as Hao mentions, you'll then need some way to route those i2c
signals to your actual device.

> 
> > Is there a tool that runs auto discovery and will give a map of the
> > devices it finds? (kind of like nvme list)?
> That is mctpd. https://github.com/CodeConstruct/mctp

Yep, mctpd can enumerate local MCTP networks, and exports the
enumeration results over dbus. The libnvme MCTP implementation can then
use that enumeration data for the object iteration functions to query
the available subsystems.
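
To illustrate the idea (this is not how libnvme is implemented, just a
sketch of the filtering step): each enumerated endpoint reports the MCTP
message types it supports, and NVMe-MI is message type 0x04. The property
names below (EID, NetworkId, SupportedMessageTypes) and the object paths are
my assumption of what the mctpd endpoint objects look like, and the sample
data is made up; check them against your mctpd version.

```python
NVME_MI_MSG_TYPE = 0x04  # MCTP message type assigned to NVMe-MI


def nvme_mi_endpoints(endpoints):
    """Return sorted (net, eid) pairs for endpoints that advertise NVMe-MI.

    `endpoints` maps an object path to a dict of endpoint properties.
    The property names are assumptions, see the note above.
    """
    found = []
    for props in endpoints.values():
        if NVME_MI_MSG_TYPE in props.get("SupportedMessageTypes", []):
            found.append((props["NetworkId"], props["EID"]))
    return sorted(found)


# Made-up enumeration result: one NVMe-MI endpoint and one other device.
sample = {
    "/xyz/openbmc_project/mctp/1/9": {
        "NetworkId": 1, "EID": 9, "SupportedMessageTypes": [0, 4],
    },
    "/xyz/openbmc_project/mctp/1/10": {
        "NetworkId": 1, "EID": 10, "SupportedMessageTypes": [0, 1],
    },
}
print(nvme_mi_endpoints(sample))
```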

However: the enumeration needs to be triggered by something; when a
device becomes available we need to tell mctpd to start enumeration at
a particular physical address.
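
A minimal sketch of that trigger, assuming the mctpd D-Bus names as I
understand them from the mctpd documentation (service, object path,
interface and the SetupEndpoint method; these may differ between mctpd
releases, so verify against yours). The interface name and i2c address in
the example are placeholders. The helper only builds a busctl command line;
it does not run it.

```python
# Assumed mctpd D-Bus names; check these against your mctpd version.
MCTPD_SERVICE = "xyz.openbmc_project.MCTP"
MCTPD_PATH = "/xyz/openbmc_project/mctp"
MCTPD_IFACE = "au.com.CodeConstruct.MCTP"


def setup_endpoint_cmd(ifname, i2c_addr):
    """Build a busctl argv asking mctpd to enumerate one physical address.

    Hypothetical helper. The arguments are the MCTP interface name and a
    one-byte physical address (a 7-bit i2c address here).
    """
    if not 0 <= i2c_addr <= 0x7F:
        raise ValueError("expected a 7-bit i2c address")
    return [
        "busctl", "call", MCTPD_SERVICE, MCTPD_PATH, MCTPD_IFACE,
        "SetupEndpoint",
        "say",                 # signature: string, then array of bytes
        ifname,
        "1",                   # length of the physical address array
        f"0x{i2c_addr:02x}",   # the address byte itself
    ]


# Placeholder interface name and address.
print(" ".join(setup_endpoint_cmd("mctpi2c6", 0x1D)))
```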

On OpenBMC, Entity Manager provides that initial trigger (based on
discovered FRU data). If that doesn't suit, it would also be possible
to use other events, like presence signals or bus-specific discovery
(e.g., SMBus ARP), as the trigger.

If he hasn't mentioned it already: Hao has put together the higher-level
infrastructure to export the discovered NVMe subsystems over standard
OpenBMC dbus and Redfish, which allows an OpenBMC-integrated
interaction with NVMe devices, sensors, etc.

Cheers,


Jeremy



