OpenBMC on RCS platforms

Thu Apr 29 17:54:31 AEST 2021

-----Timothy Pearson <tpearson at raptorengineering.com> wrote: -----

>To: Milton Miller II <miltonm at us.ibm.com>
>From: Timothy Pearson <tpearson at raptorengineering.com>
>Date: 04/28/2021 03:22PM
>Cc: Patrick Williams <patrick at stwcx.xyz>, openbmc
><openbmc at lists.ozlabs.org>
>Subject: [EXTERNAL] Re: OpenBMC on RCS platforms
>
>
>----- Original Message -----
>> From: "Milton Miller II" <miltonm at us.ibm.com>
>> To: "Timothy Pearson" <tpearson at raptorengineering.com>
>> Cc: "Patrick Williams" <patrick at stwcx.xyz>, "openbmc"
><openbmc at lists.ozlabs.org>
>> Sent: Monday, April 26, 2021 4:42:16 PM
>> Subject: RE: OpenBMC on RCS platforms
>
>[snip]
>
>>>At first glance, that's another overly complex solution for a
>>>simple
>>>problem that would cause a degraded user experience vs. other
>>>platforms.
>>>
>> 
>> I have to agree, both overly complex and probably not useful in
>> that
>> it's just a port interface for control.
>> 
>>>We have an 800MHz Linux-based computer with 512MB of RAM, serial
>>>and
>>>video out support already integrated into every one of our
>>>products.
>>>It can receive data via PCIe and via USB from an active host.  Why
>>>isn't there a mechanism to send a signed container to it over one
>>>of
>>>these existing channels for self-update?
>>>
>>>A potential user story looks like this:
>>>
>>>=====
>>>
>>>I want to update the firmware on my Blackbird desktop to fix a
>>>problem I'm having with a new control widget I've plugged in.  To
>>>make things more interesting, I'm on an oil rig in the Gulf, and
>>>the
>>>desktop only connects via intermittent WiFi.  Spare parts are weeks
>>>away, and I have next to no electronic diagnostic equipment
>>>available
>>>to me.  There's one or two USB ports I can normally use because I
>>>have administrative privileges, but I was able to grab the upgrade
>>>file over WiFi instead, saving myself some time cleaning
>>>accumulated
>>>gunk out of the ports.
>>>
>>>I can update my <large vendor> standard PC firmware just by running
>>>a
>>>tool on Windows, but the Blackbird was selected because it controls
>>>a
>>>critical process that needed to be malware-resistant.
>>>
>>>Fortunately, OpenBMC implemented a quality firmware update process.
>>>I just need to launch a GUI tool with host administrative
>>>privileges,
>>>select the upgrade file, and queue an upgrade to happen when I
>>>reboot
>>>the machine.  I queue the update, start the reboot, and stick
>>>around
>>>to see the upgrade progress on the screen while it's booting back
>>>up.
>>> Because I can see the status on the screen, I know what is
>>>happening
>>>and don't pull the power plug due to only seeing a black screen and
>>>power LED for 10 minutes.  Finally, the machine loads the OS and I
>>>verify the new control widget is working properly.
>>>
>>>=====
>>>
>>>Is there a technical / architectural reason this can't be done, or
>>>some other reason it's a bad idea?
>>>
>> 
>> I ended up writing this twice or thrice.  Also what I call
>> phosphor-initfs is actually the package obmc-phosphor-initfs.bb
>> found in meta-phosphor/recipes-phosphor/initrdscripts/.
>> 
>> 
>> There are two issues.  One is that there is no graphics
>> library or console code for the aspeed bmc.  (I understand a
>> text rendering library was added for boot monitoring.)  But
>> if you are starting from the host up, then use the host to
>> drive the GUI and just establish a command session (network,
>> USB to host, or serial).
>> 
>> The biggest limitation is we use squashfs for the file system
>> for space efficiency.  This is a read-only filesystem that
>> contains references between different pieces that is loaded
>> and decompressed by the kernel on demand.  That means you can
>> not be running on the copy in flash while trying to update
>> that copy in the flash.
>> 
>> If you have space for two copies then you can update the
>> second copy while the primary is online.  This is supported
>> in the UBI and eMMC layouts upstream.
>> 
>> If you only have flash space for one copy then you have to
>> arrange for something more limited.  Either way you are
>> subject to bricking on interrupted flash unless you do
>> something exotic like repurpose the host chip as a backup
>> BMC during the process.   But if it's just the feedback
>> then the upstream code has help that isn't in the Redfish
>> flow.
>
>Most of these systems also have a significant amount of RAM
>available, enough to hold both the update file and the existing BMC
>Flash contents while the system remains online.  Is there any way we
>could copy the existing Flash into RAM, then "pivot" the running
>system to use the copy in RAM as the backing store?

[See also the Thrice description ...]

There is no filesystem that I am aware of that
says "instead of using layer x, start using layer y that 
will have the same content".

The existing init script has a config option to copy the 
contents from the flash to RAM then loop mount the file.  
Of course this will likely increase the boot time because 
all content had to be copied from the flash before starting 
any userspace from the volume.   Also the copy uses all
space allocated to the rofs layer; it is not smart enough to
only copy the length of the squashfs contents even though 
that is in the filesystem header.
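
As a rough illustration of what "smart enough" could mean here, the
sketch below reads the bytes_used field from a little-endian
squashfs 4.x superblock and rounds it up to the 4 KiB boundary that
mksquashfs pads to by default.  This is not what the init script
does today.  The function name is made up for the example, and the
field layout is assumed to match the kernel's
struct squashfs_super_block.

```python
import struct

SQUASHFS_MAGIC = 0x73717368  # reads as "hsqs" on disk (little-endian)

# Superblock fields up to and including bytes_used:
# magic, inodes, mkfs_time, block_size, fragments (u32 each),
# compression, block_log, flags, no_ids, s_major, s_minor (u16 each),
# root_inode, bytes_used (u64 each).
SUPERBLOCK_FMT = "<IIIIIHHHHHHQQ"
SUPERBLOCK_LEN = struct.calcsize(SUPERBLOCK_FMT)  # 48 bytes


def squashfs_copy_length(header: bytes, pad: int = 4096) -> int:
    """Return how many bytes of the partition hold the squashfs image.

    `header` is the start of the rofs partition.  The result is
    bytes_used rounded up to a multiple of `pad`.
    """
    if len(header) < SUPERBLOCK_LEN:
        raise ValueError("need at least %d header bytes" % SUPERBLOCK_LEN)
    fields = struct.unpack_from(SUPERBLOCK_FMT, header, 0)
    magic, bytes_used = fields[0], fields[-1]
    if magic != SQUASHFS_MAGIC:
        raise ValueError("not a little-endian squashfs image")
    return -(-bytes_used // pad) * pad  # ceiling division, then scale
```

Copying only this many bytes instead of the whole partition would
shorten the load into RAM whenever the image is smaller than the
space reserved for it.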

Thinking a bit this evening, squashfs uses a block device 
for storage so one could use DM to create a 1-member 
degraded raid1 on the mtdblock device, and add a ramdisk 
block drive (rd) as the mirror.  The ramdisk can be added 
as a degraded volume after boot to avoid having the kernel
spend time copying the data instead of starting the real
userspace.  After the rd copy is synced, one could remove 
the mtdblock volume from the raid1.

This requires access to dm-tools to set up the raid unless
the in-kernel raid metadata would work on a mtdblock
volume.   The md layer probably wants to update the
superblock of the good volume or something.

The above got the rofs, but didn't address the rwfs.  We
use jffs2 today.  While there are other options, the mtd
writable filesystems understand the large erase blocks but
the general block device file systems do not.

The existing init does have an option to copy designated 
files from the host to a tmpfs, and from the tmpfs back 
to the rwfs.  It also can erase the rwfs partition.  This
can be used for freeing the rwfs space during the firmware
update but on an abnormal shutdown the updates to the
rwfs are lost, be they logs or configuration updates.
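
The save and restore steps amount to copying a list of designated
files in one direction before the update and in the other direction
afterwards.  A minimal sketch of that idea, assuming the list is a
set of paths relative to the filesystem root (the function name and
argument layout are invented for illustration and are not taken
from the init script):

```python
import shutil
from pathlib import Path


def copy_listed(files, src_root, dst_root):
    """Copy each listed relative path from src_root to dst_root.

    Paths that do not exist under src_root are skipped.  Returns the
    list of relative paths that were actually copied.
    """
    copied = []
    for rel in files:
        src = Path(src_root) / rel
        if not src.is_file():
            continue
        dst = Path(dst_root) / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # keeps mode and timestamps
        copied.append(rel)
    return copied
```

Calling copy_listed(listed, rwfs, tmpfs) before the flash and
copy_listed(listed, tmpfs, rwfs) afterwards mirrors the two
directions.  Anything written to the tmpfs copy but not yet copied
back is what gets lost on an abnormal shutdown.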

>
>Bricking on power cut is, well, expected during a BMC update without
>a backup Flash chip.  Not cutting power during a low level firmware
>update is I think still ingrained sufficiently in the average PC
>users psyche not to be a significant issue, especially if several
>warnings are given before and during the update process regarding
>ensuring power is not cut.  Even if it is cut, the BMC Flash is
>socketed for a reason.
>
>All that said, ideally, longer term, a recovery partition could be
>added to the Flash -- basically, a normal BMC update would only
>update the rofs partition, leaving u-boot, kernel, and the recovery
>partition alone.  The recovery partition would contain a very small
>userspace, just enough to accept some kind of network connection for
>e.g. TFTP upload of a new firmware (similar to how various embedded
>devices and even small PCs can be recovered).
>
>> 
>> ====
>> Once
>> 
>> The "static" mtd layout with phosphor-initfs has support
>> for both loading the static flash content into RAM, allowing
>> the update to occur with full services running, and as  a
>> backup on shutdown it will apply the update on bmc reboot
>> by switching back to the initramfs and performing the flash
>> from there.  The status of the later update is only visible
>> on the console, which might be hidden on an internal serial
>> cable by default.
>> 
>> Unfortunately the "prepare for update" method that was in
>> the original update instructions and tells the BMC init
>> "hey, load all this content into ram, so that you can write
>> over the flash" got lost in the "we must be limited to what
>> RedFish can support".  The code is still in the low level
>> scripts but the fancy rest api is missing.  Also with the
>> addition of code verification the actual flash progress
>> was hidden.
>> 
>> The phosphor-initfs scripts also allow a new filesystem
>> image to be downloaded over the network if you wish to test.
>> This doesn't have signature checking code, and it can be
>> disabled by build options.
>> 
>> All of the options to phosphor-initfs can be set by u-boot
>> environment variables (one of which is cleared by a systemd
>> unit each boot, one that is not) and by the kernel command
>> line.
>> 
>> Note: I highly suggest not to use image-bmc (for the whole
>> flash) as this erases the entire flash (although we try to
>> write back the u-boot environment), but instead use image-kernel,
>> image-rofs, etc to allow the prior rwfs and u-boot to persist.
>> Some bad assertions may have migrated into the code-update
>> rest endpoints and we should accept patches.
>> 
>> Bottom Line:
>> 
>> Put the BMC in maintenance mode and you can update the image
>> while the stack is running.  You can then use ssh to
>> display the flash progress.  If you need a fancy gui and
>> not the internal serial then use the host, or write the
>> rest of the graphics stack.
>
>That's all over external network again, though.  Point is we want to
>do this from the host -- the host in general is unable to connect to
>the BMC when the BMC is piggybacking on a host network port (all of
>our products do this, and a lot of other vendors use the same
>design).

Well, Intel i210 has a bmc controlled mode to control if the host
can see the network, the bmc, or both.   However, it also allows 
the bmc to redirect any traffic to itself, so that is another can
of worms.

Point is, can your customized firmware add BMC to Host networking?

>
>If we were assured of external BMC network access, updates become
>very simple.  In this kind of deployment though, there is no external
>network access to the BMC.
>
>> If you need the reliable backout then you need space for
>> a second image, even if it's smaller due to being emergency
>> services only.
>> 
>> 
>> PS:  There were some flashes we tried early that had
>> horrible erase times -- over 20 minutes for a full
>> erase.  Check the specs for the parts you provide vs
>> others in the market, the better ones erase in a few
>> minutes.
>
>We use the better-specced ones for both BMC and PNOR.
>
>> PPS:  The reason we added UBI was its feature to use
>> the whole flash for wear leveling (minus the bootloader
>> that is outside the UBI partition).
>> 
>> =======================================
>> Twice: Going back to the scenario again
>> 
>>>I just need to launch a GUI tool with host administrative
>privileges,
>>>select the upgrade file, and queue an upgrade to happen when I
>reboot
>>>the machine.  I queue the update, start the reboot, and stick
>around
>>>to see the upgrade progress on the screen while it's booting back
>up.
>>> Because I can see the status on the screen, I know what is
>happening
>>>and don't pull the power plug due to only seeing a black screen and
>>>power LED for 10 minutes.  Finally, the machine loads the OS and I
>>>verify the new control widget is working properly.
>> 
>> If the gui is on the host, with today's stock phosphor-initfs, you
>need
>> 1) a connection from the host to the bmc
>>   ethernet, serial, usb ethernet etc
>>   (to copy files from host to BMC RAM and to monitor command
>output)
>
>Precisely.  USB would be an interesting control channel, but I don't
>think OpenBMC currently supports this kind of access?
>

Actually the current usb-ctrl script has an option to configure the
ecm gadget, and there are patches to update the script to use 
defined mac addresses.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/gadget/Kconfig?h=v5.12#n281
https://gerrit.openbmc-project.xyz/c/openbmc/phosphor-misc/+/42280/

>> 2) hardware ability to reboot bmc with host surviving
>> - all userspace has to be replaced with those on the filesystem in
>> RAM
>> - can be shortened slightly by preloading image in BMC before
>> shutting
>>   down services if the current kernel is compatible.  This can be
>> the
>>   old or new image.
>> 
>> - or -
>> 
>> Boot the host for GUI support with the BMC in an optimized
>> update mode.
>> 
>>  This can be before or after the file is downloaded to the
>>  host.
>> 
>> 
>> 3) Once the bmc is running from a squashfs in RAM (and if you want
>> to clean the rwfs overlay, persist on clean reboot/shutdown mode),
>> 
>> - copy the image to the bmc
>> - validate as required (preferably somewhere under /run)
>> - move image-rofs, kernel, etc as needed to /run/initramfs
>> - /run/initramfs/update
>>    (which checks the fs is not obviously mounted,
>>     runs flashcp, which has status on stdout,
>>     moves files successfully written,
>>     and then writes selected overlay content back to rwfs)
>> - check the images were all written
>> - reboot
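
The staging and check steps in that list can be sketched as below.
This is only an outline under stated assumptions: that images keep
their image-* names when moved into /run/initramfs, and that an
image which is still present in that directory after the update
script returns was not written (my reading of "moves files
successfully written").  The function name and the injectable `run`
argument are made up for the example.

```python
import shutil
import subprocess
from pathlib import Path


def stage_and_flash(images, staging=Path("/run/initramfs"),
                    run=subprocess.run):
    """Move validated images into staging, run update, report leftovers.

    Returns the names of staged images still present afterwards,
    which under the assumption above were not flashed.
    """
    staged = []
    for image in images:
        dest = Path(staging) / Path(image).name
        shutil.move(str(image), str(dest))
        staged.append(dest)
    # The update script prints flashcp progress on stdout.
    run([str(Path(staging) / "update")], check=True)
    return [p.name for p in staged if p.exists()]
```

A caller would only reboot when the returned list is empty, and
would otherwise show the leftover names to the user.
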
>> 
>> =================
>> Option Three:
>> This might be a better experience but needs some software work
>> to enable kexec on the 2500.
>> 
>> 
>> Transfer the FS and kernel to the BMC RAM, and kexec the kernel
>> (note patches on the list for 2600 need to test and maybe a bit of
>> coding for the 2500).  Optionally this can contain the virt pnor
>> image too.  After the BMC boots from the system in RAM boot the
>> host from vpnor image in RAM then use the host to drive the GUI
>> to acknowledge and initiate the flash as desired.
>> 
>> The hooks are in phosphor-initfs to flash the image after the
>> host is up, and to boot with the image in RAM.
>> 
>> As an alternative to kexec, if the new file system supports the
>> old BMC kernel then the shutdown script can easily be edited to
>> restart the exec script with the images in /run.  Alternatively
>> if the new kernel supports the old user space then it can be
>> flashed first, then on the next boot the prior case applies as
>> it is the updated kernel.  Note: I did this flow several times
>> in development but decided not to put code in the shutdown
>> script because it's a script that is executed from /run/initramfs
>> and can easily be edited there when alternative flow is required.
>> (there are comments that show where to edit).
>> 
>> 
>>>>> == BMC boot time ==
>>>>> 
>>>>> This is self explanatory.  Other vendors' solutions allow the
>host
>>>>> to be powered
>>>>> on within seconds of power application from the wall, and even
>>>>> our
>>>>> own Kestrel
>>>>> soft BMC allows the host to begin booting less than 10 seconds
>>>>> after power is
>>>>> applied.  Several *minutes* for OpenBMC to reach a point where
>>>>> it
>>>>> can even
>>>>> start to boot the host is a major issue outside of datacenter
>>>>> applications.
>>>> 
>>>> Some of this is, to me, an artifact of the Power architecture and
>>>> not an
>>>> artifact of OpenBMC explicitly.  On x86 systems we have a little
>>>> code in
>>>> u-boot that wiggles a GPIO and gets the Host power sequence going
>>>> while
>>>> the BMC is booting up.  This overlaps quite a bit of the memory
>>>> testing
>>>> of the Host with the BMC boot time.  The "well-known proprietary
>>>> BMC"
>>>> also does this same trick.
>>>
>>>I think we're talking about two different well-known proprietary
>>>BMCs,
>>>but that's not important for this discussion other than no, the one
>>>I
>>>have in mind doesn't resort to such tricks.  What it does do is
>>>start
>>>up its core services rapidly enough where this isn't a problem, and
>>>lets the rest of the BMC stack start up at its own pace later on.
>>> 
>>>> Power requires the BMC to be up in order to serve out the virtual
>>>> PNOR,
>>>> from my recollection.  It seems like this could be solved in
>>>> other
>>>> ways,
>>>> such as a SPI-mux on a physical SPI-NOR so that the BMC can take
>>>> the NOR
>>>> at specific times during update but otherwise it is given to the
>>>> host
>>>> CPUs.  This is exactly what we do on x86 systems.
>>>
>>>Ouch.  So on x86 boxen you might actually have two "BMCs" -- the
>>>proprietary one inside the CPU that starts in seconds and provides
>>>base services like SPI Flash mapping to CPU address space, and the
>>>external OpenBMC one that can run in parallel without interfering
>>>with host start.  Adding a mux is then a hack needed on top, since
>>>you can't really communicate with the proprietary stack in the
>>>required manner.
>>>
>> 
>> I'd say their cpu doesn't require the bmc to boot, it also means
>> they trust their system to not melt without bmc monitoring.
>
>I'd argue it's really a bit of semantics. :)  x86 systems have a sort
>of proto-BMC built right in to every single CPU, in the form of the
>ME/PSP and its associated firmware, that can provide various
>functions including (IIRC) thermal control.  On the ARM side, you're
>probably right, they're a bit more primitive in terms of just mapping
>Flash directly to the CPU address space on low end parts, though I
>think (?) the modern higher end parts are back to a sort of "security
>manager" BMC-analogue providing these basic services to the host CPU.
>
>Regardless, POWER does stick out like a sore thumb for shoving these
>low level functions into the high level "full-stack" BMC.
>Architecturally, it may not have been the best decision, but I do
>understand it sped time to market etc.   Fortunately, it's also
>something we can work to fix.

Hostboot can probably boot a decent way up with just a readonly
mapping of the flash.  Either copy the image to ram or just the
ioctl to the flash chip if the pnor flash is full image.

>
>>>For systems like POWER that lack the proprietary internal "BMC", I
>>>guess there are a few ways we could address the problem:
>>>
>>>1.) Speed up OpenBMC load -- this sounds like it would end up being
>>>completely supported by one or two vendors alone, and subject to
>>>breakage from the other vendors that simply don't have any concerns
>>>around OpenBMC start time since their platforms aren't visibly
>>>affected by it.  It's also unlikely to come into the desired
>>>sub-10s
>>>range.
>>>
>>>2.) Split the BMC into "essential" and "nice to have" services,
>>>much
>>>like the other platforms.  Painful, as it now requires even more
>>>parts on the mainboard.
>>>
>>>3.) Keep the single BMC device, but split it into two software
>>>stacks, one that can load nearly instantly and start providing
>>>essential services, and another that can load more slowly.  This
>>>would effectively require two separate CPUs inside the BMC, which
>>>we
>>>actually do have in the AST2500.  I haven't done any digging though
>>>to see if the second CPU is powerful enough to implement the HIOMAP
>>>protocol at speed.
>>>
>>>> Having said all of that, there is certainly some performance
>>>> improvements that can be done, but nobody has taken up the torch
>>>> on
>>>> it.
>>>> A big low-hanging fruit in my mind is the file system compression
>>>> being
>>>> xz or gzip is very computationally intensive.  I did some work,
>>>> with
>>>> Nick Terrell, to switch to zstd on our systems for both the
>>>> kernel
>>>> initramfs and UBI and saw significant boot time improvements.
>>>> The
>>>> upstream enablement for this appears to have landed as of v5.9 so
>>>> we
>>>> could certainly start enabling it here now.
>>>> 
>>>>
>>>> [link removed by mail filter: ...linux-kbuild/20200730190841.2071656-7-nickrterrell@gmail.com/]
>>>> 
>> 
>> In addition to compression options there are tradeoffs on how much
>> is
>> copied to ram vs how much is read from the flash possibly
>> repeatedly.
>> If you add secure boot the time goes up.
>
>Yeah, I'm really coming around to the idea that we need to embrace
>the split architecture every other system uses.  The LPC bridge and
>base power / fan controls really should be running independently on
>the ColdFire core, not on the main "full stack" BMC ARM core, and
>even for Kestrel we're exploring something similar (though in that
>case, it's mainly so that the host doesn't die if we accidentally
>crash the main CPU).

Have you looked at starting hiomap early, and telling hostboot to
assume the whole image is there until you need to write?

Can you get by with a fixed fan speed through memory POST, where
hostboot is running a single core?

>
>>>>> == Host boot status indications ==
>>>>> 
>>>>> Any ODM that makes server products has had to deal with the
>>>>> psychological "dead
>>>>> server effect", where lack of visible progress during boot
>>>>> causes
>>>>> spurious
>>>>> callouts / RMAs.  It's even worse on desktop, especially if
>>>>> server-type
>>>>> hardware is used inside the machine.  We've worked around this a
>>>>> few times with
>>>>> our "IPL observer" services, and really do need this
>>>>> functionality
>>>>> in OpenBMC.
>>>>> The current version we have is both front panel lights and a
>>>>> progress bar on
>>>>> the BMC boot monitor (VGA/HDMI), and this is something we're
>>>>> willing to
>>>>> contribute upstream.
>>>> 
>>>> Great!  Let's get that merged!
>>>
>>>Sounds good!  The files aren't too complex:
>>>
>>>[link removed by mail filter: .../git/blackbird-skeleton/tree/pyiplobserver]
>>>[link removed by mail filter: .../git/blackbird-skeleton/tree/pyiplledmonitor]
>>>
>>>Is the skeleton repository the best place for a merge request?
>> 
>> hmm, as prototype code in python, maybe.   I don't think many
>> current
>> systems ship python.  Also upstream Yocto removed all support for
>> python 2.
>> 
>> In addition I see a mix of "copy the data" and "transform the data"
>> in the same script, such as
>> 
>> updateIPLLeds(self, initial_start, status_changed)
>> 
>> with
>>            # Show major ISTEP on LED bank
>>            # On Talos we only have three LEDs plus a fourth
>> indicator modification
>>            # bit, but the major ISTEPs range from 2 to 21
>>            # Try to condense that down to something more readily
>> displayable
>> 
>> 
>> [ After some thought, it's ok to be in the output code, as it's
>> formatting the data for the display. ]
>> 
>> 
>> The upstream post interface logs the post codes, and display is
>> a separate function.  The ipl_status_monitor seems to mix
>> monitoring
>> the port 80 snoops with other logic to determine the system state
>> (e.g. is the host up?).
>> 
>> Also both scripts extensively use popen to handle device
>> communication
>> and some communication to other services (kill to post code).
>> 
>> 
>>>
>>>> I do think some others have support for a 7-seg display with the
>>>> postcodes going to it already.  I think this is along those same
>>>> lines.
>>>> It might just be another back-end for our existing post code
>>>> daemon
>>>> to
>>>> replicate them to the VGA and/or blink morse code on an LED.
>>>
>>>OK, so this is what we ran into before.  Where is this support
>>>in-tree, and do we need to reimplement our system to match what
>>>already exists (by extension, extending the other vendor code since
>>>our observer is more detailed in terms of status etc.), or would we
>>>be allowed to provide a competing solution to this other support,
>>>letting ODMs pick which one they wanted?
>>>
>> 
>> Our upstream code is at
>> [link removed by mail filter: ...mc/phosphor-host-postd]
>> for the snoop readers and the LED segment drivers, and the history
>> and Dbus owner is
>> [link removed by mail filter: ...mc/phosphor-post-code-manager].
>> 
>> To catalog the source of the host and bmc there is
>>
>> [link removed by mail filter: ...mc/phosphor-state-manager/blob/master/obmcutil]
>> 
>> In addition to phosphor-misc for "one file projects" there is
>> openbmc-tools for handy tools which may be more developer focused.
>
>So it sounds like we'd need to rewrite this as a set of patches for
>phosphor-post-code-manager?  Would they actually be merged or would
>we run into resistance to extending the functionality of that system
>for our use case?

Actually I think the manager would stay, and you might be adding
an application similar to the 7-segment LED driver in
phosphor-host-postd to take the data snooped from port 80 and
format it for your display.
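
To make the "format it for your display" step concrete, here is a
small sketch of the condensing problem described in the quoted
script comment: major ISTEPs from 2 to 21 have to fit on three LEDs
plus one modifier bit.  The linear scaling used below is purely an
assumption for illustration; it is not the mapping pyiplobserver
uses, and the function name is invented.

```python
ISTEP_MIN, ISTEP_MAX = 2, 21  # range of major ISTEPs from the comment


def condense_istep(major):
    """Map a major ISTEP onto (three LED states, modifier bit).

    The 20 possible steps are scaled onto the 16 values that four
    bits can show, so some neighbouring steps share a pattern.
    """
    if not ISTEP_MIN <= major <= ISTEP_MAX:
        raise ValueError("major ISTEP out of range: %r" % (major,))
    span = ISTEP_MAX - ISTEP_MIN + 1           # 20 distinct steps
    level = (major - ISTEP_MIN) * 16 // span   # 0..15
    leds = tuple(bool((level >> bit) & 1) for bit in (2, 1, 0))
    modifier = bool(level & 0b1000)
    return leds, modifier
```

A display backend of this kind would only consume the code it is
handed and drive the outputs, leaving the snooping and history to
the existing daemons.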

>
>>>>> == IPMI / BMC permissions ==
>>>>> 
>>>>> An item that's come up recently is that, at least on our older
>>>>> OpenBMC versions,
>>>>> there's a complete disconnect between the BMC's shell user
>>>>> database and the
>>>>> IPMI user database.
>> 
>> Mostly true, in part because the IPMI password for RMCP+ must be
>> stored on the BMC (reversibly encrypted for our implementation).
>> Note improper storage of this was an area of one or more CVEs.
>> 
>> In addition it has a limit of 20 characters in a password and 8
>> users.
>> 
>>>>> Resetting the BMC root password isn't possible from IPMI
>>>>> on the host, and setting up IPMI doesn't seem possible from the
>>>>> BMC shell.  If
>> 
>> In our current code we have pam hooks that save the password
>> during a change, if the user is in the ipmi group and the
>> password is short enough (or returns an error).
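
The decision that hook makes can be written down in a few lines.
This is a sketch of the rule as described above, not the actual PAM
module: the function name and the three return values are invented,
and counting the limit in characters rather than bytes is an
assumption.

```python
IPMI_MAX_PASSWORD_LEN = 20  # limit mentioned above


def ipmi_password_action(groups, password):
    """Decide what a password change should do for the IPMI store.

    "skip"   - user is not in the ipmi group, nothing to save
    "reject" - user is in the ipmi group but the password is too long
    "store"  - save the password for IPMI use
    """
    if "ipmi" not in groups:
        return "skip"
    if len(password) > IPMI_MAX_PASSWORD_LEN:
        return "reject"
    return "store"
```

For example, a 21-character password for a user in the ipmi group
yields "reject", which corresponds to the password change returning
an error.
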
>> 
>>>>> IPMI support is something OpenBMC provides alongside Redfish, it
>>>>> needs to be
>>>>> better integrated -- we're dealing with multiple locked-out BMC
>>>>> issues at the
>>>>> moment at various customer sites, and the recovery method is
>>>>> painful at best
>>>>> when it should be as simple as an ipmitool command from the host
>>>>> terminal.
>>>> 
>>>> I suspect most of this is a matter of IPMI command support and/or
>>>> enabling
>>>> those commands to the host IPMI path.  Most of us are fairly
>>>> untrusting
>>>> of IPMI (and the Host itself), so there hasn't been work to do
>>>> anything
>>>> here.  As long as whatever you're proposing can be disabled for
>>>> models
>>>> where we distrust the Host, it seems like these would be accepted
>>>> as
>>>> well.
>> 
>> 
>> Our current Redfish has multiple users and can enable and
>> disable users to have ipmi access and set their password.
>> 
>> 
>> Of course this just moves the goal posts to the Redfish
>> admin login, but in addition to mTLS certificate based
>> trust (which should be customized to the customer),
>> 
>> Redfish has the concept of a host firmware and os logins
>> including a binding for EFI to specify adapter path and
>> network in addition to read-once magic efi variables.  I
>> know OpenPOWER boxes don't have EFI but the information
>> could be exposed in a similar fashion.  As far as I know
>> we have not yet implemented these users in our Redfish
>> server.
>
>Honestly Redfish is something that we might just want to move to, and
>officially / formally drop network IPMI support.  Probably the
>biggest issue with that comes right back down to needing
>communication between the host and BMC, however -- ipmitool shortcuts
>the whole BMC/host network isolation problem (described above) by
>using the USB interface.  Is there a way to use Redfish over USB in a
>similar manner?

DEPRECATED ===== skip this for below

As I mentioned, the Redfish specification explicitly
talks about having a login for the firmware and the
booted OS, and requirements for the admin to allow
or disallow the IDs.  In addition it talks about how
the information is presented to an EFI boot.  The model
generates a unique password for each boot using special
EFI variables, including designation of the network path
(with the concept of a USB network or PCI slot and
function), IP information, etc.  It uses special
read-once EFI variables to protect the password from
casual snooping.

I don't think we (OpenBMC) have implemented this
magic user, but I would anticipate that it would be
accepted.

Also, for OpenPOWER we would likely want to define
an OF binding.   Thinking about this, due to the
desire to clear after fetch, something like the
sysparams API that can request a value optionally
from the service processor might be appropriate,
even though that is currently an FSP-only interface.

https://github.com/open-power/skiboot/blob/master/doc/device-tree/ibm%2Copal/sysparams.rst
https://github.com/open-power/skiboot/blob/master/doc/opal-api/opal-param-89-90.rst

Another alternative would be the secvar interface 
if that could be common with userspace expecting 
the efi variables, but that would have to be 
multiplexed with the current secvar backend for 
secure boot management. 

DEPRECATED === END ===

The Redfish spec was updated to have an IPMI
command to create a Bootstrapping credential 
that can then be used until disabled and will 
be invalidated by a Host Reset or Service 
Reset.   The expectation is this temporary 
role will be used to create a permanent 
account.  This service is only available on a 
designated interface and can be disabled or 
enabled from the Redfish HostInterface 
representation.

I believe this too would have to be
implemented, but exposing the information
to an openfirmware client is much easier 
as it could be a few properties in the 
device tree path.  USB networks are 
identified by vendor, device, and serial
and the ecm device serial is generated
using the bmc machine-id.

=====

For in-box communication, IPMI is being replaced in future
stacks by PLDM and MCTP, as Redfish is expecting a reliable
transport and is string based so quite verbose.

https://github.com/openbmc/docs/blob/master/designs/pldm-stack.md
https://github.com/openbmc/docs/blob/master/designs/mctp/

>
>Thanks!
>
>