Aspeed SuperIO runtime management

Wed Oct 4 11:34:27 AEDT 2023

Hello all,

I was hoping to gather any thoughts in the community on how best to deal
with a problem I've encountered on my latest OpenBMC platform port (but
which I think might be relevant to some other systems as well).

For reasons I don't fully understand but that I think are orthogonal to
this particular issue, the platform in question can't use the Aspeed
VUART, and so instead uses two SUARTs configured back-to-back via the
UART mux to provide the host's serial console.  The host's firmware thus
enables its UART early in the host boot sequence, which requires that
the AST2500's built-in SuperIO device be enabled (SCU70[20]=0).
Unfortunately that exposes the BMC to some of the CVE-2019-6260
("pantsdown") vulnerabilities, which is a pretty big downside, and one
that I'd like to minimize as much as I can.
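
As a rough illustration of the enable condition above, here is a minimal
Python sketch that decodes the SuperIO state from a raw SCU70 value.  It
assumes the register has already been read by some other means; the
helper name and constant are hypothetical, and only the bit position and
polarity (SCU70[20]=0 means enabled) come from the description above.

```python
# SCU70[20] acts as a "disable" bit: 0 = SuperIO enabled, 1 = disabled.
SUPERIO_DISABLE_BIT = 20  # hypothetical constant name


def superio_enabled(scu70: int) -> bool:
    """Return True if the given raw SCU70 value has the SuperIO enabled."""
    return ((scu70 >> SUPERIO_DISABLE_BIT) & 1) == 0
```

How the register value is obtained or modified on a real system is
deliberately left out of this sketch.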

The SuperIO only really *needs* to be enabled during the window of time
in which the host firmware performs the UART-enable sequence; once it's
up and running I can manually disable it without any adverse effects.
So what I'd ideally like is to have the BMC enable and disable the
SuperIO at runtime, turning it on only when it's expected to be needed
and then turning it back off so as to minimize the exposure to known
security holes (while in general I wouldn't like the BMC to consider the
BIOS/UEFI code as "trusted", it's hopefully at least less actively
hostile than whatever might be running when the host OS is booted).

To that end, what I've currently got consists of:

 1. A kernel tweak (currently hacked onto the aspeed-socinfo driver) to
    expose the SuperIO enable/disable state as a read/write sysfs file,
    and

 2. A patch to x86-power-control using that file to enable the SuperIO
    when the host's POST-complete signal is deasserted (and disable it
    when it's asserted).
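
The userspace half of that arrangement boils down to something like the
following Python sketch.  This is not the actual x86-power-control patch
(which is not shown here); the function names are hypothetical, the
sysfs file path is passed in by the caller rather than assumed, and the
"1"/"0" file format is an assumption about how such a read/write
attribute might look.

```python
from pathlib import Path


def set_superio(sysfs_file: Path, enabled: bool) -> None:
    """Write "1" or "0" to the SuperIO enable file (format is assumed)."""
    sysfs_file.write_text("1\n" if enabled else "0\n")


def on_post_complete_change(sysfs_file: Path, post_complete: bool) -> None:
    """React to a change of the host's POST-complete signal.

    POST-complete deasserted: host firmware is (re)starting and will need
    the SuperIO to enable its UART, so turn the SuperIO on.
    POST-complete asserted: the UART is already set up, so turn it off.
    """
    set_superio(sysfs_file, enabled=not post_complete)
```

Note that nothing in this sketch synchronizes with the host; it only
reacts to the signal change after the fact.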

Aside from being a bit of a kludge (and a fairly special-purpose one at
that), the major drawback with this approach is that it is inherently
racy.  When the host resets and the POST-complete signal
deasserts, there's nothing synchronizing the BMC and the host to ensure
that the BMC does in fact enable the SuperIO before the host tries to
access it when it goes to enable the UART.  In the stress-testing I've
done (including swamping the BMC with artificial CPU & interrupt load) I
haven't ever seen it "lose" the race, but I don't have a terribly
accurate sense of how tight the window of time really is.

So what I'm wondering here is:

 1. Does anyone know of any better ways of handling this problem?

 2. If not and this is the best option we've got, are there better
    implementation options that might be palatable for potential
    upstreaming (more appropriate places to put the kernel side, a way
    to make the userspace side less of a hard-coded hack, etc.), or is
    this doomed by its nature to live out-of-tree?

Thanks,
Zev