BMCWeb policy for HTTPS site identity certificate

Wed Jul 29 12:31:31 AEST 2020

On Tue, Jul 28, 2020 at 10:03 AM Michael Richardson <mcr at sandelman.ca> wrote:
>
>
> Ed Tanous <ed at tanous.net> wrote:
>     >> "certificate is missing" is pretty much unambiguous.
>
>     > Unfortunately, this ambiguity comes with the territory.  On first
>     > boot, bmcweb has no certificate, and doesn't know the difference
>     > between "missing" and "was never there".  Regardless, to bring up TLS
>     > it needs _some_ certificate, so the original behavior was that it
>
> This is reasonable behaviour, but given that browsers are trying very hard to
> make the certificate exception box go away, this does not really help
> long-term in my opinion.

I'd still be very surprised if this ever happened to browsers in the
long term without any server-side control.  I get what their
motivation is, and I agree with it in principle, but without some
mechanism for initial embedded system provisioning, I don't know how
you completely disable that bypass.  With that said, I'm not a browser
developer, so if it does happen, we'll need to figure out another way
to handle initial boot and provisioning.  If you have a proposal for
how to handle this without self-signed certs, it would be an
interesting discussion to have.

>
> Missing means: "ENOENT", not "Can we use this certificate file for starting
> up an SSL Connect".

Today, that's not the definition bmcweb uses to determine whether to
generate a new cert.  I can't tell if you're proposing a different
behavior here, or making a statement about current behavior.  If the
file is present but corrupt, or inaccessible due to permissions, what
are you proposing the behavior should be?
Flash corruption does happen, and in that case we need a way to bring
up the (sometimes only) configuration interface so that it can be used
to replace the certs with valid ones, even if that's sub-optimal for
security.
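
To make the two definitions concrete, here is a minimal sketch of how
"missing" (file not found) could be told apart from "present but we
can't start a TLS context with it".  It is Python rather than bmcweb's
C++, the function name and the four result strings are made up for
illustration, and it is not what bmcweb does today.

```python
import ssl


def classify_cert_file(path):
    """Classify a combined PEM cert+key file (hypothetical helper).

    Returns "missing", "unreadable", "unusable", or "usable".
    """
    try:
        with open(path, "rb"):
            pass
    except FileNotFoundError:
        # The narrow definition: the file simply is not there.
        return "missing"
    except OSError:
        # Present but inaccessible, e.g. a permissions problem.
        return "unreadable"

    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        # The broad definition: can a TLS context be built from it?
        ctx.load_cert_chain(path)
    except OSError:
        # ssl.SSLError is a subclass of OSError; covers corrupt PEM data.
        return "unusable"
    return "usable"
```

With a split like this, policy could differ per case (for example,
generate on "missing" but do something more careful on "unusable").
Which policy is right is exactly the open question.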

>
>     >> "bad format" depends a bit upon evolution of libraries.
>
>     > Today this is defined as the above.  "Can we use this certificate file
>     > for starting up an SSL context?"  If the answer is no, we regenerate.
>     > In theory, the only library we rely on for this is OpenSSL, which I
>     > would hope doesn't have a backward incompatible evolution in this
>     > area.
>
> Yes, it does.
> For instance, you can't load 1024-bit RSA keys with 1.1.1.
> It refuses to start.

When I get a free second, I'll look up where we landed in the "should
we allow 1k keys" discussion we had a long time back.  I know we had
talked about disallowing them, and I think the conclusion was that we
disallow them at upload time.  With that said, maybe 2k keys will fail
to load in the future?

> Meanwhile, 1.0.x does not have any ECDSA support,

bmcweb has never targeted OpenSSL 1.0, and has always generated
self-signed EC keys, so this shouldn't be an issue in practice, but
your point about "could've broken us if" is well taken.

> and you won't find this out
> until the TLS session actually tries to start, at which point, it logs an
> obscure message to stderr, and returns an error that most programs don't know
> what to do with.
> (And the TCP connection just ends)

I could've sworn that EVP_PKEY_get1_RSA returns NULL if it's an EC key
(which is a call that bmcweb explicitly checks).  That call is one of
our "can we build an SSL context" checks today.  Maybe OpenSSL 1.0 is
different?  Regardless, it's really hard to talk about backward
compatibility with hypothetical openbmc + openssl 1.0 builds that to
my knowledge have never existed.  If this situation presents itself in
the future on another OpenSSL upgrade, I suspect that is the best time
to discuss it.

>
>     >> In particular, a new version of libssl might support some new algorithm, and
>     >> then should the firmware be rolled back, it will "bad format".
>
>     > In this hypothetical, you're thinking about a new, non x509
>     > certificate file format?  I vote let's cross that bridge when we get
>
> Nope, not about non-X.509.
> Algorithms and keysize changes.

Agreed, there are possible changes that could break us in the future
(if openssl stops accepting 2k keys for example).

>
>     > there, as it seems like there's a lot more discussion that would need
>     > to happen around upgrades and downgrades.  Today the assumption we
>     > make is that x509 certificate reading is backward and forward
>     > compatible since the beginning of openbmc, which, to my knowledge, it
>     > is.
>
> Until... it isn't.
> But, the proposal would have considered a certificate with an invalid date as
> being invalid, and generated a new one.

Yes, I do not believe the date or the cert chain should be part of the
definition of "valid".  "Can we use this certificate file for starting
up an SSL context?" answers yes even if the date and/or cert chain is
invalid, so I think the definition still works.
With that said, I think all of the above is covered by the general idea of
"upgrades are guaranteed, downgrades are best effort" that most BMC
implementations (including OpenBMC at this point) tend to take.  (yes,
sometimes we break the upgrade path and have to fix it).  I don't
think I've seen anywhere in the project where we include both a
forward and backward path for nonvolatile schema migrations.  Are you
proposing something different we should do to handle these types of
situations?

With all of this said, I'm open to the possibility that we have a
backward incompatible openssl change that invalidates the cert.  Do
you think you could code up a patch with the behavior you're hoping
for?  It might be easier to approach it from a patchset.

>
>     >> So I suggest that the certificate+keypair is never deleted, but may be renamed.
>     >> I think that we could have a debate about getting telemetry about bad
>     >> certificates back via HTTP.
>
>     > We can have a discussion, but I suspect a lot of people would be very
>     > against using unencrypted HTTP for this purpose.
>
> I agree.
> So, how do you get information at this point?
>

I'm not following.  At the point where you've downgraded, and your key
is no longer valid?  bmcweb will regenerate a self-signed one, and a
user can connect to the HTTPS port insecurely.  Hopefully their next
step is to set up a valid key again, but I don't know how to force
that on people.  Is there a better behavior you'd like?  Brick the
system until it's factory reset?
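
For reference, the "never deleted, but may be renamed" suggestion
quoted above could be as small as the sketch below.  It is Python
rather than bmcweb's C++, and the function name and the ".bad-<time>"
suffix are assumptions made up for illustration.  The idea is only
that the unusable file is moved aside before a new self-signed cert is
written, so a later upgrade or an admin can still recover it.

```python
import os
import time


def set_aside(path, now=None):
    """Rename an unusable cert file instead of deleting it.

    Hypothetical helper; returns the new file name.
    """
    stamp = int(time.time() if now is None else now)
    target = f"{path}.bad-{stamp}"
    n = 0
    # Never overwrite an earlier set-aside copy; add a counter instead.
    while os.path.exists(target):
        n += 1
        target = f"{path}.bad-{stamp}.{n}"
    os.rename(path, target)
    return target
```

This is not race-free (the existence check and the rename are separate
steps), and on a small flash partition something would have to cap how
many set-aside copies are kept.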