[Skiboot] [RFC] docs: Specify V3 of the mbox protocol
Andrew Jeffery
andrew at aj.id.au
Thu Sep 14 11:57:07 AEST 2017
> > > +The host may lock an area of flash using the MARK_LOCKED command.
> > > Any attempt
> > > +to mark dirty or erased this area of flash must fail with the
> > > LOCKED_ERROR
> > > +response code. The host may open a write window which contains a
> > > locked area
> > > +of flash however changes to a locked area of flash must never be
> > > written back
> > > +to the backing data source (i.e. that area of flash must be
> > > treated as read
> > > +only). An attempt to lock an area of flash which is not clean in
> > > the current
> > > +window must fail with PARAM_ERROR. (V3)
> >
> > I think we need to consider how locked regions interact with the in-
> > memory
> > caching of the flash data.
> >
> > The BMC doesn't know that the memory backing the LPC window has been
> > written until it receives a MARK_DIRTY from the host. MARK_DIRTY is
> > not
> > valid on a read window or a locked region, but that doesn't mean that
> > changes to a locked region can't persist in memory due to caching to
> > avoid flash access (which was a part of the motivation for memory-
> > based
> > booting to begin with). This may naively fool host firmware into
> > thinking it is
> > accessing unmolested data as the region is locked, whereas the region
> > could
> > contain anything, just it will never be written to flash.
>
> Correct
>
> >
> > Should we require window requests intersecting a locked region always
> > have at
> > least the locked region positively match the on-flash data? That way
> > the
> > firmware that requested a region be locked could ensure it has
> > whatever was on
> > the flash at the time it was locked by explicitly closing the window
> > (read or
> > write) once finished with it, which requires a window be opened again
> > for any
> > subsequent access. At that point (open) we can do the locked region
> > intesection
> > calculation and check the data in the window as necessary.
>
> I guess this brings about an interesting question, when a window is
> requested is it guaranteed that the window provided in return by the
> BMC contains the same contents as the flash for the requested area?
I think expecting otherwise would be strange. What way would the host
have to validate that? It could keep its own hashes of window content,
but it would need to hash the content itself after opening the window
which may be a circular problem (e.g. host reboot if the daemon doesn't
invalidate the cache across the reboot operation).
> It
> would seem intuitively that it should. The first time a window is
> opened that is true, but with in memory caching the host could make a
> change that it never tells the bmc about and the contents diverge, and
> (assuming the window isn't evicted) persist across window close/open
> calls.
Yep.
>
> Is this the desired behaviour?
No, not in my opinion. It doesn't fit the principle of least surprise.
> It's kind of the only behaviour which
> makes the in-memory caching an effective performance improvement, if we
> reloaded every window every time then we may as well only have one
> window in memory anyway. Maybe we can do this a better way (see below).
>
> It's worth noting that the current flash locking daemon implementation
> will explicitly reload an entire window whenever one which contains any
> locked regions is requested - ensuring (for at least as long as the
> access to the flash device is exclusive) the integrity of the locked
> regions. This could of course be optimised to only reload the locked
> region.
Right, and if we take up the techniques discussed below we may even be
able to eliminate that.
>
> >
> > Locked regions could thus be a performance hit if we always load the
> > locked
> > regions from flash, but surely we prefer (some) integrity over
> > performance.
> >
> > Alternatively, you could cryptographically hash the per-block content
> > of the
> > on-flash locked region during the MARK_LOCKED operation, then stash
> > the hash
> > values away. On a CREATE_{READ,WRITE}_WINDOW operation you could hash
> > any
> > intersecting, in-cache locked blocks and compare to the stashed hash
> > values to
> > incur only one set of flash accesses (initial hash calculations)
> > rather than n.
>
> This seems like a nice balance between performance and data integrity.
>
> I guess this brings about the question again as to whether we care
> about data changes we aren't told about. Do we store a hash of every
> window and before providing a cached copy verify the hash and reload
> the entire window from flash again if there is a discrepancy. The host
> isn't allowed to rely on the persistence of data it never tells the BMC
> about, but is this something the BMC should be checking?
I think so, again in support of the principle of least surprise, and
the possible security implications. I proposed verifying the integrity
of only the locked regions but it might be worth extending that to
entire windows depending on the performance. We should do some
measurements.
I think implementations also need to be noisy in the case of a
discrepancy, logging on the BMC side and possibly issuing a BMC Event
(taking another one of our reserved bits for the purpose). The
modifications are pretty much limited to accidental scribbling (bugs),
or malicious intent (probably enabled by more bugs). Either way,
someone should be notified.
> > > +Command:
> > > > > > + GET_FLASH_NAME (V3)
> > > > > > + Arguments:
> > > > > > + Args 0: Flash ID
> > > > > > + Response:
> > > + Args 0-10: Flash Name / UID
> >
> > Hopefully 11 bytes is enough. Well, 10 I guess - should we specify
> > that the
> > value at index 10 be '\0'? I feel like not requiring that could lead
> > to
> > undesirable implementations.
>
> Depends on whether we require names be 11 bytes, be null terminated, or
> just contain trailing nulls for names shorter than 11 bytes.
>
> If we're going to waste an argument on the null terminator we may as
> well just make the first argument the length of the name.
I think that's a better idea generally. It doesn't assume C-string
semantics, though it's likely that the implementations will require
them, which was what I was looking out for.
>
> Is this something the protocol should care about. Maybe it's just up to
> the daemon and host implementation to put something here that they both
> understand.
Probably not, it was just a thought. It's a balance against avoiding
potential vulnerabilities and over-specifying behaviour.
>
> >
> > I guess we don't want to go down the path of continued responses or
> > magic
> > windows to accommodate larger names?
>
> I'd prefer not :)
>
Same :) Just throwing the question out there for discussion
> >
> > > > > > + Notes:
> > > > > > + Describes a flash with some kind of identifier
> > > useful to the
> > > > > > + host system. This is typically a null-padded
> > > string.
> > > +
> > > +Command:
> > > > > > + MARK_LOCKED (V3)
> > > > > > + Arguments:
> > > > > > + Args 0-1: Flash offset to lock (blocks)
> > > + Args 2-3: Number to lock at offset (blocks)
> >
> > I guess it's hard to get away from using block-sized granuality. Do
> > we
> > currently have any partitions that we might lock that are less than a
> > block
> > size? Or are all partitions expected to be block-size aligned?
>
> This was the motivation for allowing the host to request a block size,
> so that they can hopefully request something which allows locking at
> the required granularity. Currently all partitions are aligned to flash
> erase size (which is what we set block size to anyway).
Okay. It might be worth making a note in the documentation for
MBOX_GET_INFO saying as much.
>
> It's not really worth allowing locking at a finer granularity than
> block size since this is how changes (mark dirty) are specified and so
> it would be impossible to write back any part of a block if a single
> byte in that block were locked. I'm in favour of using block size for
> now and changing it in a later version if this becomes an issue.
Right, it's either blocks for everything or bytes for everything.
Consistency is good. I don't think we want to go through the effort of
changing that aspect of the spec right now.
Cheers,
Andrew
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: This is a digitally signed message part
URL: <http://lists.ozlabs.org/pipermail/skiboot/attachments/20170914/7e171cfe/attachment.sig>
More information about the Skiboot
mailing list