New design for file system layout and code update

Joel Stanley joel at jms.id.au
Mon Aug 21 21:22:04 AEST 2017


Hi Milton,

Thanks for the write-up. We weren't yet looking for something this
comprehensive, but I appreciate the attention to detail. Before we get too
far ahead though, I have a few questions about the higher-level design
(quotes are from your doc):

Firstly, a clarification about the kernel: your doc mentions that
it'll be present (as the raw kernel binary) in a UBI volume, but
in the discussion today you mentioned that it will be in a raw flash
area. Can you clear that up?

> All distribution images (except Das U-boot and its environment)
> will be stored in separate ubi volumes named by their image type.
> A hash of version identifiers will be generated during deployment
> to make unique image names.
>
> The kernel and its device tree will be stored in a fit image
> in a ubi volume.

Am I correct in assuming that this allows us to be flexible in the
partition sizing? ie, we can perform single-volume updates at runtime
that may be larger than the space originally allocated for that volume?

[ie, with the partitioned mtd scheme that we have now, increasing
a partition beyond its original size isn't possible without
having to shuffle the data around]

If so, this would be a fairly significant reason for adopting the
UBI-volume-based approach.
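
To make the sizing question concrete, here is a rough sketch in Python. It
is only a model of the argument, not of the actual layout: the volume names
and all sizes are made up, and UBI's own overhead is reduced to a single
hypothetical `reserved` figure.

```python
# Illustrative only: names and sizes below are assumptions, not from the doc.

KIB = 1024


def fits_fixed_partitions(partitions, updates):
    """Fixed mtd partitions: every image must fit its own partition."""
    return all(updates[name] <= size for name, size in partitions.items())


def fits_ubi_pool(device_size, updates, reserved=0):
    """UBI volumes share one pool: only the total has to fit."""
    return sum(updates.values()) <= device_size - reserved


# Hypothetical current layout.
partitions = {"kernel": 4096 * KIB, "rofs": 20480 * KIB, "rwfs": 4096 * KIB}
device_size = sum(partitions.values())

# Hypothetical update: the kernel grows by 1 MiB, the rofs shrinks by 2 MiB.
updates = {"kernel": 5120 * KIB, "rofs": 18432 * KIB, "rwfs": 4096 * KIB}

print("fixed partitions:", fits_fixed_partitions(partitions, updates))
print("ubi volumes:     ", fits_ubi_pool(device_size, updates, reserved=512 * KIB))
```

In this made-up case the update is rejected under fixed partitions because
the kernel outgrows its slot, but fits when the same flash is treated as one
pool of space shared by the volumes.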

> To support a total flash chip failure, each flash chip will
> contain an independent ubi device.  The mtd_concat driver will not
> be used to form an ubi device that spans flash chip boundaries.
>
> Both the primary chip (mtd label bmc) and alternate chip (mtd
> label alt) will contain a complete copy of u-boot, and a redundant
> environment, and kernel images.  For space reasons root squashfs
> volumes may be on a different ubi device than the kernel.

Do we really need this though? It seems that the functionality that the
hardware provides here (flipping to a completely independent backup flash)
gives us a simple, fail-safe mechanism for disaster recovery.

Using some components from "primary" and some from "secondary" seems
like an invitation for incompatibility issues.

If we do continue with this approach, how will it interact with the
auto-fallback via chip-select?

> Since all binaries will exist in the root file system, systemd can
> be started directly from the squashfs without an intermediate
> initramfs.  Eliminating the initramfs will remove the requirement
> to build and store it, however, it also requires the bootloader
> to specify how to locate the root filesystem.

So we'd also be moving from the current initramfs-based flashing
mechanism to something that requires the rootfs to be working, right?
That is a bit concerning, as it means that more infrastructure
needs to be working & correct to be able to boot *anything*,
including recovery. Or have I read the design incorrectly here?

How about something like:

 - completely independent primary & backup images, and use the
   BMC's watchdog to revert to the backup on severe boot problems

 - kernel and initramfs as (raw) UBI volumes, with u-boot having
   support to boot from those

 - rootfs (ro) + /var (rw) [plus whatever else is required] as
   UBI volumes, likely with squashfs for ro and jffs2 for rw.

That means:

 - we still have a fairly simple path to boot to initramfs, which
   allows for system recovery

 - kernel, initramfs and filesystem sizes are not fixed, and can be
   modified during firmware update
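
As a rough illustration of the boot flow I have in mind, here is a small
Python model. It is a sketch of the proposal only: the dictionary keys and
function names are hypothetical, and it does not describe how u-boot or the
watchdog hardware actually behave.

```python
from enum import Enum


class Outcome(Enum):
    NORMAL = "booted rootfs"
    RECOVERY = "stayed in initramfs for recovery"
    FAILED = "no bootable image"


def boot_chip(chip):
    """Boot one self-contained image: kernel + initramfs first, then rootfs."""
    if not (chip["kernel_ok"] and chip["initramfs_ok"]):
        return Outcome.FAILED      # nothing runs; the watchdog would expire
    if not chip["rootfs_ok"]:
        return Outcome.RECOVERY    # initramfs is up, so a reflash is possible
    return Outcome.NORMAL


def boot(primary, backup):
    """Try the primary chip; on a severe failure fall back to the backup."""
    for name, chip in (("primary", primary), ("backup", backup)):
        outcome = boot_chip(chip)
        if outcome is not Outcome.FAILED:
            return name, outcome
    return None, Outcome.FAILED


good = {"kernel_ok": True, "initramfs_ok": True, "rootfs_ok": True}
bad_rootfs = dict(good, rootfs_ok=False)
bad_kernel = dict(good, kernel_ok=False)

print(boot(bad_rootfs, good))   # primary still reaches the initramfs
print(boot(bad_kernel, good))   # primary cannot boot at all
```

The point of the model is that a broken rootfs still leaves a working
initramfs on the same chip, and only a failure of the kernel or initramfs
itself needs the switch to the backup image.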

How does that sound?

Cheers,

Joel

