[RFC KERNEL] initoverlayfs - a scalable initial filesystem

Neal Gompa neal at gompa.dev
Tue Dec 12 01:17:04 AEDT 2023


On Mon, Dec 11, 2023 at 8:46 AM Eric Curtin <ecurtin at redhat.com> wrote:
>
> Hi All,
>
> We have recently been working on something called initoverlayfs, and
> we sent an RFC email about it to the systemd and dracut mailing lists
> to gather feedback. This is an exploratory email, as we are unsure
> whether a solution like this fits in userspace or kernelspace, and we
> would like to gather feedback from the community.
>
> To describe this briefly, the idea is to use erofs+overlayfs as an
> initial filesystem rather than an initramfs. The benefit is that we
> can start userspace significantly faster, as we do not have to unpack,
> decompress and populate a tmpfs upfront; instead we can rely on
> transparent decompression such as lz4hc. What we believe is the
> greater benefit is that we can have less fear of initial filesystem
> bloat: when you are using transparent decompression you only pay
> for decompressing the bytes you actually use.
>
> We implemented the first version of this by creating a small
> initramfs that only contains storage drivers, udev and a couple of
> hundred lines of C code, just enough userspace to mount an erofs with
> a transient overlay. Then we build a second initramfs, which has all
> the contents of a normal everyday initramfs with all the bells and
> whistles, and convert this into an erofs.
>
> Then at boot time you basically transition to this erofs+overlayfs in
> userspace and everything works as normal as it would in a traditional
> initramfs.
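
A minimal sketch of the mount sequence described above, to make the
flow concrete. It is not the actual storage-init code (which is C);
it only builds the equivalent mount(8)/switch_root(8) command lines.
The device path, directory names and the function name are all
assumptions made for illustration.

```python
import shlex


def initoverlayfs_plan(erofs_dev, lower="/initerofs", scratch="/overlay",
                       newroot="/initoverlayfs", init="/sbin/init"):
    """Return, in order, the argv lists a storage-init step might run.

    All paths are hypothetical defaults, not taken from the project.
    """
    # upperdir and workdir must live on the same filesystem (the tmpfs).
    upper = f"{scratch}/upper"
    work = f"{scratch}/work"
    overlay_opts = f"lowerdir={lower},upperdir={upper},workdir={work}"
    return [
        ["mkdir", "-p", lower, scratch, newroot],
        # Read-only erofs image: data is decompressed on access.
        ["mount", "-t", "erofs", "-o", "ro", erofs_dev, lower],
        # Transient writable layer, discarded when we leave the initial fs.
        ["mount", "-t", "tmpfs", "tmpfs", scratch],
        ["mkdir", "-p", upper, work],
        ["mount", "-t", "overlay", "-o", overlay_opts, "overlay", newroot],
        # Hand over to the init inside the erofs+overlayfs.
        ["switch_root", newroot, init],
    ]


if __name__ == "__main__":
    # "/dev/vda2" is a placeholder device, not a recommendation.
    for argv in initoverlayfs_plan("/dev/vda2"):
        print(shlex.join(argv))
```

The plan is returned as data rather than executed, so it can be
inspected without root privileges.
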
>
> The current implementation looks like this:
>
> ```
> From the filesystem perspective (roughly):
>
> fw -> bootloader -> kernel -> mini-initramfs -> initoverlayfs -> rootfs
>
> From the process perspective (roughly):
>
> fw -> bootloader -> kernel -> storage-init   -> init ----------------->
> ```
>
> But we have been asking the question whether we should be implementing
> this in kernelspace so it looks more like:
>
> ```
> From the filesystem perspective (roughly):
>
> fw -> bootloader -> kernel -> initoverlayfs -> rootfs
>
> From the process perspective (roughly):
>
> fw -> bootloader -> kernel -> init ----------------->
> ```
>
> The kind of questions we are asking are: Would it be possible to
> implement this in kernelspace so we could just mount the initial
> filesystem data as an erofs+overlayfs filesystem without unpacking,
> decompressing, copying the data to a tmpfs, etc.? Could we memmap the
> initramfs buffer and mount it like an erofs? What other considerations
> should be taken into account?
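
One small piece of the "mount the buffer like an erofs" question is
telling an erofs image apart from a cpio archive in the first place.
A rough userspace sketch of that check follows, assuming the buffer is
either an erofs image (superblock at offset 1024, magic 0xE0F5E1E2,
little-endian) or an uncompressed newc cpio archive. The function name
is made up, and a compressed cpio would simply come back as "unknown".
This says nothing about how the kernel would or should do it.

```python
import struct

EROFS_SUPER_OFFSET = 1024          # superblock starts 1 KiB into the image
EROFS_SUPER_MAGIC_V1 = 0xE0F5E1E2  # stored little-endian on disk
CPIO_NEWC_MAGICS = (b"070701", b"070702")  # newc, and newc with CRC


def classify_initial_fs(buf):
    """Guess whether a bytes-like buffer holds an erofs image or a cpio."""
    if len(buf) >= EROFS_SUPER_OFFSET + 4:
        (magic,) = struct.unpack_from("<I", buf, EROFS_SUPER_OFFSET)
        if magic == EROFS_SUPER_MAGIC_V1:
            return "erofs"
    if bytes(buf[:6]) in CPIO_NEWC_MAGICS:
        return "cpio"
    return "unknown"
```
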
>
> Echoing Lennart, we must also "keep in mind from the beginning how
> authentication of every component of your process shall work", as
> that's essential to a couple of different Linux distributions today.
>
> We kept this email short because we want people to read it and to
> avoid duplicating information from elsewhere. The effort is described
> from different perspectives in the systemd/dracut RFC email and the
> GitHub README.md. If you'd like to learn more, it's worth reading the
> discussion on the systemd mailing list:
>
> https://marc.info/?l=systemd-devel&m=170214639006704&w=2
>
> https://github.com/containers/initoverlayfs/blob/main/README.md
>
> We also received feedback informally in the community that it would be
> nice if we could optionally use btrfs as an alternative.
>
> Is mise le meas/Regards,
>
> Eric Curtin
>

Adding linux-btrfs@ to the discussion, because I think it'd be useful
to include them in working out what handling btrfs as an alternative
to erofs+overlayfs would look like.



--
真実はいつも一つ!/ Always, there's only one truth!

