[RFC] fs-verity and encryption for EROFS

Gao Xiang hsiangkao at linux.alibaba.com
Wed Dec 21 17:41:40 AEDT 2022


Hi folks,

(As Eric suggested, I'm posting this on the list now.)

In order to outline what we could do next to benefit various image-based
distribution use cases (especially for signed+verified images and
confidential computing), I'd like to discuss two potential new
features for EROFS: verification and encryption.

- Verification

As we all know, dm-verity is currently the main way to protect image
integrity on read-only devices.  However, for an image-based system
with lots of shared blobs (whether device-based or file-based), IMHO
it'd be better to have an in-band approach (rather than an out-of-band
device-mapper one) to verify such blobs.

In particular, currently in container image use cases, an EROFS image
can consist of

  - one meta blob for metadata and filesystem tree;

  - several shared data blobs holding chunk-based de-duplicated data
    (organized in layers for incremental updates, or in some other way
    such as one blob per file)

Currently the number of data blobs can vary from (typically) a dozen to
(in principle) 2^16 - 1.  It is much harder for a dm-verity setup to
cover such usage, yet that distribution form is becoming more and more
common with the containerization revolution.

Also, since we have the EROFS over fscache infrastructure, file-based
distribution makes dm-verity almost impossible as well.  Generally we
could enable fs-verity on the underlying fs, I think, but with on-demand
lazy pulling from a remote, such data may be incomplete until it is
fully downloaded.  (I think that is also much like what Google did with
fs-verity for incfs.)  In addition, IMO it's not good to rely on the
features of an arbitrary underlying fs, with a tree generated by an
arbitrary hashing algorithm and no original signing (by the image
creator).

My preliminary thought on verification for EROFS is to have blob-based
(or device-based) Merkle trees while keeping such image integrity
self-contained, so that Android, embedded, system rootfs, and container
use cases can all benefit from it.
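
To make the blob-based Merkle tree idea a bit more concrete, here is a
minimal sketch in Python.  It is only an illustration: the block size,
the choice of SHA-256, the zero padding, and the names merkle_root() and
image_roots() are all assumptions of this sketch, and it does not follow
the actual fs-verity on-disk format.

```python
import hashlib

BLOCK_SIZE = 4096   # assumed data/tree block size
DIGEST_SIZE = 32    # SHA-256 digest length


def _hash_block(block: bytes) -> bytes:
    # Zero-pad a short (final) block so every hashed unit is BLOCK_SIZE long.
    return hashlib.sha256(block.ljust(BLOCK_SIZE, b"\0")).digest()


def merkle_root(blob: bytes) -> bytes:
    """Return the root hash of a simple Merkle tree built over one blob."""
    # Level 0: one digest per data block (an empty blob hashes one zero block).
    level = [
        _hash_block(blob[off:off + BLOCK_SIZE])
        for off in range(0, max(len(blob), 1), BLOCK_SIZE)
    ]
    # Upper levels: pack digests into tree blocks and hash those blocks
    # until a single digest (the root) remains.
    per_block = BLOCK_SIZE // DIGEST_SIZE
    while len(level) > 1:
        level = [
            _hash_block(b"".join(level[i:i + per_block]))
            for i in range(0, len(level), per_block)
        ]
    return level[0]


def image_roots(blobs: dict) -> dict:
    """One root per blob (meta blob and each data blob), keyed by blob name."""
    return {name: merkle_root(data) for name, data in blobs.items()}
```

With one root per blob, an image creator could sign the set of roots
once, and each shared blob could then be checked independently of the
others.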

Also, as a self-contained verification approach like those in other
Linux filesystems, it makes it easy for bootloaders and standalone EROFS
image unpackers to support/check image integrity and signing.

It seems the current fs-verity codebase could fit this well with some
minor modifications.  If possible, we could go further in this
direction.

- Encryption

I also have some rough preliminary thoughts on EROFS encryption
(although they are not as detailed as those on verification).  Currently
we have full-disk encryption and file-based encryption.  However, in
order to allow finer-grained data sharing between encrypted data
(finer-grained de-duplication seems hard with file-based encryption), we
could also consider a modified form of convergent encryption, especially
for image-based offline data.

In order to prevent dictionary attacks, the key itself may not be
derived directly from a hash of its data; instead, we could assign a
random key to each specific piece of data (an encrypted chunk) and find
a way to share these keys and data within a trusted domain.
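
As a rough illustration of that idea, the Python sketch below assigns a
random key per distinct chunk and keeps the keys in a store that only
the trusted domain can query.  Everything here is an assumption of this
sketch: the TrustedKeyStore class, indexing keys by chunk digest, and
the XOR keystream, which is only a toy stand-in for a real cipher and
must not be used for actual encryption.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    # Toy stand-in cipher (NOT secure): XOR data with a SHAKE-256 keystream.
    # It is deterministic per key, which is what de-duplication relies on.
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


class TrustedKeyStore:
    """Hypothetical key service living inside the trusted domain."""

    def __init__(self):
        self._keys = {}  # chunk digest -> random per-chunk key

    def key_for(self, chunk: bytes) -> bytes:
        # The key is random rather than derived from the chunk hash, so
        # guessing the plaintext alone does not reveal the key.
        digest = hashlib.sha256(chunk).digest()
        if digest not in self._keys:
            self._keys[digest] = secrets.token_bytes(32)
        return self._keys[digest]


def encrypt_chunk(store: TrustedKeyStore, chunk: bytes) -> bytes:
    # Identical chunks get the same key, hence identical ciphertext.
    return _keystream_xor(store.key_for(chunk), chunk)


def decrypt_chunk(key: bytes, ciphertext: bytes) -> bytes:
    # XOR with the same keystream restores the plaintext.
    return _keystream_xor(key, ciphertext)
```

Within one trusted domain, equal chunks encrypt to equal ciphertext and
can still be de-duplicated, while a party outside the domain cannot
rebuild the key from a guessed plaintext as it could with plain
convergent encryption.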

A similar idea was also shown in the presentation on the AWS Lambda
sparse filesystem, although it doesn't give many internal details:
https://youtu.be/FTwsMYXWGB0

Anyway, for encryption, it's just a preliminary thought, but we'd be
happy to have a better encryption solution for data sharing for
confidential container images.

Thanks,
Gao Xiang

