Feature request: erofs-utils mkfs: Efficient way to pipe only file metadata

Mike Baynton mike at mbaynton.com
Tue Feb 20 14:15:35 AEDT 2024


Hello Gao,
Thanks for your quick reply and thoughts on the matter. Yeah, formats
like those you referenced look like basically the right idea as well.

I did just want to point out a few requirements I have that I'd wager
would apply to anyone trying to use EROFS in the way I am. That is, to
use EROFS as a layer of an overlay filesystem that includes inodes for
each and every file required by an application or container, but that
uses xattrs interpreted by overlayfs to point to specific files
containing the desired data.

So the requirements are:
1. The format needs to support xattrs in the input to mkfs.erofs.
2. Emphasis on performance when generating tens of thousands of inodes
and dentries.

Requirement 1 exists because you must set the trusted.overlay.redirect
and trusted.overlay.metacopy xattrs on each file. Requirement 2 exists
because those who land at EROFS for this rather than, say, just writing
tens of thousands of sparse files out to ext4 are probably here for the
performance. (Making this part of the startup process for a container
seems likely, for example, and container startups should be fast.)
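
To make requirement 1 concrete for a tar-like input, here is a minimal
sketch (not taken from the patch or from erofs-utils) of formatting one
pax extended header record per xattr. It assumes the "SCHILY.xattr."
keyword prefix that GNU tar uses for xattrs, and it assumes the value is
a plain NUL-terminated string; pax_record is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format one pax extended header record: "<len> <key>=<value>\n".
 * <len> is the decimal length of the whole record, including the
 * digits of <len> itself. Returns the record length, or -1 if the
 * record does not fit in buf. Assumes value is a NUL-terminated string.
 */
static int pax_record(char *buf, size_t bufsz, const char *key,
		      const char *value)
{
	/* everything except the length digits: ' ' + key + '=' + value + '\n' */
	size_t body = strlen(key) + strlen(value) + 3;
	size_t len = body + 1;

	/* the digit count depends on len, so iterate to a fixed point */
	for (;;) {
		size_t digits = (size_t)snprintf(NULL, 0, "%zu", len);

		if (body + digits == len)
			break;
		len = body + digits;
	}
	if (len + 1 > bufsz)	/* leave room for the terminating NUL */
		return -1;
	snprintf(buf, bufsz, "%zu %s=%s\n", len, key, value);
	return (int)len;
}
```

Calling pax_record(buf, sizeof(buf),
"SCHILY.xattr.trusted.overlay.redirect", "/data/blob") produces a 52-byte
record; the path "/data/blob" is only a made-up example value.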

So I actually think tar's header and extended header record format is
already better suited to the task. You can encode xattrs and, though
it might not be a major slowdown either way, you avoid converting things
like mode and file size to symbolic ASCII representations and back. It's
also reasonably easy to find software to assist in correctly
generating these structures.
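
As a rough illustration of how small the per-file work is, the sketch
below fills and verifies a single 512-byte POSIX ustar header block. It
is not code from erofs-utils: ustar_fill, ustar_checksum and ustar_verify
are hypothetical names, uid/gid/mtime are simply zeroed, and only regular
files with names shorter than 100 bytes are handled. Numeric fields are
fixed-width octal text, as the ustar format specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* POSIX ustar header layout; every field is char-based, so no padding */
struct ustar_header {
	char name[100], mode[8], uid[8], gid[8], size[12], mtime[12];
	char chksum[8], typeflag, linkname[100], magic[6], version[2];
	char uname[32], gname[32], devmajor[8], devminor[8], prefix[155];
	char pad[12];
};
static_assert(sizeof(struct ustar_header) == 512, "ustar block is 512 bytes");

/* Sum of all header bytes, with the chksum field counted as 8 spaces */
static unsigned int ustar_checksum(const struct ustar_header *h)
{
	const unsigned char *p = (const unsigned char *)h;
	const size_t start = offsetof(struct ustar_header, chksum);
	unsigned int sum = 0;
	size_t i;

	for (i = 0; i < sizeof(*h); i++)
		sum += (i >= start && i < start + 8) ? ' ' : p[i];
	return sum;
}

/* Fill a header for a regular file; returns -1 if a field would overflow */
static int ustar_fill(struct ustar_header *h, const char *name,
		      unsigned int mode, uint64_t size)
{
	if (strlen(name) >= sizeof(h->name) || size > 077777777777ULL)
		return -1;
	memset(h, 0, sizeof(*h));
	strcpy(h->name, name);
	snprintf(h->mode, sizeof(h->mode), "%07o", mode & 07777u);
	snprintf(h->uid, sizeof(h->uid), "%07o", 0u);	/* placeholder owner */
	snprintf(h->gid, sizeof(h->gid), "%07o", 0u);
	snprintf(h->size, sizeof(h->size), "%011llo", (unsigned long long)size);
	snprintf(h->mtime, sizeof(h->mtime), "%011o", 0u);
	h->typeflag = '0';			/* regular file */
	memcpy(h->magic, "ustar", 6);		/* includes the trailing NUL */
	memcpy(h->version, "00", 2);
	/* six octal digits, a NUL, then a space */
	snprintf(h->chksum, 7, "%06o", ustar_checksum(h));
	h->chksum[7] = ' ';
	return 0;
}

/* Reader-side sanity check: magic matches and stored checksum agrees */
static int ustar_verify(const struct ustar_header *h)
{
	return memcmp(h->magic, "ustar", 6) == 0 &&
	       strtoul(h->chksum, NULL, 8) == ustar_checksum(h);
}
```

In a header-only stream of the kind the quoted patch proposes, blocks like
this would follow one another directly, with the size field still holding
the real file size but no data blocks in between.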

Regards,
Mike

On Sun, Feb 18, 2024 at 10:44 PM Gao Xiang <hsiangkao at linux.alibaba.com> wrote:
>
> Hi Mike,
>
> On 2024/2/19 11:37, Mike Baynton wrote:
> > Hello erofs developers,
> > I am integrating erofs with overlayfs in a manner similar to what
> > composefs is doing. So, I am interested in making erofs images
> > containing only file metadata and extended attributes, but no file
> > data, as in $ mkfs.erofs --tar=i (thanks for that!)
>
> Thanks for your interest in EROFS too.
>
> >
> > However, I would like to construct the erofs image from a set of files
> > selected dynamically by another program. This leads me to prefer
> > sending an unseekable stream to mkfs.erofs so that file selection and
> > image generation can run concurrently, instead of first making a
> > complete tarball and then making the erofs image. In this case, it
> > becomes necessary to transfer each file's worth of data through the
> > stream after each header only so that the tarball reader in tar.c does
> > not become desynchronized with the expected offset of the next tar
> > header.
>
> I wonder if it's possible to use a modified prototype-like [1] format
> which mkfs.xfs [2] currently supports with "-p".  This prototype can
> be passed with a pipe instead.
>
> [1] http://uw714doc.sco.com/en/man/html.4/prototype.4.html
> [2] https://man7.org/linux/man-pages/man8/mkfs.xfs.8.html
>
> >
> > A very straightforward solution that seems to be working just fine for
> > me is to simply introduce a new optarg for --tar that indicates the
> > input data will be simply a series of tar headers / metadata without
> > actual file data. This implies index mode and additionally prevents
> > the skipping of inode.size worth of bytes after each header:
> >
> > diff --git a/include/erofs/tar.h b/include/erofs/tar.h
> > index a76f740..3d40a0f 100644
> > --- a/include/erofs/tar.h
> > +++ b/include/erofs/tar.h
> > @@ -46,7 +46,7 @@ struct erofs_tarfile {
> >
> >    int fd;
> >    u64 offset;
> > - bool index_mode, aufs;
> > + bool index_mode, headeronly_mode, aufs;
> >   };
> >
> >   void erofs_iostream_close(struct erofs_iostream *ios);
> > diff --git a/lib/tar.c b/lib/tar.c
> > index 8204939..e916395 100644
> > --- a/lib/tar.c
> > +++ b/lib/tar.c
> > @@ -584,7 +584,7 @@ static int tarerofs_write_file_index(struct erofs_inode *inode,
> >    ret = tarerofs_write_chunkes(inode, data_offset);
> >    if (ret)
> >    return ret;
> > - if (erofs_iostream_lskip(&tar->ios, inode->i_size))
> > + if (!tar->headeronly_mode && erofs_iostream_lskip(&tar->ios, inode->i_size))
> >    return -EIO;
> >    return 0;
> >   }
> > diff --git a/mkfs/main.c b/mkfs/main.c
> > index 6d2b700..a72d30e 100644
> > --- a/mkfs/main.c
> > +++ b/mkfs/main.c
> > @@ -122,7 +122,7 @@ static void usage(void)
> >          " --max-extent-bytes=#  set maximum decompressed extent size # in bytes\n"
> >          " --preserve-mtime      keep per-file modification time strictly\n"
> >          " --aufs                replace aufs special files with overlayfs metadata\n"
> > -       " --tar=[fi]            generate an image from tarball(s)\n"
> > +       " --tar=[fih]           generate an image from tarball(s) or tarball header data\n"
> >          " --ovlfs-strip=[01]    strip overlayfs metadata in the target image (e.g. whiteouts)\n"
> >          " --quiet               quiet execution (do not write anything to standard output.)\n"
> >   #ifndef NDEBUG
> > @@ -514,11 +514,13 @@ static int mkfs_parse_options_cfg(int argc, char *argv[])
> >    cfg.c_extra_ea_name_prefixes = true;
> >    break;
> >    case 20:
> > - if (optarg && (!strcmp(optarg, "i") ||
> > - !strcmp(optarg, "0") || !memcmp(optarg, "0,", 2))) {
> > + if (optarg && (!strcmp(optarg, "i") || (!strcmp(optarg, "h") ||
> > + !strcmp(optarg, "0") || !memcmp(optarg, "0,", 2)))) {
> >    erofstar.index_mode = true;
> >    if (!memcmp(optarg, "0,", 2))
> >    erofstar.mapfile = strdup(optarg + 2);
> > + if (!strcmp(optarg, "h"))
> > + erofstar.headeronly_mode = true;
> >    }
> >    tar_mode = true;
> >    break;
> >
> > Using this requires generation of tarball-ish streams that can be
> > slightly difficult to cajole tar libraries into creating, but it does
> > work if you do it. I can imagine much more complex alternative ways to
> > do this too, such as supporting sparse tar files or supporting some
> > whole new input format.
>
> I think you could just fill the file data with zeros to use the current
> index mode now. But yes, it could be inefficient if some files are huge.
>
> >
> > Would some version of this feature be interesting and useful? If so,
> > is the simple way good enough? It wouldn't preclude future addition of
> > things like a sparse tar reader.
>
> Yes, I think it's useful to support a simple prototype-like format, but
> it might take me some time since there is other ongoing work to land
> first (like multi-threaded mkfs support).
>
> Thanks,
> Gao Xiang
>
> >
> > Regards,
> > Mike

