[PATCH v4] erofs-utils: mkfs: support fragment deduplication
Yue Hu
zbestahu at gmail.com
Fri Dec 2 22:10:42 AEDT 2022
On Thu, 01 Dec 2022 20:49:46 +0800
"Yue Hu" <huyue2 at coolpad.com> wrote:
> Add a missing change:
> - change to generate a ctx for duplicate fragment in compression.
>
> On Thu, 1 Dec 2022 19:16:15 +0800
> Yue Hu <zbestahu at gmail.com> wrote:
>
> > From: Yue Hu <huyue2 at coolpad.com>
> >
> > Fragment deduplication was not supported when the fragment feature
> > was introduced. Let's support it now.
> >
> > We intend to dedupe the fragments before compression, so that duplicate
> > parts will not be written into the packed inode.
> >
> > With this patch, for Linux 5.10.1 + 5.10.87 source code:
> >
> > [before]
> >  32k pcluster + T0 + lz4hc,12 + fragment            450M
> >  64k pcluster + T0 + lz4hc,12 + fragment            434M
> > 128k pcluster + T0 + lz4hc,12 + fragment            426M
> >  32k pcluster + T0 + lz4hc,12 + fragment + dedupe   368M
> >  64k pcluster + T0 + lz4hc,12 + fragment + dedupe   380M
> > 128k pcluster + T0 + lz4hc,12 + fragment + dedupe   395M
> >
> > [after]
> >  32k pcluster + T0 + lz4hc,12 + fragment            311M
> >  64k pcluster + T0 + lz4hc,12 + fragment            295M
> > 128k pcluster + T0 + lz4hc,12 + fragment            287M
> >  32k pcluster + T0 + lz4hc,12 + fragment + dedupe   286M
> >  64k pcluster + T0 + lz4hc,12 + fragment + dedupe   281M
> > 128k pcluster + T0 + lz4hc,12 + fragment + dedupe   278M
> >
> > Tested on SquashFS (which uses level 12 by default for lz4hc):
> >
> >  32k block + lz4hc         332M
> >  64k block + lz4hc         304M
> > 128k block + lz4hc         283M
> > 256k block + lz4hc         273M
> > 256k block + lz4hc + noI   278M
> >
> > Suggested-by: Gao Xiang <hsiangkao at linux.alibaba.com>
> > Signed-off-by: Yue Hu <huyue2 at coolpad.com>
> > ---
> > v4:
> > - renaming, including tofcrc/new_fragmentsize
> > - move fixup into ctx
> > - use may_fixing to check packing fragment or not
> > - move sb/inode flag + 64bits case from erofs_pack_fragments() to new
> > helper erofs_fragments_commit()
> > - move recompress ahead of may_inline case when compressing succeeds
> > - update commit message/code comments
> > - note that decompression fails when ztailpacking is enabled at the
> > same time; this needs some time to debug
There is no need to care about the may_inline case if we find a duplicate fragment.
-	bool may_inline = (cfg.c_ztailpacking && final);
+	bool may_inline = (cfg.c_ztailpacking && final &&
+			   !inode->fragment_size);
This should be included in v5.
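
For illustration only, the condition above can be sketched as a standalone
helper. This is not the erofs-utils code: the sketch_* names and the reduced
structs are hypothetical, and it assumes that a non-zero fragment_size means
a duplicate fragment has already been matched for this inode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the mkfs configuration; only the field used here. */
struct sketch_config {
	bool c_ztailpacking;		/* ztailpacking was requested */
};

/* Hypothetical stand-in for the inode; only the field used here. */
struct sketch_inode {
	unsigned int fragment_size;	/* assumed non-zero once a duplicate fragment is matched */
};

/*
 * Tail inlining is only considered for the final extent, and only when
 * no duplicate fragment was found for this inode.
 */
static bool sketch_may_inline(const struct sketch_config *cfg,
			      const struct sketch_inode *inode, bool final)
{
	assert(cfg != NULL && inode != NULL);
	return cfg->c_ztailpacking && final && !inode->fragment_size;
}
```

With ztailpacking enabled and final set, the helper returns true for an inode
whose fragment_size is 0 and false once fragment_size is non-zero, so the
inline path is skipped for a tail that has already been deduplicated.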
> >
> > v3: