[PATCH 0/8] erofs-utils: collection for fragments and dedupe

Gao Xiang hsiangkao at linux.alibaba.com
Tue Sep 27 01:25:03 AEST 2022


This patchset collects the following two features and resolves the
conflicts between them:
https://lore.kernel.org/r/cover.1663065968.git.huyue2@coolpad.com (fragments)
and
https://lore.kernel.org/r/20220909045821.104499-1-ZiyangZhang@linux.alibaba.com (dedupe)

Note that the fragment feature still has some TODOs. For example,
duplicated fragment data can still end up in the packed file, and
this should be optimized away. However, such an improvement is purely
a userspace strategy change, and I think Yue Hu will continue working
on it.
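
For illustration only, the TODO above (avoiding duplicated fragment data
in the packed file) could be approached roughly as sketched below. This
is not the erofs-utils implementation and not necessarily what Yue Hu
plans: the PackedFile class, its add_fragment() method and the use of
SHA-256 are all hypothetical, and only byte-identical fragments are
merged.

```python
import hashlib


class PackedFile:
    """Hypothetical in-memory stand-in for the packed file of fragments."""

    def __init__(self):
        self.data = bytearray()   # concatenated fragment data
        self.index = {}           # content digest -> offset of first copy

    def add_fragment(self, fragment: bytes) -> int:
        """Return the offset of `fragment`, appending it only if unseen."""
        digest = hashlib.sha256(fragment).digest()
        offset = self.index.get(digest)
        # Compare the stored bytes as well, so a digest collision can
        # never hand back the wrong data.
        if offset is not None and \
                self.data[offset:offset + len(fragment)] == fragment:
            return offset
        offset = len(self.data)
        self.data += fragment
        self.index[digest] = offset
        return offset


packed = PackedFile()
first = packed.add_fragment(b"tail-A")
second = packed.add_fragment(b"tail-B")
again = packed.add_fragment(b"tail-A")   # reuses the first copy
print(first, second, again, len(packed.data))
```

A whole-fragment lookup like this only catches exact duplicates;
fragments that merely share a prefix or suffix would need a different
strategy.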

I've tested fragments + dedupe with the following test case:

dataset: Linux 5.10 + Linux 5.10.87 source code
pcluster size: 4k

compressed vanilla      658845696
fragment                634245120
dedupe                  488689664
dedupe + fragment       425848832
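
To put the numbers above in relative terms, here is a small sketch that
computes the reduction of each configuration against compressed
vanilla. It assumes the figures are image sizes in bytes; the
reduction_percent() helper is only for illustration. Under that
assumption the reductions come out at roughly 3.7% (fragment), 25.8%
(dedupe) and 35.4% (dedupe + fragment).

```python
# Figures copied from the results above (assumed to be bytes).
sizes = {
    "compressed vanilla": 658845696,
    "fragment": 634245120,
    "dedupe": 488689664,
    "dedupe + fragment": 425848832,
}

baseline = sizes["compressed vanilla"]


def reduction_percent(size, base=baseline):
    """Return how much smaller `size` is than `base`, in percent."""
    return 100.0 * (base - size) / base


for name, size in sizes.items():
    print(f"{name:<20} {size:>10} {reduction_percent(size):6.2f}%")
```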

Gao Xiang (2):
  erofs-utils: introduce z_erofs_inmem_extent
  erofs-utils: fuse: introduce partial-referenced pclusters

Yue Hu (4):
  erofs-utils: fuse: support interlaced uncompressed pcluster
  erofs-utils: lib: support fragments data decompression
  erofs-utils: mkfs: support interlaced uncompressed data layout
  erofs-utils: mkfs: support fragments

Ziyang Zhang (2):
  erofs-utils: lib: add rb-tree implementation
  erofs-utils: mkfs: introduce global compressed data deduplication

 include/erofs/compress.h   |  11 +-
 include/erofs/config.h     |   4 +-
 include/erofs/decompress.h |   3 +
 include/erofs/dedupe.h     |  39 +++
 include/erofs/fragments.h  |  28 ++
 include/erofs/inode.h      |   1 +
 include/erofs/internal.h   |  14 +
 include/erofs_fs.h         |  33 ++-
 lib/Makefile.am            |   5 +-
 lib/compress.c             | 296 +++++++++++++++------
 lib/data.c                 |  29 ++-
 lib/decompress.c           |  19 +-
 lib/dedupe.c               | 176 +++++++++++++
 lib/fragments.c            |  65 +++++
 lib/inode.c                |  58 ++++-
 lib/rb_tree.c              | 512 +++++++++++++++++++++++++++++++++++++
 lib/rb_tree.h              | 104 ++++++++
 lib/rolling_hash.h         |  60 +++++
 lib/super.c                |   1 +
 lib/zmap.c                 |  59 ++++-
 mkfs/main.c                |  77 +++++-
 21 files changed, 1492 insertions(+), 102 deletions(-)
 create mode 100644 include/erofs/dedupe.h
 create mode 100644 include/erofs/fragments.h
 create mode 100644 lib/dedupe.c
 create mode 100644 lib/fragments.c
 create mode 100644 lib/rb_tree.c
 create mode 100644 lib/rb_tree.h
 create mode 100644 lib/rolling_hash.h

-- 
2.24.4


