[PATCH 0/2] erofs-utils: improve performance of mkfs with dedupe
Sandeep Dhavale
dhavale at google.com
Fri May 24 07:01:29 AEST 2024
I got a report from an AOSP user that mkfs.erofs takes dramatically longer
when the dedupe option is enabled. For example, creating an 8GB
uncompressed erofs image went from 36 seconds to 27 minutes with dedupe
enabled. After profiling mkfs.erofs on sample data, I observed that the
increase in time was coming from erofs_blob_exit(). Further debugging
showed that the real inefficiency was in hashmap_iter_first(), which
always starts scanning for the first element from tablepos = 0.
The following patches solve this by:
- creating a helper function to disable hashmap shrinking
- using hashmap_iter_next() to avoid scanning from 0; as rehashing is
  disabled, it is guaranteed to go through all the elements even while
  doing hashmap_remove().
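
To illustrate why restarting from tablepos = 0 is so costly, here is a
minimal sketch using a toy chained hash table. It is not the erofs-utils
hashmap: every name below (toy_map, toy_scan, count_probes, and so on) is
hypothetical, and the table is assumed never to resize, which stands in for
"shrinking disabled". It counts how many buckets are inspected while
draining the table, once with a scan that restarts at bucket 0 for each
entry and once with a cursor that is kept across removals.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy chained hash table for illustration only (hypothetical names). */
struct toy_entry {
	struct toy_entry *next;
	unsigned int key;
};

struct toy_map {
	struct toy_entry **table;
	unsigned int tablesize;
	unsigned long probes;	/* buckets inspected while iterating */
};

static void toy_init(struct toy_map *m, unsigned int size)
{
	m->table = calloc(size, sizeof(*m->table));
	assert(m->table != NULL);
	m->tablesize = size;
	m->probes = 0;
}

static void toy_add(struct toy_map *m, unsigned int key)
{
	struct toy_entry *e = malloc(sizeof(*e));
	unsigned int b = key % m->tablesize;

	assert(e != NULL);
	e->key = key;
	e->next = m->table[b];
	m->table[b] = e;
}

/* Find the first non-empty bucket at or after *pos, advancing *pos. */
static struct toy_entry *toy_scan(struct toy_map *m, unsigned int *pos)
{
	while (*pos < m->tablesize) {
		m->probes++;
		if (m->table[*pos])
			return m->table[*pos];
		(*pos)++;
	}
	return NULL;
}

/* Unlink and free the head entry of bucket pos; the table is never resized. */
static void toy_remove_head(struct toy_map *m, unsigned int pos)
{
	struct toy_entry *e = m->table[pos];

	assert(e != NULL);
	m->table[pos] = e->next;
	free(e);
}

/*
 * Fill n buckets with n entries, drain the table, and return the number
 * of bucket probes. use_cursor == 0 restarts every scan at bucket 0;
 * use_cursor != 0 keeps the scan position across removals.
 */
static unsigned long count_probes(unsigned int n, int use_cursor)
{
	struct toy_map m;
	unsigned int i, pos = 0;

	toy_init(&m, n);
	for (i = 0; i < n; i++)
		toy_add(&m, i);
	for (;;) {
		if (!use_cursor)
			pos = 0;	/* restart from the first bucket */
		if (!toy_scan(&m, &pos))
			break;
		toy_remove_head(&m, pos);
	}
	free(m.table);
	return m.probes;
}
```

With 1024 entries in 1024 buckets, count_probes(1024, 0) inspects 525824
buckets while count_probes(1024, 1) inspects 2048. The restart variant
grows quadratically with the table size and the cursor variant linearly.
The cursor is only trustworthy because the toy table never rehashes during
removal, which is the property the shrinking-disable helper is meant to
provide.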
Test results now show up to an order-of-magnitude improvement for larger
filesystem sizes.
You can verify the improvements with the steps below:
$ mkdir fs_data
$ dd if=/dev/urandom of=fs_data/random_file.bin bs=1M count=8192
$ time mkfs.erofs --chunksize=4096 erofs_dedupe.img fs_data
fs_size  Before  After  Improvement
1G       23s     7s     3.2x
2G       81s     15s    5.4x
4G       272s    31s    8.77x
8G       1252s   61s    20.52x
Thanks,
Sandeep
Sandeep Dhavale (2):
  erofs-utils: lib: provide helper to disable hashmap shrinking
  erofs-utils: lib: improve freeing hashmap in erofs_blob_exit()

 include/erofs/hashmap.h | 4 ++++
 lib/blobchunk.c         | 8 +++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)
--
2.45.1.288.g0e0cd299f1-goog