[PATCH 2/2] erofs-utils: lib: improve freeing hashmap in erofs_blob_exit()

Sandeep Dhavale dhavale at google.com
Fri May 24 07:01:31 AEST 2024


Depending on size of the filesystem being built there can be huge number
of elements in the hashmap. Currently we call hashmap_iter_first() in
while loop to iterate and free the elements. However technically
correct, this is inefficient in 2 aspects.

- As we are iterating the elements for removal, we do not need overhead of
rehashing.
- Second part which contributes hugely to the performance is using
hashmap_iter_first() as it starts scanning from index 0 throwing away
the previous successful scan. For sparsely populated hashmap this becomes
O(n^2) in worst case.

Lets fix this by disabling hashmap shrink which avoids rehashing
and use hashmap_iter_next() which is now guaranteed to iterate over
all the elements while removing while avoiding the performance pitfalls
of using hashmap_iter_first().

Test with random data shows performance improvement as:

fs_size  Before   After
1G 	 23s 	  7s
2G 	 81s      15s
4G	 272s     31s
8G 	 1252s	  61s

Signed-off-by: Sandeep Dhavale <dhavale at google.com>
---
 lib/blobchunk.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/lib/blobchunk.c b/lib/blobchunk.c
index 645bcc1..8082aa4 100644
--- a/lib/blobchunk.c
+++ b/lib/blobchunk.c
@@ -548,11 +548,17 @@ void erofs_blob_exit(void)
 	if (blobfile)
 		fclose(blobfile);
 
-	while ((e = hashmap_iter_first(&blob_hashmap, &iter))) {
+	/* Disable hashmap shrink, effectively disabling rehash.
+	 * This way we can iterate over entire hashmap efficiently
+	 * and safely by using hashmap_iter_next() */
+	hashmap_disable_shrink(&blob_hashmap);
+	e = hashmap_iter_first(&blob_hashmap, &iter);
+	while (e) {
 		bc = container_of((struct hashmap_entry *)e,
 				  struct erofs_blobchunk, ent);
 		DBG_BUGON(hashmap_remove(&blob_hashmap, e) != e);
 		free(bc);
+		e = hashmap_iter_next(&iter);
 	}
 	DBG_BUGON(hashmap_free(&blob_hashmap));
 
-- 
2.45.1.288.g0e0cd299f1-goog



More information about the Linux-erofs mailing list