[PATCH 2/2] erofs-utils: lib: improve freeing hashmap in erofs_blob_exit()
Gao Xiang
hsiangkao at linux.alibaba.com
Fri May 24 15:28:21 AEST 2024
Hi Sandeep,
On 2024/5/24 05:01, Sandeep Dhavale wrote:
> Depending on size of the filesystem being built there can be huge number
> of elements in the hashmap. Currently we call hashmap_iter_first() in
> while loop to iterate and free the elements. However technically
> correct, this is inefficient in 2 aspects.
>
> - As we are iterating the elements for removal, we do not need overhead of
> rehashing.
> - Second part which contributes hugely to the performance is using
> hashmap_iter_first() as it starts scanning from index 0 throwing away
> the previous successful scan. For sparsely populated hashmap this becomes
> O(n^2) in worst case.
>
> Lets fix this by disabling hashmap shrink which avoids rehashing
> and use hashmap_iter_next() which is now guaranteed to iterate over
> all the elements while removing while avoiding the performance pitfalls
> of using hashmap_iter_first().
>
> Test with random data shows performance improvement as:
>
> fs_size Before After
> 1G 23s 7s
> 2G 81s 15s
> 4G 272s 31s
> 8G 1252s 61s
Sigh.. BTW, in the long term, I guess we might need to
find a better hashmap implementation (with MIT or BSD
license) instead of this one.
>
> Signed-off-by: Sandeep Dhavale <dhavale at google.com>> ---
> lib/blobchunk.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/lib/blobchunk.c b/lib/blobchunk.c
> index 645bcc1..8082aa4 100644
> --- a/lib/blobchunk.c
> +++ b/lib/blobchunk.c
> @@ -548,11 +548,17 @@ void erofs_blob_exit(void)
> if (blobfile)
> fclose(blobfile);
>
> - while ((e = hashmap_iter_first(&blob_hashmap, &iter))) {
> + /* Disable hashmap shrink, effectively disabling rehash.
> + * This way we can iterate over entire hashmap efficiently
> + * and safely by using hashmap_iter_next() */
I will fix up the comment style manually, otherwise it
looks good to me...
Reviewed-by: Gao Xiang <hsiangkao at linux.alibaba.com>
Thanks,
Gao Xiang
More information about the Linux-erofs
mailing list