[PATCH 2/2] erofs-utils: lib: improve freeing hashmap in erofs_blob_exit()

Gao Xiang hsiangkao at linux.alibaba.com
Fri May 24 15:28:21 AEST 2024


Hi Sandeep,

On 2024/5/24 05:01, Sandeep Dhavale wrote:
> Depending on size of the filesystem being built there can be huge number
> of elements in the hashmap. Currently we call hashmap_iter_first() in
> while loop to iterate and free the elements. However technically
> correct, this is inefficient in 2 aspects.
> 
> - As we are iterating the elements for removal, we do not need overhead of
> rehashing.
> - Second part which contributes hugely to the performance is using
> hashmap_iter_first() as it starts scanning from index 0 throwing away
> the previous successful scan. For sparsely populated hashmap this becomes
> O(n^2) in worst case.
> 
> Lets fix this by disabling hashmap shrink which avoids rehashing
> and use hashmap_iter_next() which is now guaranteed to iterate over
> all the elements while removing while avoiding the performance pitfalls
> of using hashmap_iter_first().
> 
> Test with random data shows performance improvement as:
> 
> fs_size  Before   After
> 1G 	 23s 	  7s
> 2G 	 81s      15s
> 4G	 272s     31s
> 8G 	 1252s	  61s

Sigh.. BTW, in the long term, I guess we might need to
find a better hashmap implementation (with MIT or BSD
license) instead of this one.

> 
> Signed-off-by: Sandeep Dhavale <dhavale at google.com>> ---
>   lib/blobchunk.c | 8 +++++++-
>   1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/blobchunk.c b/lib/blobchunk.c
> index 645bcc1..8082aa4 100644
> --- a/lib/blobchunk.c
> +++ b/lib/blobchunk.c
> @@ -548,11 +548,17 @@ void erofs_blob_exit(void)
>   	if (blobfile)
>   		fclose(blobfile);
>   
> -	while ((e = hashmap_iter_first(&blob_hashmap, &iter))) {
> +	/* Disable hashmap shrink, effectively disabling rehash.
> +	 * This way we can iterate over entire hashmap efficiently
> +	 * and safely by using hashmap_iter_next() */

I will fix up the comment style manually, otherwise it
looks good to me...

Reviewed-by: Gao Xiang <hsiangkao at linux.alibaba.com>

Thanks,
Gao Xiang


More information about the Linux-erofs mailing list