[PATCH v3] erofs-utils: mkfs: support fragment deduplication

Yue Hu zbestahu at gmail.com
Wed Nov 30 20:16:36 AEDT 2022


On Wed, 30 Nov 2022 17:02:27 +0800
Gao Xiang <hsiangkao at linux.alibaba.com> wrote:

> On Wed, Nov 30, 2022 at 04:55:34PM +0800, Yue Hu wrote:
> 
> ...
> 
> > > > @@ -782,22 +846,25 @@ int erofs_write_compressed_file(struct erofs_inode *inode, int fd)
> > > >  	ctx.head = ctx.tail = 0;
> > > >  	ctx.clusterofs = 0;
> > > >  	ctx.e.length = 0;
> > > > -	remaining = inode->i_size;
> > > > +	ctx.e.compressedblks = 0;
> > > > +	ctx.pclustersize = z_erofs_get_max_pclusterblks(inode) * EROFS_BLKSIZ;
> > > > +	ctx.remaining = inode->i_size - inode->fragment_size;
> > > >  
> > > > -	while (remaining) {
> > > > -		const u64 readcount = min_t(u64, remaining,
> > > > +	while (ctx.remaining) {
> > > > +		const u64 readcount = min_t(u64, ctx.remaining,
> > > >  					    sizeof(ctx.queue) - ctx.tail);
> > > > +		bool fixup = ret < 0;    
> > > 
> > > 	drop this one;
> > >   
> > > >  
> > > >  		ret = read(fd, ctx.queue + ctx.tail, readcount);
> > > >  		if (ret != readcount) {
> > > >  			ret = -errno;
> > > >  			goto err_bdrop;
> > > >  		}
> > > > -		remaining -= readcount;
> > > > +		ctx.remaining -= readcount;
> > > >  		ctx.tail += readcount;
> > > >  
> > > > -		ret = vle_compress_one(&ctx, !remaining);    
> > > 
> > > 					     ret == -EAGAIN  
> > 
> > Just move 'fixup' in 'ctx'?  
> 
> Yes.
> 
> >   
> > >   
> > > > -		if (ret)
> > > > +		ret = vle_compress_one(&ctx, fixup);
> > > > +		if (ret && ret != -EAGAIN)
> > > >  			goto err_free_idata;
> > > >  	}
> > > >  	DBG_BUGON(ctx.head != ctx.tail);
> > > > @@ -807,6 +874,16 @@ int erofs_write_compressed_file(struct erofs_inode *inode, int fd)
> > > >  	DBG_BUGON(compressed_blocks < !!inode->idata_size);
> > > >  	compressed_blocks -= !!inode->idata_size;
> > > >  
> > > > +	if (inode->fragment_size && ctx.e.compressedblks) {    
> > > 
> > > why not moving into z_erofs_fragments_dedupe_update()?  
> > 
> > I consider this before, it's a bit awkward for me due to non updating case.
> > 
> > Let me reconsider.  
> 
> What does that mean? "non updating case", could you please write down
> some formal words to describe this case?

"update" means we will update the fragment (size and offset), so "non updating" means
duplicate fragment found in dedupe will not be updated.

Anyway, i will think more about it.

Thanks.

> 
> Thanks,
> Gao Xiang



More information about the Linux-erofs mailing list