[PATCH v2] mkfs: Fix input offset counting in headerball mode

Gao Xiang hsiangkao at linux.alibaba.com
Tue Nov 12 12:05:39 AEDT 2024


Hi Mike,

On 2024/11/12 00:48, Mike Baynton wrote:
> When using --tar=headerball, most files included in the headerball are
> not included in the EROFS image. mkfs.erofs typically exits prematurely,
> having processed non-USTAR blocks as USTAR and believing they are
> end-of-archive markers. (Other failure modes are probably also possible
> if the input stream doesn't look like end-of-archive markers at the
> locations that are being read.)
> 
> This is because we lose track of the number of bytes read from the
> input stream in headerball (or ddtaridx) mode. We were assuming that
> in these modes no data would be read following the ustar block, but in
> cases such as PAX headers, much more data may be read without
> incrementing tar->offset.
> 
> Fix this by always incrementing the offset counter, and then
> decrementing it again in the one case where headerballs differ:
> regular file data blocks are not present.
> 
> Signed-off-by: Mike Baynton <mike at mbaynton.com>
> ---
> 
> Thanks Gao for the suggestion, looks good to me and tests ok on my
> sample headerball inputs. Let me know if you want me to resubmit this
> with Co-developed-by / Signed-off-by you.

I will add the "erofs-utils:" prefix to the patch subject, but there is
no need to add a "Co-developed-by" tag.

Btw, is a converter from tarballs to headerball files publicly
available? It'd be better to have some tests for this feature.
`ddtaridx` was designed by another team at Alibaba, so I don't have
a simple generator/converter either...

Thanks,
Gao Xiang

> 
>   lib/tar.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/tar.c b/lib/tar.c
> index b32abd4..990c6cb 100644
> --- a/lib/tar.c
> +++ b/lib/tar.c
> @@ -808,13 +808,14 @@ out_eot:
>   	}
>   
>   	dataoff = tar->offset;
> -	if (!(tar->headeronly_mode || tar->ddtaridx_mode))
> -		tar->offset += st.st_size;
> +	tar->offset += st.st_size;
>   	switch(th->typeflag) {
>   	case '0':
>   	case '7':
>   	case '1':
>   		st.st_mode |= S_IFREG;
> +		if (tar->headeronly_mode || tar->ddtaridx_mode)
> +			tar->offset -= st.st_size;
>   		break;
>   	case '2':
>   		st.st_mode |= S_IFLNK;
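
For illustration only, the accounting in the hunk above can be modelled
as a small standalone function. This is a minimal sketch, not code from
lib/tar.c: the name payload_end() and its parameters are hypothetical,
data_omitted stands in for (tar->headeronly_mode || tar->ddtaridx_mode),
and padding of payloads to 512-byte blocks is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of erofs-utils): return the input offset
 * after one entry whose header announces `size` bytes of payload.
 * `offset` is the position just after the header block.
 */
static uint64_t payload_end(uint64_t offset, uint64_t size, char typeflag,
			    bool data_omitted)
{
	/* Always count the payload first, whatever the entry type is. */
	offset += size;

	switch (typeflag) {
	case '0':
	case '7':
	case '1':
		/*
		 * Same typeflags as the patched switch: in headerball or
		 * ddtaridx input the file data is not in the stream, so
		 * undo the increment.
		 */
		if (data_omitted)
			offset -= size;
		break;
	default:
		/* e.g. a PAX 'x' record: its payload is really present. */
		break;
	}
	return offset;
}
```

With the offset at 512 just after a header block, a 4096-byte regular
file with data_omitted set leaves the offset at 512, while a 300-byte
PAX record (typeflag 'x') advances it to 812. The old code left the
offset at 512 in both cases, which is how the count drifted.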


