[PATCH v2 1/2] erofs-utils: lib: validate ZSTD frame content size in decompression
Utkal Singh
singhutkal015 at gmail.com
Tue Mar 17 21:00:59 AEDT 2026
Thanks for the direction, Gao Xiang.
Understood. Switching to the ZSTD streaming API (ZSTD_decompressStream)
would remove the dependency on ZSTD_getFrameContentSize() /
ZSTD_getDecompressedSize() entirely and align erofs-utils with the
kernel implementation.
I'll work on a v3 using the streaming approach.
- Utkal
On Tue, 17 Mar 2026 at 15:23, Gao Xiang <hsiangkao at linux.alibaba.com> wrote:
>
>
> On 2026/3/17 12:55, Utkal Singh wrote:
> > ZSTD_getFrameContentSize() reads the content size from the ZSTD
> > frame header in the compressed data. This is untrusted on-disk
> > metadata, independent from the extent map that provides
> > rq->decodedlength via z_erofs_map_blocks_iter().
> >
> > A crafted EROFS image can set the extent map to claim a decoded
> > length larger than the actual ZSTD frame content size. When this
> > happens, a buffer of the (smaller) frame content size is allocated
> > and decompressed into, but the subsequent memcpy copies
> > rq->decodedlength bytes from it -- a potential out-of-bounds read.
> >
> > Additionally, the ZSTD_getDecompressedSize() legacy fallback
> > returns 0 for frames without a content size field. This leads to
> > malloc(0) followed by out-of-bounds access on the returned pointer.
> >
> > Reject frames where the reported content size is zero or smaller
> > than the expected decoded length.
> >
> > Reproducer:
> > mkdir testdir
> > python3 -c "open('testdir/f','wb').write(b'A'*131072)"
> > mkfs.erofs -zzstd test.erofs testdir/
> > python3 -c "d=bytearray(open('test.erofs','rb').read());\
> > p=d.find(b'\x28\xb5\x2f\xfd');d[p+4]=0x20;d[p+5]=0x01;\
> > open('test.erofs','wb').write(d)"
> > fsck.erofs --extract=out test.erofs
> > # Expected: ZSTD frame content size 1 < decoded length 131072
> >
> > Signed-off-by: Utkal Singh <singhutkal015 at gmail.com>
> > ---
> > lib/decompress.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/lib/decompress.c b/lib/decompress.c
> > index 3e7a173..fb81039 100644
> > --- a/lib/decompress.c
> > +++ b/lib/decompress.c
> > @@ -48,7 +48,14 @@ static int z_erofs_decompress_zstd(struct z_erofs_decompress_req *rq)
> > #else
> > total = ZSTD_getDecompressedSize(src + inputmargin,
> > rq->inputsize - inputmargin);
> > + if (!total)
> > + return -EFSCORRUPTED;
>
> Hmm, that is the difference between the kernel and erofs-utils
> implementations.
>
> The kernel uses the zstd streaming APIs, so it doesn't malloc()
> a new buffer in advance. Actually, I think erofs-utils should
> switch to the streaming APIs too, in order to avoid the
>
> ZSTD_getFrameContentSize() and ZSTD_getDecompressedSize()
>
> dependencies you mentioned in the commit message.
>
> Thanks,
> Gao Xiang
>
>
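
For reference, the reproducer in the quoted commit message can be read
against the Zstandard frame format: writing 0x20 to the byte after the
magic sets Single_Segment_flag with a 1-byte Frame_Content_Size field, and
the following 0x01 is that field, so the frame claims a content size of 1.
The sketch below is only an illustration of that layout and of the check
described in the commit message. The names zstd_peek_content_size() and
frame_size_ok() are hypothetical; they are not part of erofs-utils or
libzstd, and the parser skips some validation (for example the reserved
descriptor bit).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: read Frame_Content_Size from a Zstandard frame
 * header. Returns 0 on success, 1 if the frame has no content size
 * field, and -1 if the header is truncated or the magic does not match.
 */
static int zstd_peek_content_size(const uint8_t *buf, size_t len,
				  uint64_t *fcs)
{
	static const size_t did_bytes[4] = { 0, 1, 2, 4 };
	static const size_t fcs_bytes[4] = { 0, 2, 4, 8 };
	size_t pos = 5, n, i;
	uint64_t v = 0;
	uint8_t fhd;
	int single;

	assert(buf && fcs);
	if (len < 5)
		return -1;
	/* Magic number 0xFD2FB528, stored little-endian */
	if (buf[0] != 0x28 || buf[1] != 0xb5 ||
	    buf[2] != 0x2f || buf[3] != 0xfd)
		return -1;

	fhd = buf[4];			/* Frame_Header_Descriptor */
	single = (fhd >> 5) & 1;	/* Single_Segment_flag */
	n = fcs_bytes[fhd >> 6];	/* Frame_Content_Size_flag */
	if (!n && single)
		n = 1;			/* flag 0 + single segment: 1 byte */
	if (!single)
		pos++;			/* skip Window_Descriptor */
	pos += did_bytes[fhd & 3];	/* skip Dictionary_ID */

	if (len < pos)
		return -1;
	if (!n)
		return 1;		/* no content size recorded */
	if (len - pos < n)
		return -1;

	for (i = 0; i < n; i++)		/* little-endian field */
		v |= (uint64_t)buf[pos + i] << (8 * i);
	if (n == 2)
		v += 256;		/* 2-byte field is stored minus 256 */
	*fcs = v;
	return 0;
}

/*
 * The check described in the commit message: a reported size of zero,
 * or one smaller than the decoded length from the extent map, is
 * treated as corruption.
 */
static int frame_size_ok(uint64_t total, uint64_t decodedlength)
{
	return total != 0 && total >= decodedlength;
}
```

Fed the six bytes 28 b5 2f fd 20 01, zstd_peek_content_size() reports a
content size of 1, which frame_size_ok() rejects against a decoded length
of 131072. A streaming decoder would not need either helper, since it
writes into a caller-sized output buffer rather than trusting the header.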