EROFS sometimes not filling tail page with zeroes
Gao Xiang
hsiangkao at linux.alibaba.com
Thu Aug 31 16:59:31 AEST 2023
(+cc erofs mailing list)
On 2023/8/31 14:32, keltargw wrote:
> My test program is this:
>
> #include <stdio.h>
> #include <fcntl.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <unistd.h>
> #include <sys/io.h>
> #include <sys/mman.h>
>
> int main(int argc, char const *argv[])
> {
> unsigned char *f;
> int size;
> struct stat s;
> const char * file_name = argv[1];
> int fd = open (argv[1], O_RDONLY);
> if(!fd) {
> return 0;
> }
>
> int status = fstat (fd, & s);
> size = s.st_size;
> if(!(s.st_mode & S_IFREG)) {
> return 0;
> }
>
> f = (char *) mmap (0, size, PROT_READ, MAP_PRIVATE, fd, 0);
> if(!f) {
> return 0;
> }
> #if 0
> for(int i = 0; i < 128; ++i) {
> fprintf(stderr, "%x(%c) ", f[size+i-64], f[size+i-64]);
> }
> fprintf(stderr, "\n");
> #endif
> if(size % 4096 && f[size] != 0) {
> fprintf(stderr, "%s failure\n", argv[1]);
> return 1;
> }
>
> return 0;
> }
>
>
> I run it on known "wrong" files, or via `finderofs_mount_point -type f -exec ./mmap_test {} \;` for the entire file system.
>
> The patch you suggested seems to fix the issue, it is no longer reproducible or any file.
Actually I think this issue was fixed in commit e4c1cf523d82
("erofs: tidy up z_erofs_do_read_page()") by accident:
https://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git/commit/?id=e4c1cf523d820730a86cae2c6d55924833b6f7ac
So it seems that it will be fixed in v6.6-rc1.
Let me find a way to backport this partial part to old LTS
kernels.
Thanks,
Gao Xiang
>
> Thanks!
>
>
> On Thu, 31 Aug 2023 at 05:06, Gao Xiang <hsiangkao at linux.alibaba.com <mailto:hsiangkao at linux.alibaba.com>> wrote:
>
> Hi,
>
> On 2023/8/30 18:24, keltargw wrote:
> > Hello. I'm not sure if you're the right person to ask this about, I noticed most of erofs commits are done by you. I noticed problematic behaviour of EROFS driver with LZ4 compression enabled: on some files, if I do mmap(), the tail of the last page (after file ends) is not filled with zeroes but contains something gabage-like. This breaks e.g. clang-cpp as it uses mmap extensively, if headers are stored in lz4-compressed erofs. In my observations, it happens if file is lz4 compressed but last segment (not sure what terminology is used there) is not. It happens all the times on e.g. `/usr/include/stdlib.h` and `/usr/include/wchar.h`, as well as `/usr/include/linux/fs.h` (but on this file it is triggered a bit differently), and many other files. In my test I packed my entire system to lz4hc erofs (different lz4 options trigger problem on different files) and mounted it to subdirectory, but I could reproduce it on different distros and on latest 6.5 kernel. It is not
> > reproducible on uncompressed erofs.
> > I've deduced the problem to this small patch, but while it fixes the problem it doesn't look like it is the best place to do zeroing. Could you take a look at it please?
>
> I think it may have the issue but do you have some simple
> mmap reproducer for me to try if any? Otherwise, I have
> to write a reproducer myself.
>
> >
> > If this is not appropriate way to communicate or I should've asked a different person, could you please direct me the right way?
>
> Nope, I will check the issue. Could you check if the
> following diff fixes your problem:
>
> diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
> index de4f12152b62..9c9350eb1704 100644
> --- a/fs/erofs/zdata.c
> +++ b/fs/erofs/zdata.c
> @@ -1038,6 +1038,8 @@ static int z_erofs_do_read_page(struct z_erofs_decompress_frontend *fe,
> cur = end - min_t(erofs_off_t, offset + end - map->m_la, end);
> if (!(map->m_flags & EROFS_MAP_MAPPED)) {
> zero_user_segment(page, cur, end);
> + ++spiltted;
> + tight = false;
> goto next_part;
> }
> if (map->m_flags & EROFS_MAP_FRAGMENT) {
>
> Thanks,
> Gao Xiang
>
More information about the Linux-erofs
mailing list