Problem in EROFS: Not able to read the files after mount
Gao Xiang
hsiangkao at aol.com
Fri Mar 20 22:16:36 AEDT 2020
Hi Saumya,
On Fri, Mar 20, 2020 at 01:30:39PM +0530, Saumya Panda wrote:
> Hi Gao,
> I am trying to evaluate EROFS on my device. Right now SquashFS is used
> for the system files, so I am comparing EROFS with SquashFS. On my
> device, with the environment below, I see EROFS is 3 times faster than
> SquashFS 128k for sequential reads (I used enwik8 (100MB) as the test
> file). Your test results show it close to SquashFS 128k. How is EROFS so
> fast for sequential reads? I also tested it on a SUSE VM with low memory
> (425MB free) and EROFS is pretty fast there too.
>
> Also, can you tell me how to run FIO on a directory instead of individual files?
> fio -filename=$i -rw=read -bs=4k -name=seqbench
Thanks for your detailed message.
First, I can't think of a way to run FIO on a directory directly. Also,
some of the numbers below still look strange to me.
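That said, a shell loop over the regular files works as a simple workaround
(just a sketch; adjust the path and the job options to your setup):

    # run one sequential-read fio job per regular file in the directory
    for f in /mnt/erofs_test/*; do
        [ -f "$f" ] || continue      # skip subdirectories and special files
        fio --name=seqbench --rw=read --bs=4k --filename="$f"
    done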
Honestly, I don't want to leave a lot of (possibly aggressive) public
comments comparing one filesystem with another, such as EROFS vs. Squashfs
(or ext4 vs. f2fs). But some existing materials have already made such
comparisons; if you have some extra time, you could read through the
following references about EROFS (although some parts are outdated):
[1] https://static.sched.com/hosted_files/kccncosschn19chi/ce/EROFS%20file%20system_OSS2019_Final.pdf
[2] https://www.usenix.org/system/files/atc19-gao.pdf
The reason I feel this way is that (objectively, I think) people have their
own judgement about, and attachment to, each of these projects. Still, here
are some hints as to why EROFS behaves well (compared with Squashfs, though
I really want to avoid such contentious comparisons):
o EROFS has carefully designed critical paths, such as the async
decompression path; that partly answers your question about sequential
read behavior;
o EROFS has well-designed compression metadata (the EROFS compacted
index). Each logical compressed block takes only 2 bytes of metadata on
average (high information entropy, so there is no need to compress the
compacted indexes again), and it supports random reads with no dependence
on previous metadata; see the back-of-envelope sketch after this list. In
contrast, the on-disk metadata of Squashfs doesn't support random access
(and the metadata itself may even be compressed), which means you either
have to cache more metadata in memory for random reads or put up with its
poor metadata random-access performance. Some hints: see the on-disk
blocklist, the index cache and read_blocklist();
o EROFS is the first filesystem to use fixed-sized output compression.
With fixed-sized output compression, EROFS can easily implement in-place
decompression (or at least in-place I/O), which means it doesn't need to
allocate extra physical pages in most cases; that reduces the likelihood
of memory reclaim/compaction and keeps as much useful file-backed page
cache as possible;
o EROFS has a well-designed on-disk directory format that supports random
access within directories, unlike current Squashfs;
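To give a feel for what that 2-byte average means, here is a
back-of-envelope sketch (assuming 4KiB logical blocks; the 100MB size is
just your enwik8 test file):

    # compacted index overhead for a 100 MB file at ~2 bytes per 4 KiB block
    echo $(( (100 * 1024 * 1024 / 4096) * 2 ))    # => 51200 bytes (~50 KiB)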
In a word, I don't think the current on-disk Squashfs format is well
designed for the long term. Put differently, EROFS is a completely
different thing in its principles, its on-disk format and its runtime
implementation.
By the way, the previous link
https://blog.csdn.net/scnutiger/article/details/102507596
was _not_ written by me. I just noticed it by chance; I think it was
written by a Chinese kernel developer at some other Android vendor.
Also, FIO cannot benchmark all cases, and a heavy memory workload is not
completely equivalent to a genuinely low-memory environment either.
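If you want to emulate a low-memory box more directly, one option is to cap
the benchmark's memory with a memory cgroup before running fio (a rough
sketch only, assuming cgroup v2 is mounted at /sys/fs/cgroup; the 256MiB
cap and the paths are example values):

    # put the current shell (and its children, including fio) under a 256 MiB cap
    mkdir /sys/fs/cgroup/fio-lowmem
    echo $((256 * 1024 * 1024)) > /sys/fs/cgroup/fio-lowmem/memory.max
    echo $$ > /sys/fs/cgroup/fio-lowmem/cgroup.procs
    fio --name=randbench --rw=randread --bs=4k --filename=/mnt/erofs_test/enwik8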
Still, my FIO test script for benchmarking different filesystems is
available for reference:
https://github.com/erofs/erofs-openbenchmark/blob/master/fio-benchmark.sh
Personally, I think it's reasonable.
It makes more sense to use a purpose-designed dynamic workload model;
Huawei internally uses several well-designed light/heavy workloads to
benchmark the whole system.
In addition, I noticed many complaints about Squashfs, e.g.:
https://forum.snapcraft.io/t/squashfs-is-a-terrible-storage-format/9466
I don't want to comment on that whole article. But for such runtime
workloads, I'd suggest trying EROFS instead and seeing whether it performs
better (compared with any configuration of Squashfs+lz4).
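If you try that, both images should at least use the same compressor and a
comparable block/cluster size. A rough sketch of such a setup (the paths
are examples; please double-check the flags against the mkfs.erofs and
mksquashfs man pages):

    mkfs.erofs -zlz4hc erofs.img rootfs/                  # EROFS, LZ4HC
    mksquashfs rootfs/ sq.img -comp lz4 -Xhc -b 131072    # Squashfs, LZ4HC, 128 KiB blocks
    mount -t erofs    -o loop erofs.img /mnt/erofs
    mount -t squashfs -o loop sq.img    /mnt/squashfs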
There is a lot of ongoing work, but I've been really busy recently. Once
LZMA support and larger compression clusters are implemented, I think EROFS
will be even more useful, but they need to be designed carefully first to
avoid adding complexity to the whole solution.
Sorry about my English; I hope this is of some help.
Thanks,
Gao Xiang
>
> Test on Embedded Device:
>
> Total Memory: 5.5 GB
> Free Memory: 1515 MB
> No Swap
>
> $: /fio/erofs_test]$ free -m
>                total        used        free      shared  buff/cache   available
> Mem:            5384        2315        1515        1378        1553        1592
> Swap:              0           0           0
>
>                  Seq Read              Rand Read
> SquashFS 4k      51.8MB/s  1931msec    45.7MB/s  2187msec
> SquashFS 128k    116MB/s    861msec    14MB/s     877msec
> SquashFS 1M      124MB/s    805msec    119MB/s    837msec
> EROFS 4k         658MB/s    152msec    103MB/s    974msec
>
>
> Test on Suse VM:
>
> Total Memory: 1.5 GB
> Free Memory: 425 MB
> No Swap
>
> localhost:/home/saumya/Documents/erofs_test # free -m
>                total        used        free      shared  buff/cache   available
> Mem:            1436         817         425           5         192         444
> Swap:              0           0           0
>
>                  Seq Read              Rand Read
> SquashFS 4k      30.7MB/s  3216msec    9333kB/s  10715msec
> SquashFS 128k    318MB/s    314msec    5946kB/s  16819msec
> EROFS 4k         469MB/s    213msec    11.9MB/s   8414msec
>
>
> On Wed, Jan 29, 2020 at 10:30 AM Gao Xiang <hsiangkao at aol.com> wrote:
>
> > On Wed, Jan 29, 2020 at 09:43:37AM +0530, Saumya Panda wrote:
> > >
> > > localhost:~> fio --name=randread --ioengine=libaio --iodepth=16
> > > --rw=randread --bs=4k --direct=0 --size=512M --numjobs=4 --runtime=240
> > > --group_reporting --filename=/mnt/enwik9_erofs/enwik9
> > >
> > > randread: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> > > 4096B-4096B, ioengine=libaio, iodepth=16
> >
> > And I don't think such a configuration is useful for calculating read
> > amplification, since you eventually read 100% of the file and use
> > multiple threads with no memory limitation (all the compressed data will
> > be cached, so the total amount read is the compressed size).
> >
> > I have no idea what you want to get from such a comparison between EROFS
> > and Squashfs. A larger block size behaves much like bulk readahead. If
> > you benchmark uncompressed filesystems, you will notice that such
> > filesystems cannot reach such high 100% randread numbers.
> >
> > Thanks,
> > Gao Xiang
> >
> >
>
> --
> Thanks,
> Saumya Prakash Panda