[RFC PATCH 0/1] erofs: change to use asynchronous io for fscache readahead

Gao Xiang hsiangkao at linux.alibaba.com
Fri Apr 29 22:36:16 AEST 2022

Hi Xin,

On Fri, Apr 29, 2022 at 07:38:48AM +0800, Xin Yin wrote:
> Hi Jeffle & Xiang
> I have tested your fscache,erofs: fscache-based on-demand read semantics 
> v9 patches sets https://www.spinics.net/lists/linux-fsdevel/msg216178.html.
> For now , it works fine with the nydus image-service. After the image data 
> is fully loaded to local storage, it does have great IO performance gain 
> compared with nydus V5 which is based on fuse.

Yeah, thanks for your interest and efforts. Actually I'm pretty sure you
could observe CPU, bandwidth and latency improvement on the dense deployed
scenarios since our goal is to provide native performance when the data is
ready, as well as image on-demand read, flexible cache data management to
end users.

> For 4K random read , fscache-based erofs can get the same performance with 
> the original local filesystem. But I still saw a performance drop in the 4K 
> sequential read case. And I found the root cause is in erofs_fscache_readahead() 
> we use synchronous IO , which may stall the readahead pipelining.

Yeah, that is a known TODO, in principle, when such part of data is locally
available, it will have the similar performance (bandwidth, latency, CPU
loading) as loop device. But we don't implement asynchronous I/O for now,
since we need to make the functionality work first, so thanks for your
patch addressing this.

> I have tried to change to use asynchronous io during erofs fscache readahead 
> procedure, as what netfs did. Then I saw a great performance gain.
> Here are my test steps and results:
> - generate nydus v6 format image , in which stored a large file for IO test.
> - launch nydus image-service , and  make image data fully loaded to local storage (ext4).
> - run fio with below cmd.
> fio -ioengine=psync -bs=4k -size=5G -direct=0 -thread -rw=read -filename=./test_image  -name="test" -numjobs=1 -iodepth=16 -runtime=60

Yeah, although I can see what you mean (to test buffered I/O), the
argument is still somewhat messy (maybe because we don't support
fscache-based direct I/O for now. That is another TODO but with
low priority.)

> v9 patches: 202654 KB/s
> v9 patches + async readahead patch: 407213 KB/s
> ext4: 439912 KB/s

May I ask if such ext4 image is through a loop device? If not, that is
reasonable. Anyway, it's not a big problem for now, we could optimize
it later since it should be exactly the same finally.

And I will drop a message to Jeffle for further review since we're
closing to another 5-day national holiday.

Thanks again!
Gao Xiang

More information about the Linux-erofs mailing list