[RFCv2] erofs-utils: code for detecting and tracking holes in uncompressed sparse files.

Pratik Shinde pratikshinde320 at gmail.com
Thu Dec 26 16:42:09 AEDT 2019


Thanks Gao.

Now I understand the purpose.
So with i_format we will be able to recognize which path to take, i.e. the fast
path (flat mode) or the slow path (searching through the extent list).
I am working on it.
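
A rough sketch of that dispatch, to check my understanding. All names here
(sketch_layout, sketch_extent, sketch_map_block, ...) are hypothetical and are
not the real EROFS on-disk format or erofs-utils API; the layout field only
stands in for whatever i_format would encode. It assumes a flat file is one
contiguous run of blocks and a sparse file carries a table of data extents
sorted by logical block.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout values; NOT the real i_format encoding. */
enum sketch_layout {
	SKETCH_LAYOUT_FLAT,	/* plain/inline: no holes, contiguous blocks */
	SKETCH_LAYOUT_SPARSE,	/* assumed sparse mode with an extent table */
};

/* Returned when a logical block falls into a hole (reads as zeros). */
#define SKETCH_HOLE	UINT64_MAX

struct sketch_extent {
	uint64_t lblk;	/* first logical block of the data extent */
	uint64_t pblk;	/* first physical block of the data extent */
	uint64_t len;	/* extent length in blocks */
};

struct sketch_inode {
	enum sketch_layout layout;
	uint64_t start_pblk;			/* flat: first physical block */
	const struct sketch_extent *extents;	/* sparse: sorted by lblk */
	size_t nr_extents;
};

/* Fast path: O(1), only the inode base metadata is needed. */
static uint64_t sketch_map_flat(const struct sketch_inode *vi, uint64_t lblk)
{
	return vi->start_pblk + lblk;
}

/* Slow path: O(log n) binary search over the sorted data extents. */
static uint64_t sketch_map_sparse(const struct sketch_inode *vi, uint64_t lblk)
{
	size_t lo = 0, hi = vi->nr_extents;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct sketch_extent *e = &vi->extents[mid];

		if (lblk < e->lblk)
			hi = mid;
		else if (lblk >= e->lblk + e->len)
			lo = mid + 1;
		else
			return e->pblk + (lblk - e->lblk);
	}
	return SKETCH_HOLE;	/* not covered by any data extent */
}

/* Pick the path from the layout recorded in the inode. */
static uint64_t sketch_map_block(const struct sketch_inode *vi, uint64_t lblk)
{
	if (vi->layout == SKETCH_LAYOUT_FLAT)
		return sketch_map_flat(vi, lblk);
	return sketch_map_sparse(vi, lblk);
}
```

In this sketch, files without holes never touch the extent table, so they keep
the constant-time lookup; only files marked sparse pay for the search.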

--Pratik.

On Tue, Dec 24, 2019 at 4:46 PM Gao Xiang <gaoxiang25 at huawei.com> wrote:

> On Tue, Dec 24, 2019 at 04:15:47PM +0530, Pratik Shinde wrote:
> > Hi Gao,
> >
> > No no. What I am saying is: in the current code (excluding all my
> > changes) the block lookup happens in constant time. With only a hole list it
>
> Not only lookup but also other interfaces such as fiemap; that is why it is
> called flat mode and the fast path.
>
> > won't be O(1) time; rather, we have to traverse the hole list (say, with a
> > binary search).
> > What I don't understand is: what is the purpose of tracking data extents?
> > Hope you get it.
>
> The plain and inline modes are called flat modes, which are the most common
> case for regular files and directories. You can see that's the fastest path
> for most file accesses (minimum metadata).
>
> There are 3 main reasons why we don't extend the flat modes but instead
> introduce a new sparse mode:
>  1) to introduce a complete, enhanced new extent table (or later a B+-tree);
>  2) we don't even know how many holes are in the file if we only read
>     the inode base metadata; some extra header (whether an extent or a
>     hole header) needs to be read in advance;
>  3) backward compatibility with old kernels needs to be considered: files
>     that are not sparse need to keep working properly, while files that
>     are sparse need to be blocked from being accessed by old kernels.
>
> Note that i_format is for such use, so we can introduce sparse mode
> with some enhanced on-disk representation (but with more metadata
> read amplification than flat modes).
>
> So files without holes should be treated as flat modes (fast path), and
> only then do we consider the slow path: the upcoming sparse mode.
>
> The purpose of tracking data extents is that we could then use them
> for deduplication, repeated data, or data redirection. A hole can only be
> zeros, though.
>
> Thanks,
> Gao Xiang
>
> >
> > --Pratik.
> >
>