[RFC PATCH] staging: erofs: add document
Chao Yu
yuchao0 at huawei.com
Mon Jan 14 13:13:39 AEDT 2019
Hi Xiang,
Nice work!
A few trivial comments below; in any case, please add:
Reviewed-by: Chao Yu <yuchao0 at huawei.com>
On 2019/1/12 18:35, Gao Xiang wrote:
> This documents the key features, design, and usage of erofs.
>
> Signed-off-by: Gao Xiang <hsiangkao at aol.com>
> ---
> .../erofs/Documentation/filesystems/erofs.txt | 160 +++++++++++++++++++++
> 1 file changed, 160 insertions(+)
> create mode 100644 drivers/staging/erofs/Documentation/filesystems/erofs.txt
>
> diff --git a/drivers/staging/erofs/Documentation/filesystems/erofs.txt b/drivers/staging/erofs/Documentation/filesystems/erofs.txt
> new file mode 100644
> index 000000000000..f1d6a9701caa
> --- /dev/null
> +++ b/drivers/staging/erofs/Documentation/filesystems/erofs.txt
> @@ -0,0 +1,160 @@
> +Overview
> +========
> +
> +EROFS stands for Enhanced Read-Only File System. Unlike other read-only
> +file systems, it is designed for flexibility and scalability while
> +staying simple and high-performance.
> +
> +Here are the main features of EROFS:
> + - Little endian on-disk design;
> +
> + - 4KB block size and therefore maximum 16TB address space;
> +
> + - Metadata and data can be mixed by design;
> +
> + - 2 inode versions for different requirements:
> +                           v1            v2
> +   Inode metadata size:    32 bytes      64 bytes
> +   Max file size:          4 GB          16 EB (limited by max. vol size)
> +   Max uids/gids:          65536         4294967296
> +   File creation time:     no            yes (64 + 32-bit timestamp)
> +   Max hard links:         65536         4294967296
> +   Metadata reserved:      4 bytes       14 bytes
> +
> + - Support extended attributes (xattrs);
> +
> + - Support xattr inline and tail-end data inline for all files;
> +
> + - Support transparent data compression as an option:
> + LZ4 algorithm with 4 KB fixed-output compression for high performance;
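
The limits in the feature list follow from simple field-width arithmetic.
Below is a minimal sketch of that arithmetic, not taken from the patch.
The 32-bit block address and the 16/32-bit counter widths are my
assumptions, inferred from the numbers quoted above, and the helper names
are hypothetical:

```python
# Sketch only: field widths are assumptions inferred from the documented limits.
BLOCK_SIZE = 4 * 1024   # 4KB block size, as stated in the feature list
BLKADDR_BITS = 32       # assumption: block addresses are 32 bits wide


def max_address_space(block_size=BLOCK_SIZE, blkaddr_bits=BLKADDR_BITS):
    """Largest byte range addressable with fixed-size blocks."""
    return block_size << blkaddr_bits


def max_count(field_bits):
    """Number of distinct values an unsigned field of this width can hold."""
    return 1 << field_bits


if __name__ == "__main__":
    print("address space (TiB):", max_address_space() // 2**40)
    print("v1 uids/gids/links :", max_count(16))
    print("v2 uids/gids/links :", max_count(32))
```

Under these assumptions the 16TB figure is 2^32 blocks of 4KB each, and the
v1/v2 columns correspond to 16-bit and 32-bit fields respectively.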
> +
> +The following git tree provides the file system user-space tools under
> +development (e.g., the formatting tool mkfs.erofs):
> +>> git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git
> +
> +Bug reports and patches are welcome; please send them to the following
> +mailing list:
> +>> linux-erofs mailing list <linux-erofs at lists.ozlabs.org>
> +
> +Note that EROFS is still a work in progress as a Linux staging driver, so
> +Cc'ing the staging mailing list is strongly recommended:
> +>> Linux Driver Project Developer List <devel at driverdev.osuosl.org>
> +
> +Mount options
> +=============
> +
> +fault_injection=%d Enable fault injection in all supported types with
> + specified injection rate.
                       Supported injection type:
                       Type_Name                Type_Value
                       FAULT_KMALLOC            0x000000001
> +(no)user_xattr Setup Extended User Attributes. Note: xattr is enabled
> + by default if CONFIG_EROFS_FS_XATTR is selected.
> +(no)acl Setup POSIX Access Control List. Note: acl is enabled
> + by default if CONFIG_EROFS_FS_POSIX_ACL is selected.
> +
> +On-disk details
> +===============
> +
> +Summary
> +-------
> +Different from other read-only file systems, an EROFS volume is designed
> +to be as simple as possible:
> +
> +                                |-> aligned with the block size
> +   ____________________________________________________________
> +  | |SB| | ... | Metadata | ... | Data | Metadata | ... | Data |
> +  |_|__|_|_____|__________|_____|______|__________|_____|______|
> +  0 +1K
> +
> +All data areas should be aligned with the block size, but metadata areas
> +need not be. All metadata can currently be observed in two different
> +spaces (views):
> + 1) Inode metadata space
> + Each valid inode should be aligned with an inode slot, which is a fixed
> + value (32 bytes) and designed to be kept in line with v1 inode size.
> +
> + Each inode can be directly found with the following formula:
> + inode offset = meta_blkaddr * block_size + 32 * nid
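
To make the formula concrete, here is a small sketch of the lookup. The
function names are hypothetical, and the 4KB default block size is taken
from the feature list rather than from this section:

```python
# Sketch of: inode offset = meta_blkaddr * block_size + 32 * nid
INODE_SLOT_SIZE = 32  # fixed slot size, kept in line with the v1 inode size


def inode_offset(meta_blkaddr, nid, block_size=4096):
    """Byte offset of inode `nid` from the start of the volume."""
    return meta_blkaddr * block_size + INODE_SLOT_SIZE * nid


def inode_location(meta_blkaddr, nid, block_size=4096):
    """Split the byte offset into (block number, offset inside that block)."""
    return divmod(inode_offset(meta_blkaddr, nid, block_size), block_size)


if __name__ == "__main__":
    # With 4KB blocks, one block holds 128 slots, so nid 130 with
    # meta_blkaddr 2 lands in block 3, slot 2.
    print(inode_location(meta_blkaddr=2, nid=130))
```

Since a nid is just a slot index, no inode table is needed for lookup.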
> +
> +                                |-> aligned with 8B
> +                                           |-> followed closely
> +    + meta_blkaddr blocks                                      |-> another slot
> +     _____________________________________________________________________
> +    |  ...   | inode |  xattrs  | extents  | data inline | ... | inode ...
> +    |________|_______|(optional)|(optional)|__(optional)_|_____|__________
> +             |-> aligned with the inode slot size
> +                  .                 .
> +                .                       .
> +              .                             .
> +            .                                   .
> +          .                                         .
> +        .                                               .
> +      .____________________________________________________|-> aligned with 4B
> +      | xattr_ibody_header | shared xattrs | inline xattrs |
> +      |____________________|_______________|_______________|
> +      |->    12 bytes    <-|->x * 4 bytes<-|               .
> +                       .                   .                 .
> +                   .                       .                   .
> +               .                           .                     .
> +           ._______________________________.______________________.
> +           | id | id | id | id |  ... | id | ent | ... | ent| ... |
> +           |____|____|____|____|______|____|_____|_____|____|_____|
> +                                           |-> aligned with 4B
> +                                                       |-> aligned with 4B
> +
> + An inode can be 32 or 64 bytes; the two sizes are distinguished by a
> + field common to all inode versions, i_advise:
> +
> +       __________________               __________________
> +      |     i_advise     |             |     i_advise     |
> +      |__________________|             |__________________|
> +      |        ...       |             |        ...       |
> +      |                  |             |                  |
> +      |__________________| 32 bytes    |                  |
> +                                       |                  |
> +                                       |__________________| 64 bytes
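
The text does not say how i_advise encodes the version, so the following
is only a sketch under stated assumptions: i_advise is a 16-bit
little-endian field at offset 0 of the inode, and its lowest bit selects
the version (0 for the 32-byte v1, 1 for the 64-byte v2). The function
name is hypothetical:

```python
import struct

# Assumption (not stated in the document): bit 0 of i_advise is the version.
VERSION_MASK = 0x1


def inode_size(raw):
    """Return 32 or 64 depending on the assumed version bit in i_advise."""
    # Assumption: i_advise is a little-endian u16 at the start of the inode.
    (i_advise,) = struct.unpack_from("<H", raw, 0)
    return 64 if (i_advise & VERSION_MASK) else 32


if __name__ == "__main__":
    v1 = bytes(32)                    # i_advise == 0
    v2 = b"\x01\x00" + bytes(62)      # i_advise == 1
    print(inode_size(v1), inode_size(v2))
```

Because the field sits at the same place in both layouts, a reader only
needs the first slot of an inode to learn how many bytes to parse.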
> +
> + Xattrs, extents, and inline data follow the corresponding inode with
> + proper alignment, and each is optional depending on the data mapping.
> + Currently there are 3 valid data mappings:
> +
> + 1) flat file data without data inline (no extent);
> + 2) fixed-output size data compression (must have extents);
> + 3) flat file data with tail-end data inline (no extent);
> +
> + The size of the optional xattrs is indicated by i_xattr_count in inode
> + header. Large xattrs or xattrs shared by many different files can be
> + stored in shared xattrs metadata rather than inlined right after inode.
> +
> + 2) Shared xattrs metadata space
> + The shared xattrs space is similar to the inode space above: it starts
> + at the block indicated by xattr_blkaddr, and entries are laid out one
> + after another with proper alignment.
> +
> + Each shared xattr can be found with the following formula:
> + xattr offset = xattr_blkaddr * block_size + 4 * xattr_id
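
The same kind of sketch works for the shared xattr formula. The function
name is hypothetical and the 4KB default block size comes from the feature
list:

```python
# Sketch of: xattr offset = xattr_blkaddr * block_size + 4 * xattr_id
XATTR_ALIGN = 4  # shared xattr entries are addressed in 4-byte units


def shared_xattr_offset(xattr_blkaddr, xattr_id, block_size=4096):
    """Byte offset of the shared xattr entry identified by `xattr_id`."""
    return xattr_blkaddr * block_size + XATTR_ALIGN * xattr_id


if __name__ == "__main__":
    offset = shared_xattr_offset(xattr_blkaddr=10, xattr_id=3)
    print(offset, offset % XATTR_ALIGN == 0)
```

Any offset produced this way is 4-byte aligned by construction, which
matches the alignment notes in the diagram.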
> +
> +             |-> aligned by 4 bytes
> +    + xattr_blkaddr blocks                     |-> aligned with 4 bytes
> +     _________________________________________________________________________
> +    |  ...   | xattr_entry |  xattr data | ... |  xattr_entry | xattr data  ...
> +    |________|_____________|_____________|_____|______________|_______________
> +
> +Directories
> +-----------
> +All directories are now organized in a compact on-disk format. Note that
> +each directory block is divided into index and name areas in order to
> +support random file lookup, and all directory entries are strictly written
> +in alphabetical order to support an improved prefix binary search
> +algorithm.
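
The document does not spell out the search itself, so here is one possible
sketch of a prefix binary search over sorted names, not the actual erofs
code. It assumes "alphabetical order" means byte-wise order, and all names
below are hypothetical. The idea is to remember how many leading bytes of
the target already matched at both ends of the range, so that each
comparison can skip that common prefix:

```python
def match_len(name, target, start):
    """Length of the common prefix of name and target, scanning from start."""
    i = start
    limit = min(len(name), len(target))
    while i < limit and name[i] == target[i]:
        i += 1
    return i


def prefix_binary_search(names, target):
    """Index of target in the byte-wise sorted list `names`, or -1."""
    lo, hi = 0, len(names) - 1
    lo_match = hi_match = 0  # bytes known to match just outside [lo, hi]
    while lo <= hi:
        mid = (lo + hi) // 2
        name = names[mid]
        # Every name inside [lo, hi] shares at least this many bytes
        # with the target, so comparison can start there.
        matched = match_len(name, target, min(lo_match, hi_match))
        if matched == len(name) == len(target):
            return mid
        name_is_smaller = matched == len(name) or (
            matched < len(target) and name[matched] < target[matched]
        )
        if name_is_smaller:
            lo, lo_match = mid + 1, matched
        else:
            hi, hi_match = mid - 1, matched
    return -1


if __name__ == "__main__":
    entries = sorted([b"bin", b"boot", b"dev", b"etc", b"lib", b"lib64", b"usr"])
    print(prefix_binary_search(entries, b"lib64"))
```

This only pays off when many names share long prefixes; for short names it
behaves like an ordinary binary search.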
> +
> +
> +                 +--------------------------+
> +                /                           |
> +               /              +-------------+----------------+
> +              /              /             \|/namelen1      \|/ namelenN-1
                                               |                |
                                               v                v
> + ____________+______________+___________________________________________
> +| dirent | dirent | ... | dirent | filename | filename | ... | filename |
> +|____0___|____1___|_____|___N-1__|____0_____|____1_____|_____|___N-1____|
> +     \                          /|\                         * could have
                                    ^
                                    |
> +      \                          |                            trailing '\0'
> +       \                         |
> +        +------------------------+ namelen0
> +
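
Reading the diagram, each name length seems to be implied by where the
next name starts, and only the last name may carry trailing '\0' padding.
A sketch of that slicing follows; it assumes each dirent records the start
offset of its name within the block, which the document does not state,
and the function and parameter names are hypothetical:

```python
def split_names(block, name_offsets):
    """Recover file names from one directory block.

    `name_offsets` holds the assumed per-dirent name start offsets, in
    dirent order. Name i ends where name i+1 begins; the last name runs
    to the end of the block and may be followed by '\\0' padding.
    """
    names = []
    last = len(name_offsets) - 1
    for i, start in enumerate(name_offsets):
        end = name_offsets[i + 1] if i < last else len(block)
        name = block[start:end]
        if i == last:
            # Cut the last name at the first NUL, if there is one.
            name = name.split(b"\0", 1)[0]
        names.append(name)
    return names


if __name__ == "__main__":
    # Six placeholder bytes stand in for the index area in this toy block.
    toy_block = b"\xff" * 6 + b"binetcusr" + b"\0" * 3
    print(split_names(toy_block, [6, 9, 12]))
```

Storing no explicit length keeps the index area small, at the cost of one
extra lookup into the next dirent.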
>