[PATCH v6 2/2] erofs: add multiple device support

Gao Xiang xiang at kernel.org
Sun Oct 17 15:15:24 AEDT 2021


Hi Chao,

On Sun, Oct 17, 2021 at 10:10:15AM +0800, Chao Yu wrote:
> On 2021/10/14 16:10, Gao Xiang wrote:
> > In order to support multi-layer container images, add multiple
> > device feature to EROFS. Two ways are available to use for now:
> > 
> >   - Devices can be mapped into 32-bit global block address space;
> >   - Device ID can be specified with the chunk indexes format.
> > 
> > Note that it assumes no extent would cross device boundary and mkfs
> > should take care of it seriously.
> > 
> > In the future, a dedicated device manager could be introduced then
> > thus extra devices can be automatically scanned by UUID as well.
> > 
> > Cc: Chao Yu <chao at kernel.org>
> > Reviewed-by: Liu Bo <bo.liu at linux.alibaba.com>
> > Signed-off-by: Gao Xiang <hsiangkao at linux.alibaba.com>
> > ---
> > changes since v5:
> >   - update the outdated comment of on-disk device id;
> >   - add some description about device_id_mask: which is calculated by
> >     using valid bits of extra_devices + 1. Thus the rest bits can be
> >     used for userdata to record extra information.
> > 
> >   Documentation/filesystems/erofs.rst |  12 ++-
> >   fs/erofs/Kconfig                    |  24 +++--
> >   fs/erofs/data.c                     |  73 ++++++++++---
> >   fs/erofs/erofs_fs.h                 |  22 +++-
> >   fs/erofs/internal.h                 |  35 ++++++-
> >   fs/erofs/super.c                    | 156 ++++++++++++++++++++++++++--
> >   fs/erofs/zdata.c                    |  20 +++-
> >   7 files changed, 296 insertions(+), 46 deletions(-)
> > 
> > diff --git a/Documentation/filesystems/erofs.rst b/Documentation/filesystems/erofs.rst
> > index b97579b7d8fb..01df283c7d04 100644
> > --- a/Documentation/filesystems/erofs.rst
> > +++ b/Documentation/filesystems/erofs.rst
> > @@ -19,9 +19,10 @@ It is designed as a better filesystem solution for the following scenarios:
> >      immutable and bit-for-bit identical to the official golden image for
> >      their releases due to security and other considerations and
> > - - hope to save some extra storage space with guaranteed end-to-end performance
> > -   by using reduced metadata and transparent file compression, especially
> > -   for those embedded devices with limited memory (ex, smartphone);
> > + - hope to minimize extra storage space with guaranteed end-to-end performance
> > +   by using compact layout, transparent file compression and direct access,
> > +   especially for those embedded devices with limited memory and high-density
> > +   hosts with numerous containers;
> >   Here is the main features of EROFS:
> > @@ -51,7 +52,9 @@ Here is the main features of EROFS:
> >    - Support POSIX.1e ACLs by using xattrs;
> >    - Support transparent data compression as an option:
> > -   LZ4 algorithm with the fixed-sized output compression for high performance.
> > +   LZ4 algorithm with the fixed-sized output compression for high performance;
> > +
> > + - Multiple device support for multi-layer container images.
> >   The following git tree provides the file system user-space tools under
> >   development (ex, formatting tool mkfs.erofs):
> > @@ -87,6 +90,7 @@ cache_strategy=%s      Select a strategy for cached decompression from now on:
> >   dax={always,never}     Use direct access (no page cache).  See
> >                          Documentation/filesystems/dax.rst.
> >   dax                    A legacy option which is an alias for ``dax=always``.
> > +device=%s              Specify a path to an extra device to be used together.
> >   ===================    =========================================================
> >   On-disk details
> > diff --git a/fs/erofs/Kconfig b/fs/erofs/Kconfig
> > index 14b747026742..addfe608d08e 100644
> > --- a/fs/erofs/Kconfig
> > +++ b/fs/erofs/Kconfig
> > @@ -6,16 +6,22 @@ config EROFS_FS
> >   	select FS_IOMAP
> >   	select LIBCRC32C
> >   	help
> > -	  EROFS (Enhanced Read-Only File System) is a lightweight
> > -	  read-only file system with modern designs (eg. page-sized
> > -	  blocks, inline xattrs/data, etc.) for scenarios which need
> > -	  high-performance read-only requirements, e.g. Android OS
> > -	  for mobile phones and LIVECDs.
> > +	  EROFS (Enhanced Read-Only File System) is a lightweight read-only
> > +	  file system with modern designs (e.g. no buffer heads, inline
> > +	  xattrs/data, chunk-based deduplication, multiple devices, etc.) for
> > +	  scenarios which need high-performance read-only solutions, e.g.
> > +	  smartphones with Android OS, LiveCDs and high-density hosts with
> > +	  numerous containers;
> > -	  It also provides fixed-sized output compression support,
> > -	  which improves storage density, keeps relatively higher
> > -	  compression ratios, which is more useful to achieve high
> > -	  performance for embedded devices with limited memory.
> > +	  It also provides fixed-sized output compression support in order to
> > +	  improve storage density as well as keep relatively higher compression
> > +	  ratios and implements in-place decompression to reuse the file page
> > +	  for compressed data temporarily with proper strategies, which is
> > +	  quite useful to ensure guaranteed end-to-end runtime decompression
> > +	  performance under extremely memory pressure without extra cost.
> > +
> > +	  See the documentation at <file:Documentation/filesystems/erofs.rst>
> > +	  for more details.
> >   	  If unsure, say N.
> > diff --git a/fs/erofs/data.c b/fs/erofs/data.c
> > index 9db829715652..808234d9190c 100644
> > --- a/fs/erofs/data.c
> > +++ b/fs/erofs/data.c
> > @@ -89,6 +89,7 @@ static int erofs_map_blocks(struct inode *inode,
> >   	erofs_off_t pos;
> >   	int err = 0;
> > +	map->m_deviceid = 0;
> >   	if (map->m_la >= inode->i_size) {
> >   		/* leave out-of-bound access unmapped */
> >   		map->m_flags = 0;
> > @@ -135,14 +136,8 @@ static int erofs_map_blocks(struct inode *inode,
> >   		map->m_flags = 0;
> >   		break;
> >   	default:
> > -		/* only one device is supported for now */
> > -		if (idx->device_id) {
> > -			erofs_err(sb, "invalid device id %u @ %llu for nid %llu",
> > -				  le16_to_cpu(idx->device_id),
> > -				  chunknr, vi->nid);
> > -			err = -EFSCORRUPTED;
> > -			goto out_unlock;
> > -		}
> > +		map->m_deviceid = le16_to_cpu(idx->device_id) &
> > +			EROFS_SB(sb)->device_id_mask;
> >   		map->m_pa = blknr_to_addr(le32_to_cpu(idx->blkaddr));
> >   		map->m_flags = EROFS_MAP_MAPPED;
> >   		break;
> > @@ -155,11 +150,55 @@ static int erofs_map_blocks(struct inode *inode,
> >   	return err;
> >   }
> > +int erofs_map_dev(struct super_block *sb, struct erofs_map_dev *map)
> > +{
> > +	struct erofs_dev_context *devs = EROFS_SB(sb)->devs;
> > +	struct erofs_device_info *dif;
> > +	int id;
> > +
> > +	/* primary device by default */
> > +	map->m_bdev = sb->s_bdev;
> > +	map->m_daxdev = EROFS_SB(sb)->dax_dev;
> > +
> > +	if (map->m_deviceid) {
> > +		down_read(&devs->rwsem);
> > +		dif = idr_find(&devs->tree, map->m_deviceid - 1);
> > +		if (!dif) {
> > +			up_read(&devs->rwsem);
> > +			return -ENODEV;
> > +		}
> > +		map->m_bdev = dif->bdev;
> > +		map->m_daxdev = dif->dax_dev;
> > +		up_read(&devs->rwsem);
> > +	} else if (devs->extra_devices) {
> > +		down_read(&devs->rwsem);
> > +		idr_for_each_entry(&devs->tree, dif, id) {
> > +			erofs_off_t startoff, length;
> > +
> > +			if (!dif->mapped_blkaddr)
> > +				continue;
> > +			startoff = blknr_to_addr(dif->mapped_blkaddr);
> > +			length = blknr_to_addr(dif->blocks);
> > +
> > +			if (map->m_pa >= startoff &&
> > +			    map->m_pa < startoff + length) {
> > +				map->m_pa -= startoff;
> > +				map->m_bdev = dif->bdev;
> > +				map->m_daxdev = dif->dax_dev;
> > +				break;
> 
> A file won't span multiple devices, right? Otherwise the mapped length needs
> to be shrunk as well.

Thanks for your review.

A file can be located on multiple devices. That is intended: as I mentioned in
the commit message, no extent crosses a device boundary, and mkfs strictly
guarantees this. Otherwise it would be more complicated to handle (especially
on the compression side) with no added benefit.
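
To illustrate the invariant, here is a rough userspace sketch (not the kernel
code). It mirrors the flat-address lookup in erofs_map_dev() above and adds a
check that an extent stays inside one device. The names dev_info, find_device,
extent_fits and the 4 KiB block size (BLKBITS) are assumptions made up for
this example only; the check itself is what mkfs is expected to enforce, it is
not something this patch does at runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLKBITS 12u /* assumption: 4 KiB blocks */

/* Hypothetical stand-in for the fields of struct erofs_device_info used here. */
struct dev_info {
	uint32_t mapped_blkaddr; /* first block in the flat space; 0 = not mapped */
	uint32_t blocks;         /* device size in blocks */
};

/*
 * Return the index of the extra device that contains byte address pa, or -1
 * if pa falls on the primary device. *dev_off receives the address relative
 * to the start of the matching device (pa itself for the primary device).
 */
static int find_device(const struct dev_info *devs, size_t n,
		       uint64_t pa, uint64_t *dev_off)
{
	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		uint64_t start, length;

		if (!devs[i].mapped_blkaddr)
			continue;
		start = (uint64_t)devs[i].mapped_blkaddr << BLKBITS;
		length = (uint64_t)devs[i].blocks << BLKBITS;
		if (pa >= start && pa < start + length) {
			*dev_off = pa - start;
			return (int)i;
		}
	}
	*dev_off = pa;
	return -1;
}

/* Return 1 if the extent [pa, pa + len) stays inside a single device. */
static int extent_fits(const struct dev_info *devs, size_t n,
		       uint64_t pa, uint64_t len)
{
	uint64_t off;
	int idx = find_device(devs, n, pa, &off);

	if (idx >= 0)
		return off + len <= ((uint64_t)devs[idx].blocks << BLKBITS);

	/* Primary device: the extent must end before any mapped device starts. */
	for (size_t i = 0; i < n; i++) {
		uint64_t start = (uint64_t)devs[i].mapped_blkaddr << BLKBITS;

		if (devs[i].mapped_blkaddr && start > pa && start < pa + len)
			return 0;
	}
	return 1;
}
```

With such a guarantee in place, the lookup only has to translate the start
address, and the mapped length never needs to be trimmed at a device boundary.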

Thanks,
Gao Xiang

> 
> Thanks,

