[PATCH v2] erofs-utils: mkfs: Implement 'dsunit' alignment on blobdev
Friendy.Su at sony.com
Fri Aug 22 19:19:47 AEST 2025
> + off_t off = lseek(blobfile, 0, SEEK_CUR);
> +
> + erofs_dbg("Try to round up 0x%llx to align on %d blocks (dsunit)",
> + off, sbi->bmgr->dsunit);
> + off = roundup(off, sbi->bmgr->dsunit * erofs_blksiz(sbi));
> + if (lseek(blobfile, off, SEEK_SET) != off) {
> + ret = -errno;
> + erofs_err("lseek to blobdev 0x%llx error", off);
> + goto err;
> + }
> + erofs_dbg("Aligned on 0x%llx", off);
> Could we combine these two debugging messages into one?
Here, 'off' changes after roundup(), and we need to show both the 'before' and 'after' values using the single variable 'off', so it is hard to combine the two messages.
Do you have a better idea? ^_^
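One possible approach, as a minimal sketch only: keep the pre-rounding offset in a second local so that a single message can print both values. The helper name align_blob_offset, the ROUNDUP macro, and the use of fprintf() in place of erofs_dbg() are assumptions made for illustration; this is not code from erofs-utils.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Round x up to the next multiple of y (y > 0). */
#define ROUNDUP(x, y) ((((x) + (y) - 1) / (y)) * (y))

/*
 * Hypothetical helper, not part of erofs-utils: move the file offset of
 * fd forward to the next multiple of dsunit * blksiz bytes and report
 * both the old and the new offset in one debug message.
 * Returns the aligned offset, or -errno on failure.
 */
static off_t align_blob_offset(int fd, unsigned int dsunit,
			       unsigned int blksiz)
{
	off_t oldoff = lseek(fd, 0, SEEK_CUR);
	off_t off;

	if (oldoff < 0)
		return -errno;

	off = ROUNDUP(oldoff, (off_t)dsunit * blksiz);
	if (lseek(fd, off, SEEK_SET) != off)
		return -errno;

	/* fprintf() stands in for erofs_dbg() in this sketch */
	fprintf(stderr, "Aligned 0x%llx to 0x%llx (dsunit %u blocks)\n",
		(unsigned long long)oldoff, (unsigned long long)off, dsunit);
	return off;
}
```

With blksiz=4096 and dsunit=512, an offset that is not yet a multiple of 2M is moved up to the next 2M boundary, and an offset that is already aligned is left unchanged.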
> One tab is not 8 spaces here? it seems indent misalignment.
That misalignment is my mistake. I will correct it.
Thanks for your comment.
Friendy
________________________________________
From: Su, Friendy <Friendy.Su at sony.com>
Sent: Friday, August 22, 2025 17:05
To: Gao Xiang; linux-erofs at lists.ozlabs.org
Cc: Mo, Yuezhang; Palmer, Daniel (SGC)
Subject: Re: [PATCH v2] erofs-utils: mkfs: Implement 'dsunit' alignment on blobdev
Hi, Gao,
> It should be
> if (sbi->bmgr->dsunit >= 1u << (cfg.c_chunkbits - g_sbi.blkszbits)) {
> }
In main.c, dsunit is set to 0 whenever the warning is emitted:
+ if (cfg.c_chunkbits && dsunit && 1u << (cfg.c_chunkbits - g_sbi.blkszbits) < dsunit) {
+ erofs_warn("chunksize %u bytes is smaller than dsunit %u blocks, ignore dsunit !",
+ 1u << cfg.c_chunkbits, dsunit);
+ dsunit = 0;
+ }
So in that case sbi->bmgr->dsunit is already 0 here.
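To make the effect of that check concrete, here is a minimal standalone sketch of the same condition. The function name effective_dsunit is hypothetical, fprintf() stands in for erofs_warn(), and the sketch assumes chunkbits is either 0 (unset) or not smaller than blkszbits.

```c
#include <stdio.h>

/*
 * Hypothetical helper mirroring the condition quoted from mkfs/main.c:
 * return the dsunit (in blocks) that stays in effect. The result is 0
 * when a chunk size is set and the chunk, counted in blocks, is smaller
 * than dsunit.
 * Assumes chunkbits == 0 (unset) or chunkbits >= blkszbits.
 */
static unsigned int effective_dsunit(unsigned int chunkbits,
				     unsigned int blkszbits,
				     unsigned int dsunit)
{
	if (chunkbits && dsunit &&
	    (1u << (chunkbits - blkszbits)) < dsunit) {
		/* fprintf() stands in for erofs_warn() in this sketch */
		fprintf(stderr,
			"chunksize %u bytes is smaller than dsunit %u blocks, ignore dsunit\n",
			1u << chunkbits, dsunit);
		return 0;
	}
	return dsunit;
}
```

For example, with a 4096-byte block size (blkszbits=12), chunksize=4096 (chunkbits=12) and dsunit=512 yield 0, while chunksize=2M (chunkbits=21) keeps dsunit=512.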
________________________________________
From: Gao Xiang <hsiangkao at linux.alibaba.com>
Sent: Friday, August 22, 2025 16:55
To: Su, Friendy; linux-erofs at lists.ozlabs.org
Cc: Mo, Yuezhang; Palmer, Daniel (SGC)
Subject: Re: [PATCH v2] erofs-utils: mkfs: Implement 'dsunit' alignment on blobdev
Hi Friendy,
On 2025/8/22 16:42, Friendy Su wrote:
> Set proper 'dsunit' to let file body align on huge page on blobdev,
>
> where 'dsunit' * 'blocksize' = huge page size (2M).
>
> When do mmap() a file mounted with dax=always, aligning on huge page
> makes kernel map huge page(2M) per page fault exception, compared with
> mapping normal page(4K) per page fault.
>
> This greatly improves mmap() performance by reducing times of page
> fault being triggered.
>
> Considering deduplication, 'chunksize' should not be smaller than
> 'dsunit', then after deduplication, still align on dsunit.
>
> Signed-off-by: Friendy Su <friendy.su at sony.com>
> Reviewed-by: Yuezhang Mo <Yuezhang.Mo at sony.com>
> Reviewed-by: Daniel Palmer <daniel.palmer at sony.com>
> ---
> lib/blobchunk.c | 15 +++++++++++++++
> man/mkfs.erofs.1 | 15 +++++++++++++++
> mkfs/main.c | 13 +++++++++++++
> 3 files changed, 43 insertions(+)
>
> diff --git a/lib/blobchunk.c b/lib/blobchunk.c
> index bbc69cf..e47afb5 100644
> --- a/lib/blobchunk.c
> +++ b/lib/blobchunk.c
> @@ -309,6 +309,21 @@ int erofs_blob_write_chunked_file(struct erofs_inode *inode, int fd,
> minextblks = BLK_ROUND_UP(sbi, inode->i_size);
> interval_start = 0;
>
> + /* Align file on 'dsunit' */
> + if (sbi->bmgr->dsunit > 1) {
It should be
if (sbi->bmgr->dsunit >= 1u << (cfg.c_chunkbits - g_sbi.blkszbits)) {
}
?
> + off_t off = lseek(blobfile, 0, SEEK_CUR);
> +
> + erofs_dbg("Try to round up 0x%llx to align on %d blocks (dsunit)",
> + off, sbi->bmgr->dsunit);
> + off = roundup(off, sbi->bmgr->dsunit * erofs_blksiz(sbi));
> + if (lseek(blobfile, off, SEEK_SET) != off) {
> + ret = -errno;
> + erofs_err("lseek to blobdev 0x%llx error", off);
> + goto err;
> + }
> + erofs_dbg("Aligned on 0x%llx", off);
Could we combine these two debugging messages into one?
> + }
> +
> for (pos = 0; pos < inode->i_size; pos += len) {
> #ifdef SEEK_DATA
> off_t offset = lseek(fd, pos + startoff, SEEK_DATA);
> diff --git a/man/mkfs.erofs.1 b/man/mkfs.erofs.1
> index 63f7a2f..9075522 100644
> --- a/man/mkfs.erofs.1
> +++ b/man/mkfs.erofs.1
> @@ -168,6 +168,21 @@ the output filesystem, with no leading /.
> .TP
> .BI "\-\-dsunit=" #
> Align all data block addresses to multiples of #.
> +
> +If \fBdsunit\fR and \fBchunksize\fR are both set, \fBdsunit\fR will be ignored
> +if it is bigger than \fBchunksize\fR.
> +
> +This is for keeping alignment after deduplication.
> +If \fBdsunit\fR is bigger, it contains several chunks,
> +
> +E.g. \fBblock-size\fR=4096, \fBdsunit\fR=512 (2M), \fBchunksize\fR=4096
> +
> +Once 1 chunk is deduplicated, the chunks thereafter will not be aligned any
> +longer. In order to achieve the best performance, recommend to set \fBdsunit\fR
> +same as \fBchunksize\fR.
> +
> +E.g. \fBblock-size\fR=4096, \fBdsunit\fR=512 (2M), \fBchunksize\fR=$((4096*512))
> +
> .TP
> .BI "\-\-exclude-path=" path
> Ignore file that matches the exact literal path.
> diff --git a/mkfs/main.c b/mkfs/main.c
> index 30804d1..fcb2b89 100644
> --- a/mkfs/main.c
> +++ b/mkfs/main.c
> @@ -1098,6 +1098,19 @@ static int mkfs_parse_options_cfg(int argc, char *argv[])
> return -EINVAL;
> }
>
> + /*
> + * once align data on dsunit, in order to keep alignment after deduplication
> + * chunksize should be equal to or bigger than dsunit.
> + * if chunksize is smaller than dsunit, e.g. chunksize=4k, dsunit=2M,
> + * once a chunk is deduplicated, all data thereafter will be unaligned.
> + * so ignore dsunit under such case.
> + */
> + if (cfg.c_chunkbits && dsunit && 1u << (cfg.c_chunkbits - g_sbi.blkszbits) < dsunit) {
> + erofs_warn("chunksize %u bytes is smaller than dsunit %u blocks, ignore dsunit !",
> + 1u << cfg.c_chunkbits, dsunit);
Is one tab not 8 spaces here? The indentation seems misaligned.
Thanks,
Gao Xiang