[PATCH] erofs-utils: add --hardlink-dereference option
Gao Xiang
hsiangkao at linux.alibaba.com
Thu Dec 12 20:54:57 AEDT 2024
On 2024/12/12 17:40, Paul M wrote:
> Hi Gao,
>
> On 2024/12/11 17:11, Gao Xiang wrote:
>>
>> Hi Paul,
>>
>> On 2024/12/11 23:07, Paul Meyer wrote:
>>> Add option --hardlink-dereference to dereference hardlinks when
>>> creating an image. Instead of reusing the inode, hardlinks are added
>>> as separate inodes. This is useful for reproducible builds, when the
>>> rootfs is space-optimized using hardlinks on some machines, but not on
>>> others.
>>
>> Thanks for the patch!
>>
>> However, I don't quite follow the feature: why is it useful
>> for reproducible builds? IOW, without this feature (i.e. when
>> hardlinks are used), what exactly is the impact that you want
>> to avoid?
>
> Sure, here is our full use case: We are building an erofs image with Nix.
> Nix stores the rootfs in the nix store (/nix/store). Now there is an option
> in nix to enable store optimization via hardlinks. In case optimization is
> enabled, files with identical content are turned into hardlinks to save space,
> as nix store paths are read-only anyway. If I create a rootfs with
> two identical files, those will be hardlinked on systems with optimization
> enabled, but have different inodes on systems where optimization is disabled.
> When building an erofs, the resulting image will have one inode less on
> the system where files are hardlinked.
> The goal is to make the image bit-by-bit reproducible on both systems.
> By dereferencing hardlinks, we get the exact same image no matter whether
> the system uses hardlink optimizations or not.
>
> There is a comparable tar option with the same name.
OK, thanks for letting me know.
That makes more sense to me, but could you rename the option
to `--hard-dereference` (although I haven't verified the tar
behavior) and update usage() too?
Otherwise it looks good to me.
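
As a rough illustration of the requested rename, the sketch below wires a
`--hard-dereference` long option and a matching help line into a standalone
getopt_long() parser. The value 528 is taken from the patch; `sketch_config`,
`sketch_usage()`, `sketch_parse()` and the help wording are hypothetical names
chosen for this example and are not erofs-utils code.

```c
#include <assert.h>
#include <getopt.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the relevant part of struct erofs_configure. */
struct sketch_config {
	bool hard_dereference;
};

static const struct option sketch_long_options[] = {
	{"hard-dereference", no_argument, NULL, 528},	/* 528 as in the patch */
	{0, 0, 0, 0},
};

/* Hypothetical help text; the real usage() wording is up to the patch. */
static void sketch_usage(FILE *out)
{
	fputs(" --hard-dereference    dereference hardlinks, add links as separate inodes\n",
	      out);
}

/* Parse argv and set cfg->hard_dereference; returns 0 on success, -1 on error. */
static int sketch_parse(int argc, char *argv[], struct sketch_config *cfg)
{
	int opt;

	assert(cfg != NULL);
	optind = 1;	/* restart scanning so the function can be called again */
	while ((opt = getopt_long(argc, argv, "", sketch_long_options,
				  NULL)) != -1) {
		switch (opt) {
		case 528:
			cfg->hard_dereference = true;
			break;
		default:
			sketch_usage(stderr);
			return -1;
		}
	}
	return 0;
}
```

Calling sketch_parse() with an argv that contains `--hard-dereference` sets the
flag; any unknown option prints the help line and returns -1.
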
Thanks,
Gao Xiang
>
> Thanks,
> Paul
>
>>
>> Thanks,
>> Gao Xiang
>>
>>>
>>> Co-authored-by: Leonard Cohnen <leonard.cohnen at gmail.com>
>>> Signed-off-by: Paul Meyer <katexochen0 at gmail.com>
>>> ---
>>> include/erofs/config.h | 1 +
>>> lib/inode.c | 2 +-
>>> mkfs/main.c | 4 ++++
>>> 3 files changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/erofs/config.h b/include/erofs/config.h
>>> index cff4cea..8399afb 100644
>>> --- a/include/erofs/config.h
>>> +++ b/include/erofs/config.h
>>> @@ -58,6 +58,7 @@ struct erofs_configure {
>>> bool c_extra_ea_name_prefixes;
>>> bool c_xattr_name_filter;
>>> bool c_ovlfs_strip;
>>> + bool c_hardlink_dereference;
>>>
>>> #ifdef HAVE_LIBSELINUX
>>> struct selabel_handle *sehnd;
>>> diff --git a/lib/inode.c b/lib/inode.c
>>> index 7e5c581..5d181b3 100644
>>> --- a/lib/inode.c
>>> +++ b/lib/inode.c
>>> @@ -1141,7 +1141,7 @@ static struct erofs_inode *erofs_iget_from_srcpath(struct erofs_sb_info *sbi,
>>> * hard-link, just return it. Also don't lookup for directories
>>> * since hard-link directory isn't allowed.
>>> */
>>> - if (!S_ISDIR(st.st_mode)) {
>>> + if (!S_ISDIR(st.st_mode) && (!cfg.c_hardlink_dereference)) {
>>> inode = erofs_iget(st.st_dev, st.st_ino);
>>> if (inode)
>>> return inode;
>>> diff --git a/mkfs/main.c b/mkfs/main.c
>>> index d422787..09e39f5 100644
>>> --- a/mkfs/main.c
>>> +++ b/mkfs/main.c
>>> @@ -85,6 +85,7 @@ static struct option long_options[] = {
>>> {"mkfs-time", no_argument, NULL, 525},
>>> {"all-time", no_argument, NULL, 526},
>>> {"sort", required_argument, NULL, 527},
>>> + {"hardlink-dereference", no_argument, NULL, 528},
>>> {0, 0, 0, 0},
>>> };
>>>
>>> @@ -846,6 +847,9 @@ static int mkfs_parse_options_cfg(int argc, char *argv[])
>>> if (!strcmp(optarg, "none"))
>>> erofstar.try_no_reorder = true;
>>> break;
>>> + case 528:
>>> + cfg.c_hardlink_dereference = true;
>>> + break;
>>> case 'V':
>>> version();
>>> exit(0);
>>
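
To illustrate the behaviour that the lib/inode.c hunk toggles, here is a
minimal standalone sketch, not erofs-utils code. It assumes a builder that
reuses an inode whenever two non-directory paths share (st_dev, st_ino),
mirroring the erofs_iget() lookup in the quoted hunk. `count_image_inodes()`
and `demo_hardlinked_pair()` are hypothetical helpers, and the 64-entry table
and the /tmp location are arbitrary choices for the example.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_SEEN 64	/* sketch-only limit on tracked inodes */

/*
 * Hypothetical helper: count the inodes an image builder would emit for
 * `paths`. Without dereferencing, non-directory paths that share
 * (st_dev, st_ino) collapse into one inode; with it, every path gets its
 * own. Returns -1 if a path cannot be stat'ed.
 */
static int count_image_inodes(const char *const paths[], size_t n,
			      bool hard_dereference)
{
	struct { dev_t dev; ino_t ino; } seen[MAX_SEEN];
	size_t nseen = 0;
	int count = 0;

	assert(n <= MAX_SEEN);
	for (size_t i = 0; i < n; i++) {
		struct stat st;
		bool dup = false;

		if (lstat(paths[i], &st) < 0)
			return -1;
		/* Same condition shape as the patched check in lib/inode.c. */
		if (!S_ISDIR(st.st_mode) && !hard_dereference) {
			for (size_t j = 0; j < nseen && !dup; j++)
				dup = seen[j].dev == st.st_dev &&
				      seen[j].ino == st.st_ino;
		}
		if (dup)
			continue;	/* hardlink: reuse the existing inode */
		seen[nseen].dev = st.st_dev;
		seen[nseen].ino = st.st_ino;
		nseen++;
		count++;
	}
	return count;
}

/*
 * Create two names for one file in a fresh temporary directory and return
 * the inode count for the chosen mode, or -1 on any failure.
 */
static int demo_hardlinked_pair(bool hard_dereference)
{
	char dir[] = "/tmp/hardderef-XXXXXX";
	char a[64], b[64];
	const char *paths[2];
	int fd, ret;

	if (!mkdtemp(dir))
		return -1;
	snprintf(a, sizeof(a), "%s/a", dir);
	snprintf(b, sizeof(b), "%s/b", dir);

	fd = open(a, O_CREAT | O_WRONLY | O_EXCL, 0600);
	if (fd < 0) {
		rmdir(dir);
		return -1;
	}
	close(fd);
	if (link(a, b) < 0) {	/* b becomes a second name for a's inode */
		unlink(a);
		rmdir(dir);
		return -1;
	}

	paths[0] = a;
	paths[1] = b;
	ret = count_image_inodes(paths, 2, hard_dereference);

	unlink(b);
	unlink(a);
	rmdir(dir);
	return ret;
}
```

On a filesystem that supports hard links, demo_hardlinked_pair(false) counts
one inode and demo_hardlinked_pair(true) counts two, which is the one-inode
difference between the optimized and non-optimized Nix stores described in the
thread.
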