[PATCH] erofs-utils: add --hardlink-dereference option

Gao Xiang hsiangkao at linux.alibaba.com
Thu Dec 12 20:54:57 AEDT 2024



On 2024/12/12 17:40, Paul M wrote:
> Hi Gao,
> 
> On 2024/12/11 17:11, Gao Xiang wrote:
>>
>> Hi Paul,
>>
>> On 2024/12/11 23:07, Paul Meyer wrote:
>>> Add option --hardlink-dereference to dereference hardlinks when
>>> creating an image. Instead of reusing the inode, hardlinks are added
>>> as separate inodes. This is useful for reproducible builds, when the
>>> rootfs is space-optimized using hardlinks on some machines, but not on
>>> others.
>>
>> Thanks for the patch!
>>
>> However, I don't quite understand the feature: why is it useful
>> for reproducible builds? In other words, without this feature
>> (i.e. when hardlinks are used), what exactly is the impact that
>> you want to avoid?
> 
> Sure, here is our full use case: We are building an erofs image with Nix.
> Nix stores the rootfs in the Nix store (/nix/store). Nix has an option
> to enable store optimization via hardlinks. When optimization is
> enabled, files with identical content are turned into hardlinks to save space,
> as Nix store paths are read-only anyway. If I create a rootfs with
> two identical files, those will be hardlinked on systems with optimization
> enabled, but have different inodes on systems where optimization is disabled.
> When building an erofs, the resulting image will have one inode less on
> the system where files are hardlinked.
> The goal is to make the image bit-by-bit reproducible on both systems.
> By dereferencing hardlinks, we get the exact same image no matter whether
> the system uses hardlink optimizations or not.
> 
> There is a comparable tar option with the same name.

OK, thanks for the explanation.
That makes more sense to me, but could you rename the option to
`--hard-dereference` (I haven't verified the tar behavior, though)
and update usage() too?

Otherwise it looks good to me.
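
For reference, a minimal sketch of the effect described above. This is
not the erofs-utils code: `struct src_file`, `count_image_inodes()` and
the `hard_dereference` parameter are hypothetical names used only for
illustration. It assumes hardlinks are detected by a matching
(device, inode number) pair, and that directories are never looked up,
as in the patched check in lib/inode.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the stat() fields the check relies on. */
struct src_file {
	dev_t dev;	/* device of the source file */
	ino_t ino;	/* inode number on that device */
	bool is_dir;	/* directories are never treated as hardlinks */
};

/*
 * Count how many inodes an image would contain for the given source
 * files. A non-directory that shares (dev, ino) with an earlier
 * non-directory entry reuses that inode, unless hard_dereference is set.
 */
static size_t count_image_inodes(const struct src_file *files, size_t n,
				 bool hard_dereference)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		bool reused = false;

		/* Same shape as the patched condition: no lookup for
		 * directories, and no lookup when dereferencing. */
		if (!files[i].is_dir && !hard_dereference) {
			for (size_t j = 0; j < i; j++) {
				if (!files[j].is_dir &&
				    files[j].dev == files[i].dev &&
				    files[j].ino == files[i].ino) {
					reused = true;
					break;
				}
			}
		}
		if (!reused)
			count++;
	}
	return count;
}
```

With two identical files that are hardlinked on one machine (same
device and inode number) but separate on another, the count differs by
one between the two machines unless hard_dereference is set.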

Thanks,
Gao Xiang

> 
> Thanks,
> Paul
> 
>>
>> Thanks,
>> Gao Xiang
>>
>>>
>>> Co-authored-by: Leonard Cohnen <leonard.cohnen at gmail.com>
>>> Signed-off-by: Paul Meyer <katexochen0 at gmail.com>
>>> ---
>>>    include/erofs/config.h | 1 +
>>>    lib/inode.c            | 2 +-
>>>    mkfs/main.c            | 4 ++++
>>>    3 files changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/erofs/config.h b/include/erofs/config.h
>>> index cff4cea..8399afb 100644
>>> --- a/include/erofs/config.h
>>> +++ b/include/erofs/config.h
>>> @@ -58,6 +58,7 @@ struct erofs_configure {
>>>        bool c_extra_ea_name_prefixes;
>>>        bool c_xattr_name_filter;
>>>        bool c_ovlfs_strip;
>>> +     bool c_hardlink_dereference;
>>>
>>>    #ifdef HAVE_LIBSELINUX
>>>        struct selabel_handle *sehnd;
>>> diff --git a/lib/inode.c b/lib/inode.c
>>> index 7e5c581..5d181b3 100644
>>> --- a/lib/inode.c
>>> +++ b/lib/inode.c
>>> @@ -1141,7 +1141,7 @@ static struct erofs_inode *erofs_iget_from_srcpath(struct erofs_sb_info *sbi,
>>>         * hard-link, just return it. Also don't lookup for directories
>>>         * since hard-link directory isn't allowed.
>>>         */
>>> -     if (!S_ISDIR(st.st_mode)) {
>>> +     if (!S_ISDIR(st.st_mode) && (!cfg.c_hardlink_dereference)) {
>>>                inode = erofs_iget(st.st_dev, st.st_ino);
>>>                if (inode)
>>>                        return inode;
>>> diff --git a/mkfs/main.c b/mkfs/main.c
>>> index d422787..09e39f5 100644
>>> --- a/mkfs/main.c
>>> +++ b/mkfs/main.c
>>> @@ -85,6 +85,7 @@ static struct option long_options[] = {
>>>        {"mkfs-time", no_argument, NULL, 525},
>>>        {"all-time", no_argument, NULL, 526},
>>>        {"sort", required_argument, NULL, 527},
>>> +     {"hardlink-dereference", no_argument, NULL, 528},
>>>        {0, 0, 0, 0},
>>>    };
>>>
>>> @@ -846,6 +847,9 @@ static int mkfs_parse_options_cfg(int argc, char *argv[])
>>>                        if (!strcmp(optarg, "none"))
>>>                                erofstar.try_no_reorder = true;
>>>                        break;
>>> +             case 528:
>>> +                     cfg.c_hardlink_dereference = true;
>>> +                     break;
>>>                case 'V':
>>>                        version();
>>>                        exit(0);
>>


