[PATCH v15 5/9] erofs: introduce the page cache share feature
Gao Xiang
hsiangkao at linux.alibaba.com
Mon Jan 19 18:53:21 AEDT 2026
On 2026/1/19 15:29, Christoph Hellwig wrote:
> On Sat, Jan 17, 2026 at 12:21:16AM +0800, Gao Xiang wrote:
>> Hi Christoph,
>>
>> On 2026/1/16 23:46, Christoph Hellwig wrote:
>>> I don't really understand the fingerprint idea. Files with the
>>> same content will point to the same physical disk blocks, so that
>>> should be a much better indicator than a finger print? Also how does
>>
>> Page cache sharing should apply to different EROFS
>> filesystem images on the same machine too, so the
>> physical disk block number idea cannot be applied
>> to this.
>
> Oh. That's kinda unexpected and adds another twist to the whole scheme.
> So in that case the on-disk data actually is duplicated in each image
> and then de-duplicated in memory only? Ewwww...
On-disk deduplication is decoupled from this feature:
- EROFS can share the same blocks in blobs (multiple
devices) among different images, so that on-disk data
can be shared by referring to the same blobs;
- If on-disk data isn't deduplicated in the images but
reflink is enabled on the backing fses, userspace
mounters can trigger background GCs to deduplicate the
identical blocks.
I was just trying to say that EROFS doesn't restrict what
`fingerprint` really means (it can be serialized integer
numbers defined by a specific image publisher, for
example, or a specific secure hash; currently,
"mkfs.erofs" generates a sha256 digest for each file), but
leaves that to the image builders:
1) If `fingerprint` is distributed as an on-disk part of
signed images, as I said, data can be shared within a
trusted domain_id (usually the same image builder) --
that is the top-priority use case, using dm-verity;
Or
2) If `fingerprint` is not distributed in the image,
or the images are untrusted (e.g. unknown signatures),
image fetchers can scan each inode in the golden
images to generate an extra minimal EROFS
metadata-only image with locally calculated
`fingerprint` values too, which is very similar to the
current ostree approach (parse remote files and
calculate digests).
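
As a rough illustration of option 2), the sketch below scans unpacked image trees, computes a local SHA-256 `fingerprint` for each regular file, and groups files by a (domain_id, fingerprint) key. It is only a userspace model of the idea, not the kernel or mkfs.erofs implementation: the function names (`file_fingerprint`, `scan_tree`, `sharing_groups`), the use of unpacked directories in place of real EROFS images, and the choice of key are all assumptions made for the example.

```python
import hashlib
import os
from collections import defaultdict


def file_fingerprint(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a regular file's content."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def scan_tree(root):
    """Map each regular file under root (by relative path) to its fingerprint."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            result[os.path.relpath(full, root)] = file_fingerprint(full)
    return result


def sharing_groups(trees, domain_id):
    """Group (image_name, path) pairs by the hypothetical key (domain_id, fingerprint).

    `trees` maps an image name to the directory holding its unpacked content.
    """
    groups = defaultdict(list)
    for image_name, root in trees.items():
        for rel, fp in scan_tree(root).items():
            groups[(domain_id, fp)].append((image_name, rel))
    return groups
```

Files from different images that land in the same group have identical content under the same domain, so they are the candidates for sharing page cache in this model; files with the same digest under different `domain_id` values would stay in separate groups.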
Thanks,
Gao Xiang