[PATCH RFC 2/4] erofs: introduce page cache share feature
Hongzhen Luo
hongzhen at linux.alibaba.com
Sat Jul 5 10:54:48 AEST 2025
On 2025/7/5 05:06, Gao Xiang wrote:
> Hi Christian,
>
> On 2025/7/3 20:23, Christian Brauner wrote:
>
> ...
>
>> diff --git a/fs/erofs/pagecache_share.c b/fs/erofs/pagecache_share.c
>> new file mode 100644
>> index 000000000000..309b33cc6c30
>> --- /dev/null
>> +++ b/fs/erofs/pagecache_share.c
>> @@ -0,0 +1,204 @@
>> +// SPDX-License-Identifier: GPL-2.0-or-later
>> +/*
>> + * Copyright (C) 2024, Alibaba Cloud
>> + */
>> +#include <linux/xxhash.h>
>> +#include <linux/refcount.h>
>> +#include "pagecache_share.h"
>> +#include "internal.h"
>> +#include "xattr.h"
>> +
>> +#define PCS_FPRT_IDX 4
>> +#define PCS_FPRT_NAME "erofs.fingerprint"
>> +#define PCS_FPRT_MAXLEN (sizeof(size_t) + 1024)
>
> One thing I told Hongzhen to work on is that I really
> don't like a hardcoded xattr name like this.
>
> Because EROFS can store common long xattr prefixes, see:
> https://docs.kernel.org/filesystems/erofs.html#long-extended-attribute-name-prefixes
>
>
> So it would be nice to just record a name prefix in the
> ondisk superblock so that users can use their own xattr
> names for this usage.
>
> For example, users could use the "overlay.metacopy" xattr
> as the page cache sharing fingerprint to identify different
> inodes if the overlayfs fs-verity feature is on, see:
> https://docs.kernel.org/filesystems/overlayfs.html#fs-verity-support
Yes, please refer to the latest patch series here:
https://lore.kernel.org/all/20250301145002.2420830-3-hongzhen@linux.alibaba.com/
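To illustrate the idea discussed above (the fingerprint xattr name comes
from the superblock instead of a hardcoded PCS_FPRT_NAME), here is a minimal
userspace sketch. It is not the code from the patch series: struct demo_sb,
struct demo_xattr and demo_get_fingerprint() are hypothetical stand-ins, and
the real on-disk format records a long xattr name prefix rather than a plain
string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one xattr attached to an inode. */
struct demo_xattr {
	const char *name;
	const char *value;
};

/*
 * Hypothetical stand-in for the superblock: it records which xattr
 * name acts as the page cache sharing fingerprint. NULL means the
 * image does not enable sharing.
 */
struct demo_sb {
	const char *fprt_xattr_name;
};

/*
 * Return the fingerprint value of an inode, or NULL if sharing is
 * disabled or the inode does not carry the configured xattr.
 */
const char *demo_get_fingerprint(const struct demo_sb *sb,
				 const struct demo_xattr *xattrs, size_t n)
{
	size_t i;

	assert(sb != NULL);
	if (!sb->fprt_xattr_name)
		return NULL;

	for (i = 0; i < n; i++)
		if (strcmp(xattrs[i].name, sb->fprt_xattr_name) == 0)
			return xattrs[i].value;
	return NULL;
}
```

With this shape, an image built for overlayfs could set the configured name
to "overlay.metacopy", while another image could keep a dedicated name such
as "erofs.fingerprint", without any change to the lookup code.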
>
> But if you really don't have more time to dig into the EROFS
> internals here, you could just leave it as-is. I can
> handle it myself.
>
>> +
>
> ...
>
>> +}
>> +
>> +/*
>> + * TODO: Hm, could we leverage our fancy new backing file infrastructure
>> + * as for overlayfs and fuse?
>
> If some code can be lifted up as a vfs helper, that would be
> very helpful, just as the backing file infrastructure was lifted
> from overlayfs.
>
> But I'm not sure if it's really needed for now anyway
> because it's only EROFS-specific, and I only maintain and
> can only speak for EROFS.
>
>> + */
>> +static struct file *erofs_pcs_alloc_file(struct file *file,
>> + struct inode *ano_inode)
>> +{
>> + struct file *ano_file;
>> +
>> + ano_file = alloc_file_pseudo(ano_inode, erofs_pcs_mnt, "[erofs_pcs_f]",
>> + O_RDONLY, &erofs_file_fops);
>> + file_ra_state_init(&ano_file->f_ra, file->f_mapping);
>> + ano_file->private_data = EROFS_I(file_inode(file));
>> + return ano_file;
>> +}
>> +
>
> ...
>
>> +
>> +/*
>> + * TODO: Amir, you've got some experience in this area due to overlayfs
>> + * and fuse. Does that work?
>> + */
>
>
>
> Hi Amir,
>
> I do think it will work; if you have a chance, please help
> take a quick look too.
>
> It's very similar to overlayfs; the difference is that the real
> inodes are not in some other fs, but are anon inodes from a pseudo
> sb which is shared across the whole host in order to share page
> cache among containers.
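The sharing model described above (one anon inode per fingerprint, shared
host-wide) can be sketched in userspace as a refcounted lookup table. This
is only an illustration under stated assumptions: every demo_* name is
hypothetical, the djb2-style string hash stands in for whatever hash the
patch uses (it includes <linux/xxhash.h>), and a plain counter without
locking stands in for refcount_t.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_BUCKETS 16

/* Hypothetical stand-in for an anon inode owning the shared page cache. */
struct demo_shared_inode {
	char *fingerprint;
	unsigned int refcount;
	struct demo_shared_inode *next;
};

static struct demo_shared_inode *demo_table[DEMO_BUCKETS];

/* djb2-style hash, an assumption made for the sketch only. */
static unsigned int demo_hash(const char *s)
{
	unsigned int h = 5381;

	while (*s)
		h = h * 33 + (unsigned char)*s++;
	return h % DEMO_BUCKETS;
}

/* Find the shared inode for a fingerprint, creating it on first use. */
struct demo_shared_inode *demo_get(const char *fprt)
{
	unsigned int b = demo_hash(fprt);
	struct demo_shared_inode *si;

	for (si = demo_table[b]; si; si = si->next) {
		if (strcmp(si->fingerprint, fprt) == 0) {
			si->refcount++;
			return si;
		}
	}

	si = malloc(sizeof(*si));
	if (!si)
		return NULL;
	si->fingerprint = malloc(strlen(fprt) + 1);
	if (!si->fingerprint) {
		free(si);
		return NULL;
	}
	strcpy(si->fingerprint, fprt);
	si->refcount = 1;
	si->next = demo_table[b];
	demo_table[b] = si;
	return si;
}

/* Drop one reference; the last user unlinks and frees the entry. */
void demo_put(struct demo_shared_inode *si)
{
	struct demo_shared_inode **pp;

	assert(si && si->refcount > 0);
	if (--si->refcount)
		return;

	for (pp = &demo_table[demo_hash(si->fingerprint)]; *pp; pp = &(*pp)->next) {
		if (*pp == si) {
			*pp = si->next;
			break;
		}
	}
	free(si->fingerprint);
	free(si);
}
```

Two files from different images that carry the same fingerprint get the
same demo_shared_inode back from demo_get(), which is the property that
lets them share one set of cached pages; a real implementation would also
need locking around the table.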
>
>> +static int erofs_pcs_mmap(struct file *file, struct vm_area_struct *vma)
>> +{
>> + struct file *ano_file = file->private_data;
>> +
>> + vma_set_file(vma, ano_file);
>> + vma->vm_ops = &generic_file_vm_ops;
>> + return 0;
>> +}
>> +
>> +const struct file_operations erofs_pcs_file_fops = {
>> + .open = erofs_pcs_file_open,
>> + /*
>> + * TODO: Why doesn't .llseek require similar treatment as
>> + * .read_iter?
>> + */
>
> I don't know the specific reason because it was written by
> Hongzhen.
>
> Hongzhen is still at work until the end of the month;
> I hope he can answer some questions.
>
>> + .llseek = generic_file_llseek,
>> + .read_iter = erofs_pcs_file_read_iter,
>> + .mmap = erofs_pcs_mmap,
>> + .release = erofs_pcs_file_release,
>> + .get_unmapped_area = thp_get_unmapped_area,
>> + .splice_read = filemap_splice_read,
>> +};
>> diff --git a/fs/erofs/pagecache_share.h b/fs/erofs/pagecache_share.h
>> new file mode 100644
>> index 000000000000..b8111291cf79
>> --- /dev/null
>> +++ b/fs/erofs/pagecache_share.h
>> @@ -0,0 +1,20 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +/*
>> + * Copyright (C) 2024, Alibaba Cloud
>> + */
>
>
> BTW, it seems that this header is too small; maybe just
> fold it into internal.h.
>
> Thanks,
> Gao Xiang
>
>> +#ifndef __EROFS_PAGECACHE_SHARE_H
>> +#define __EROFS_PAGECACHE_SHARE_H
>> +
>> +#include <linux/fs.h>
>> +#include <linux/mount.h>
>> +#include <linux/rwlock.h>
>> +#include <linux/mutex.h>
>> +#include "internal.h"
>> +
>> +int erofs_pcs_init_mnt(void);
>> +void erofs_pcs_free_mnt(void);
>> +void erofs_pcs_fill_inode(struct inode *inode);
>> +
>> +extern const struct vm_operations_struct generic_file_vm_ops;
>> +
>> +#endif
>>