[PATCH v15 5/9] erofs: introduce the page cache share feature

Mon Jan 19 20:53:33 AEDT 2026

On 2026/1/19 17:38, Gao Xiang wrote:
> 
> 
> On 2026/1/19 17:22, Christoph Hellwig wrote:
>> On Mon, Jan 19, 2026 at 04:52:54PM +0800, Gao Xiang wrote:
>>>> To me this sounds pretty scary, as we have code in the kernel's trust
>>>> domain that heavily depends on arbitrary userspace policy decisions.
>>>
>>> For example, overlayfs metacopy can also point to
>>> arbitrary files, what's the difference between them?
>>> https://docs.kernel.org/filesystems/overlayfs.html#metadata-only-copy-up
>>>
>>> By using metacopy, overlayfs can access arbitrary files
>>> as long as the metacopy has the pointer, so it should
>>> be a privileged operation, which is similar to this feature.
>>
>> Sounds scary too.  But overlayfs' job is to combine underlying files, so
>> it is expected.  I think it's the mix of erofs being a disk based file
> 
> But you still could point to an arbitrary page cache
> if metacopy is used.
> 
>> system, and reaching out beyond the device(s) assigned to the file system
>> instance that makes me feel rather uneasy.
> 
> You mean the page cache can be shared from other
> filesystems even not backed by these devices/files?
> 
> I admit yes, there could be a difference: but that
> is why the new "inode_share" and "domain_id" mount
> options are used.
> 
> I think they should be regarded as a single super
> filesystem if "domain_id" is the same: From the
> security perspective much like subvolumes of
> a single super filesystem.
> 
> And mounting a new filesystem within a "domain_id"
> can be regarded as importing data into the super
> "domain_id" filesystem, and I think only trusted
> data within the single domain can be mounted/shared.
> 
>>
>>>>
>>>> Similarly the sharing of blocks between different file system
>>>> instances opens a lot of questions about trust boundaries and life
>>>> time rules.  I don't really have good answers, but writing up the
>>>
>>> Could you give more details about these? Since you
>>> raised the questions but I have no idea what the threats
>>> really come from.
>>
>> Right now by default we don't allow any unprivileged mounts.  Now
>> if people think that, say, erofs is safe enough and opt into that,
>> it needs to be clear what the boundaries of that are.  For a file
>> system limited to a single block device those boundaries are
>> pretty clear.  For file systems reaching out to the entire system
>> (or some kind of domain), the scope is much wider.

btw, I think it would indeed be helpful to spell out the boundaries
(even from on-disk formats and runtime features).

But I have to clarify that a single EROFS filesystem instance won't
have access to random block devices or files.

The backing device or files are specified by users explicitly when
mounting, like:

  mount -odevice=blob1,device=blob2,...,device=blobn-1 blob0 mnt

And these devices / files are all opened at mount time, and nothing
more is opened after that.
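
To make the "explicit, fixed list" point concrete, here is a rough
sketch of how such a mount line is assembled.  The helper name
erofs_device_opts is hypothetical (it is not part of erofs-utils or
the kernel), and the blob names are placeholders; the only thing
taken from above is the device=... option syntax.

```shell
# Hypothetical helper: turn an explicit list of extra blobs into the
# "device=a,device=b,..." option string used on the mount line above.
erofs_device_opts() {
    opts=""
    for blob in "$@"; do
        # Append ",device=<blob>", without a leading comma for the first one.
        opts="${opts:+$opts,}device=$blob"
    done
    printf '%s\n' "$opts"
}

# Dry run: print the mount command instead of executing it.
# blob0 is the primary image; blob1 and blob2 are the extra devices.
echo mount -o "$(erofs_device_opts blob1 blob2)" blob0 mnt
```

The set of backing devices / files is exactly the argument list given
here; nothing in this sketch lets the image itself name further ones.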

May I ask what the difference is between one device/file and a group
of given devices/files? Especially for immutable usage.

Thanks,
Gao Xiang