[PATCH RFC 4/4] erofs: introduce .fadvise for page cache share

Hongzhen Luo hongzhen at linux.alibaba.com
Sat Jul 5 11:15:27 AEST 2025


On 2025/7/5 05:09, Gao Xiang wrote:
>
>
> On 2025/7/3 20:23, Christian Brauner wrote:
>> From: Hongzhen Luo <hongzhen at linux.alibaba.com>
>>
>> When using .fadvise to release a file's page cache, it frees page cache
>> pages that were first read by this file. To achieve this, an interval
>> tree is added in the inode of that file to track the segments first
>> read by that inode.
>>
>> Signed-off-by: Hongzhen Luo <hongzhen at linux.alibaba.com>
>> Link: https://lore.kernel.org/20240902110620.2202586-5-hongzhen@linux.alibaba.com
>> Signed-off-by: Christian Brauner <brauner at kernel.org>
>> ---
>>   fs/erofs/data.c            | 38 ++++++++++++++++++++--
>>   fs/erofs/internal.h        |  5 +++
>>   fs/erofs/pagecache_share.c | 81 ++++++++++++++++++++++++++++++++++++++++++++--
>>   fs/erofs/pagecache_share.h |  2 ++
>>   fs/erofs/super.c           |  9 ++++++
>>   5 files changed, 131 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/erofs/data.c b/fs/erofs/data.c
>> index fb54162f4c54..61a42a95d26b 100644
>> --- a/fs/erofs/data.c
>> +++ b/fs/erofs/data.c
>> @@ -7,6 +7,7 @@
>>   #include "internal.h"
>>   #include <linux/sched/mm.h>
>>   #include <trace/events/erofs.h>
>> +#include "pagecache_share.h"
>>
>>   void erofs_unmap_metabuf(struct erofs_buf *buf)
>>   {
>> @@ -353,6 +354,7 @@ static int erofs_read_folio(struct file *file, struct folio *folio)
>>   {
>>   #ifdef CONFIG_EROFS_FS_PAGE_CACHE_SHARE
>>       struct erofs_inode *vi = NULL;
>> +    struct interval_tree_node *seg;
>>       int ret;
>>
>>       if (file && file->private_data) {
>> @@ -363,8 +365,22 @@ static int erofs_read_folio(struct file *file, struct folio *folio)
>>               vi = NULL;
>>       }
>>       ret = iomap_read_folio(folio, &erofs_iomap_ops);
>> -    if (vi)
>> +    if (vi) {
>>           folio->mapping->host = file_inode(file);
>> +        seg = erofs_pcs_alloc_seg();
>> +        if (!seg)
>> +            return -ENOMEM;
>> +        seg->start = folio->index;
>> +        seg->last = seg->start + (folio_size(folio) >> PAGE_SHIFT);
>> +        if (seg->last > (vi->vfs_inode.i_size >> PAGE_SHIFT))
>> +            seg->last = vi->vfs_inode.i_size >> PAGE_SHIFT;
>> +        if (seg->last >= seg->start) {
>> +            mutex_lock(&vi->segs_mutex);
>> +            interval_tree_insert(seg, &vi->segs);
>> +            mutex_unlock(&vi->segs_mutex);
>> +        } else
>> +            erofs_pcs_free_seg(seg);
>> +    }
>
> I don't know what Hongzhen is trying to do in this patch and
> it seems too odd on my side, maybe it needs to reimplement
> this patch later but we should support .fadvise().

The original approach maintained a per-inode interval tree of first-read
ranges, so that .fadvise would only release the cached pages within that
inode's own mapped ranges and would not interfere with other file
operations. However, this introduced unnecessary complexity. The latest
patch series adopts overlayfs-style handling instead:
https://lore.kernel.org/all/20250301145002.2420830-8-hongzhen@linux.alibaba.com/
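
For readers unfamiliar with the original idea, here is a rough user-space
sketch of the first-read tracking. It is not the kernel code from the
patch: the names seg_tracker, seg_track, seg_droppable and seg_free_all
are hypothetical, a linked list stands in for the kernel interval tree,
and recorded segments are assumed not to overlap one another.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-inode segment tree. A linked list
 * keeps the sketch short; the patch itself uses an interval tree. */
struct seg {
    unsigned long start;   /* first page index, inclusive */
    unsigned long last;    /* last page index, inclusive */
    struct seg *next;
};

struct seg_tracker {
    struct seg *head;
};

/* Record that this inode was the first reader of pages [start, last]. */
static int seg_track(struct seg_tracker *t,
                     unsigned long start, unsigned long last)
{
    struct seg *s;

    assert(start <= last);
    s = malloc(sizeof(*s));
    if (!s)
        return -1;
    s->start = start;
    s->last = last;
    s->next = t->head;
    t->head = s;
    return 0;
}

/* Count the pages in [start, last] that this inode first read, i.e. the
 * pages a DONTNEED-style request on this inode would be allowed to drop.
 * Assumes the recorded segments do not overlap each other. */
static unsigned long seg_droppable(const struct seg_tracker *t,
                                   unsigned long start, unsigned long last)
{
    unsigned long n = 0;
    const struct seg *s;

    for (s = t->head; s; s = s->next) {
        unsigned long lo = s->start > start ? s->start : start;
        unsigned long hi = s->last < last ? s->last : last;

        if (lo <= hi)
            n += hi - lo + 1;
    }
    return n;
}

/* Release every recorded segment. */
static void seg_free_all(struct seg_tracker *t)
{
    while (t->head) {
        struct seg *s = t->head;

        t->head = s->next;
        free(s);
    }
}
```

In this sketch, a request covering pages this inode never first read
matches no segment and drops nothing. Keeping that bookkeeping correct
on every read is the complexity mentioned above.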

Thanks,
Hongzhen

>
> Thanks,
> Gao Xiang
>
>>
>

