[PATCH RFC 4/4] erofs: introduce .fadvise for page cache share

Gao Xiang hsiangkao at linux.alibaba.com
Sat Jul 5 11:25:30 AEST 2025



On 2025/7/5 09:15, Hongzhen Luo wrote:
> 
> On 2025/7/5 05:09, Gao Xiang wrote:
>>
>>
>> On 2025/7/3 20:23, Christian Brauner wrote:
>>> From: Hongzhen Luo <hongzhen at linux.alibaba.com>
>>>
>>> When using .fadvise to release a file's page cache, it frees page cache
>>> pages that were first read by this file. To achieve this, an interval
>>> tree is added in the inode of that file to track the segments first
>>> read by that inode.
>>>
>>> Signed-off-by: Hongzhen Luo <hongzhen at linux.alibaba.com>
>>> Link: https://lore.kernel.org/20240902110620.2202586-5-hongzhen@linux.alibaba.com
>>> Signed-off-by: Christian Brauner <brauner at kernel.org>
>>> ---
>>>   fs/erofs/data.c            | 38 ++++++++++++++++++++--
>>>   fs/erofs/internal.h        |  5 +++
>>>   fs/erofs/pagecache_share.c | 81 ++++++++++++++++++++++++++++++++++++++++++++--
>>>   fs/erofs/pagecache_share.h |  2 ++
>>>   fs/erofs/super.c           |  9 ++++++
>>>   5 files changed, 131 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/erofs/data.c b/fs/erofs/data.c
>>> index fb54162f4c54..61a42a95d26b 100644
>>> --- a/fs/erofs/data.c
>>> +++ b/fs/erofs/data.c
>>> @@ -7,6 +7,7 @@
>>>   #include "internal.h"
>>>   #include <linux/sched/mm.h>
>>>   #include <trace/events/erofs.h>
>>> +#include "pagecache_share.h"
>>>
>>>   void erofs_unmap_metabuf(struct erofs_buf *buf)
>>>   {
>>> @@ -353,6 +354,7 @@ static int erofs_read_folio(struct file *file, struct folio *folio)
>>>   {
>>>   #ifdef CONFIG_EROFS_FS_PAGE_CACHE_SHARE
>>>       struct erofs_inode *vi = NULL;
>>> +    struct interval_tree_node *seg;
>>>       int ret;
>>>
>>>       if (file && file->private_data) {
>>> @@ -363,8 +365,22 @@ static int erofs_read_folio(struct file *file, struct folio *folio)
>>>               vi = NULL;
>>>       }
>>>       ret = iomap_read_folio(folio, &erofs_iomap_ops);
>>> -    if (vi)
>>> +    if (vi) {
>>>           folio->mapping->host = file_inode(file);
>>> +        seg = erofs_pcs_alloc_seg();
>>> +        if (!seg)
>>> +            return -ENOMEM;
>>> +        seg->start = folio->index;
>>> +        seg->last = seg->start + (folio_size(folio) >> PAGE_SHIFT);
>>> +        if (seg->last > (vi->vfs_inode.i_size >> PAGE_SHIFT))
>>> +            seg->last = vi->vfs_inode.i_size >> PAGE_SHIFT;
>>> +        if (seg->last >= seg->start) {
>>> +            mutex_lock(&vi->segs_mutex);
>>> +            interval_tree_insert(seg, &vi->segs);
>>> +            mutex_unlock(&vi->segs_mutex);
>>> +        } else
>>> +            erofs_pcs_free_seg(seg);
>>> +    }
>>
>> I don't know what Hongzhen is trying to do in this patch, and
>> it seems too odd on my side. Maybe this patch needs to be
>> reimplemented later, but we should support .fadvise().
> 
> The original approach aimed to maintain a first-read interval tree per inode, ensuring
> that .fadvise would only release cached pages within its own mapped ranges, thereby
> preventing interference with other file operations. However, this introduced unnecessary
> complexity. The latest patch series adopts overlayfs-style handling:
> https://lore.kernel.org/all/20250301145002.2420830-8-hongzhen@linux.alibaba.com/

Yes, that patch makes more sense to me since the mm
code will handle it as page cache ops.

Thanks,
Gao Xiang
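
For readers following the thread, here is a minimal userspace sketch of the
request that a filesystem's .fadvise hook ends up servicing: asking the kernel
to drop a file's cached pages with posix_fadvise(POSIX_FADV_DONTNEED). The
helper name drop_file_cache is hypothetical and does not come from the patch.
Which pages erofs actually releases for a shared page cache depends on the
series that gets merged, so this only shows the caller's side.

```c
#define _POSIX_C_SOURCE 200112L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): ask the kernel to drop the
 * cached pages of the whole file behind fd.
 * Returns 0 on success or a positive errno value on failure.
 */
static int drop_file_cache(int fd)
{
	/* Dirty pages are not dropped by DONTNEED, so flush them first. */
	if (fsync(fd) != 0)
		return errno;

	/*
	 * offset = 0 and len = 0 cover the file from start to end.
	 * posix_fadvise() returns the error number directly and does
	 * not set errno.
	 */
	return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

A caller would open the file, read it as usual, and call
drop_file_cache(fd) once the data is no longer needed. The advice is a hint:
the kernel may keep pages that are still mapped or in use elsewhere.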

