[PATCH RFC 4/4] erofs: introduce .fadvise for page cache share
Gao Xiang
hsiangkao at linux.alibaba.com
Sat Jul 5 11:25:30 AEST 2025
On 2025/7/5 09:15, Hongzhen Luo wrote:
>
> On 2025/7/5 05:09, Gao Xiang wrote:
>>
>>
>> On 2025/7/3 20:23, Christian Brauner wrote:
>>> From: Hongzhen Luo <hongzhen at linux.alibaba.com>
>>>
>>> When using .fadvise to release a file's page cache, it frees page cache
>>> pages that were first read by this file. To achieve this, an interval
>>> tree is added in the inode of that file to track the segments first
>>> read by that inode.
>>>
>>> Signed-off-by: Hongzhen Luo <hongzhen at linux.alibaba.com>
>>> Link: https://lore.kernel.org/20240902110620.2202586-5-hongzhen@linux.alibaba.com
>>> Signed-off-by: Christian Brauner <brauner at kernel.org>
>>> ---
>>> fs/erofs/data.c | 38 ++++++++++++++++++++--
>>> fs/erofs/internal.h | 5 +++
>>> fs/erofs/pagecache_share.c | 81 ++++++++++++++++++++++++++++++++++++++++++++--
>>> fs/erofs/pagecache_share.h | 2 ++
>>> fs/erofs/super.c | 9 ++++++
>>> 5 files changed, 131 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/erofs/data.c b/fs/erofs/data.c
>>> index fb54162f4c54..61a42a95d26b 100644
>>> --- a/fs/erofs/data.c
>>> +++ b/fs/erofs/data.c
>>> @@ -7,6 +7,7 @@
>>> #include "internal.h"
>>> #include <linux/sched/mm.h>
>>> #include <trace/events/erofs.h>
>>> +#include "pagecache_share.h"
>>> void erofs_unmap_metabuf(struct erofs_buf *buf)
>>> {
>>> @@ -353,6 +354,7 @@ static int erofs_read_folio(struct file *file, struct folio *folio)
>>> {
>>> #ifdef CONFIG_EROFS_FS_PAGE_CACHE_SHARE
>>> struct erofs_inode *vi = NULL;
>>> + struct interval_tree_node *seg;
>>> int ret;
>>> if (file && file->private_data) {
>>> @@ -363,8 +365,22 @@ static int erofs_read_folio(struct file *file, struct folio *folio)
>>> vi = NULL;
>>> }
>>> ret = iomap_read_folio(folio, &erofs_iomap_ops);
>>> - if (vi)
>>> + if (vi) {
>>> folio->mapping->host = file_inode(file);
>>> + seg = erofs_pcs_alloc_seg();
>>> + if (!seg)
>>> + return -ENOMEM;
>>> + seg->start = folio->index;
>>> + seg->last = seg->start + (folio_size(folio) >> PAGE_SHIFT);
>>> + if (seg->last > (vi->vfs_inode.i_size >> PAGE_SHIFT))
>>> + seg->last = vi->vfs_inode.i_size >> PAGE_SHIFT;
>>> + if (seg->last >= seg->start) {
>>> + mutex_lock(&vi->segs_mutex);
>>> + interval_tree_insert(seg, &vi->segs);
>>> + mutex_unlock(&vi->segs_mutex);
>>> + } else
>>> + erofs_pcs_free_seg(seg);
>>> + }
>>
>> I'm not sure what Hongzhen is trying to do in this patch; it looks
>> odd to me. The patch may need to be reimplemented later, but we
>> should still support .fadvise().
>
> The original approach aimed to maintain a first-read interval tree per inode,
> ensuring that .fadvise would only release cached pages within its own mapped
> ranges, thereby preventing interference with other file operations. However,
> this introduced unnecessary complexity. The latest patch series adopts
> overlayfs-style handling:
> https://lore.kernel.org/all/20250301145002.2420830-8-hongzhen@linux.alibaba.com/
Yes, that patch makes more sense to me, since the mm
code will handle it as page cache ops.
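
For context, here is a minimal userspace sketch of the kind of call that
ends up in a filesystem's .fadvise handler (or in the generic mm path when
the filesystem does not provide one). The helper names drop_page_cache_fd()
and drop_page_cache() are hypothetical and are not part of either patch
series; only posix_fadvise() and POSIX_FADV_DONTNEED are real interfaces.

```c
/* Sketch only: userspace view of dropping a file's cached pages. */
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): ask the kernel to drop the
 * cached pages of an already-open file. offset 0 with len 0 covers the
 * whole file. posix_fadvise() returns 0 or an error number directly;
 * it does not set errno.
 */
static int drop_page_cache_fd(int fd)
{
	return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}

/*
 * Hypothetical helper: same as above, but takes a path.
 * Returns 0 on success or an errno-style error number.
 */
static int drop_page_cache(const char *path)
{
	int fd, err;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return errno;

	err = drop_page_cache_fd(fd);
	close(fd);
	return err;
}
```

Note that POSIX_FADV_DONTNEED is only a hint, so a zero return does not
guarantee that every page was actually released.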
Thanks,
Gao Xiang