[PATCH v2] erofs: add a global page pool for lz4 decompression
Gao Xiang
hsiangkao at linux.alibaba.com
Mon Jan 22 15:41:17 AEDT 2024
Hi Chunhai,
On 2024/1/12 09:58, Chunhai Guo wrote:
> On 2024/1/10 14:45, Chunhai Guo wrote:
>> On 2024/1/9 21:08, Gao Xiang wrote:
>>>
>>> Hi Chunhai,
>>>
>>> On 2024/1/9 15:41, Chunhai Guo wrote:
>>>> Using a global page pool for LZ4 decompression significantly reduces
>>>> the time spent on page allocation in low memory scenarios.
>>>>
>>>> The table below shows the reduction in time spent on page allocation
>>>> for LZ4 decompression when using a global page pool. The results were
>>>> obtained from multi-app launch benchmarks on ARM64 Android devices
>>>> running the 5.15 kernel with an 8-core CPU and 8GB of memory. In the
>>>> benchmark, we launched 16 frequently-used apps, and the camera app was
>>>> the last one in each round. The data in the table is the average time
>>>> for the camera app in each round.
>>>>
>>>> After using the page pool, there was an average improvement of 150ms
>>>> in the launch time of the camera app, which was obtained from the
>>>> systrace log.
>>>>
>>>> +--------------+---------------+--------------+---------+
>>>> |              | w/o page pool | w/ page pool |  diff   |
>>>> +--------------+---------------+--------------+---------+
>>>> | Average (ms) |     3434      |      21      | -99.38% |
>>>> +--------------+---------------+--------------+---------+
>>>>
>>>> Based on the benchmark logs, 64 pages are sufficient for 95% of
>>>> scenarios. This value can be adjusted from the module parameter. The
>>>> default value is 0.
>>>>
>>>> This patch currently only supports the LZ4 decompressor; other
>>>> decompressors will be supported in the next step.
>>>>
>>>> Signed-off-by: Chunhai Guo <guochunhai at vivo.com>
>>>
>>> This patch looks good to me, yet we're in the merge window for v6.8.
>>> I will address it after -rc1 is out since no stable tag these days.
>>>
>>> Also it would be better to add some results of changing max_distance
>>> if you have more time to test.
>>
>> OK. I will reply to this email when the experiment is finished.
>
> Dear Xiang,
>
> The experiment is done, and the table below shows the results. A 16k
> sliding window reduces the time spent on page allocation for LZ4
> decompression by 38.2% compared to a 64k sliding window. However, using
> a global page pool is still far better than either of them.
>
> +--------------+---------------+--------------+---------+
> |              |  64k window   |  16k window  |  diff   |
> +--------------+---------------+--------------+---------+
> | Average (ms) |     3364      |     2079     | -38.2%  |
> +--------------+---------------+--------------+---------+
>
> Thanks,

Let's rebase this onto
commit ("erofs: relaxed temporary buffers allocation on readahead").

I will merge these after the rebased patch is received.

Thanks,
Gao Xiang