[PATCH v2] erofs: add a global page pool for lz4 decompression

Gao Xiang hsiangkao at linux.alibaba.com
Mon Jan 22 15:41:17 AEDT 2024


Hi Chunhai,

On 2024/1/12 09:58, Chunhai Guo wrote:
> On 2024/1/10 14:45, Chunhai Guo wrote:
>> On 2024/1/9 21:08, Gao Xiang wrote:
>>>
>>> Hi Chunhai,
>>>
>>> On 2024/1/9 15:41, Chunhai Guo wrote:
>>>> Using a global page pool for LZ4 decompression significantly reduces
>>>> the time spent on page allocation in low memory scenarios.
>>>>
>>>> The table below shows the reduction in time spent on page allocation
>>>> for LZ4 decompression when using a global page pool.  The results were
>>>> obtained from multi-app launch benchmarks on ARM64 Android devices
>>>> running the 5.15 kernel with an 8-core CPU and 8GB of memory. In the
>>>> benchmark, we launched 16 frequently-used apps, and the camera app was
>>>> the last one in each round. The data in the table is the average time
>>>> for the camera app in each round.
>>>> After using the page pool, there was an average improvement of 150ms in
>>>> the launch time of the camera app, as measured from the systrace log.
>>>> +--------------+---------------+--------------+---------+
>>>> |              | w/o page pool | w/ page pool |  diff   |
>>>> +--------------+---------------+--------------+---------+
>>>> | Average (ms) |     3434      |      21      | -99.38% |
>>>> +--------------+---------------+--------------+---------+
>>>>
>>>> Based on the benchmark logs, 64 pages are sufficient for 95% of
>>>> scenarios. This value can be adjusted through a module parameter. The
>>>> default value is 0.
>>>>
>>>> This patch currently supports only the LZ4 decompressor; other
>>>> decompressors will be supported in the next step.
>>>>
>>>> Signed-off-by: Chunhai Guo <guochunhai at vivo.com>
>>>
>>> This patch looks good to me, yet we're in the merge window for v6.8.
>>> I will address it after -rc1 is out since no stable tag these days.
>>>
>>> Also it would be better to add some results of changing max_distance
>>> if you have more time to test.
>>
>> OK. I will reply to this email when the experiment is finished.
> 
> Dear Xiang,
> 
> The experiment is done and the table below shows the results. A 16k
> sliding window reduces the time spent on page allocation for LZ4
> decompression by 38.2% compared to a 64k sliding window. However, using
> a global page pool is still far better than either of them.
> 
> +--------------+---------------+--------------+---------+
> |              |   64k window  |  16k window  |  diff   |
> +--------------+---------------+--------------+---------+
> | Average (ms) |     3364      |      2079    | -38.2%  |
> +--------------+---------------+--------------+---------+
> 
> Thanks,

Let's rebase this onto
commit ("erofs: relaxed temporary buffers allocation on readahead")

I will merge these after the rebase patch is received.

Thanks,
Gao Xiang
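
The global page pool idea discussed in this thread can be illustrated with
a minimal userspace sketch. It is not the code from the patch: every name
here (pool_init, pool_get_page, pool_put_page, pool_limit,
SKETCH_PAGE_SIZE) is hypothetical, malloc() stands in for the kernel page
allocator, and the sketch is single-threaded, whereas a real kernel pool
would need locking. It only shows the general pattern: take a page from a
pre-filled pool on the fast path, and fall back to the allocator when the
pool is empty.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, for illustration only */

/* A free page doubles as a list node while it sits in the pool. */
struct pool_page {
	struct pool_page *next;
};

static struct pool_page *pool_head;	/* singly linked list of free pages */
static unsigned int pool_count;		/* pages currently held in the pool */
static unsigned int pool_limit;		/* hypothetical tunable, e.g. 64 */

/* Reserve up to nr pages in advance; returns how many the pool now holds. */
static unsigned int pool_init(unsigned int nr)
{
	pool_limit = nr;
	while (pool_count < nr) {
		struct pool_page *p = malloc(SKETCH_PAGE_SIZE);

		if (!p)
			break;
		p->next = pool_head;
		pool_head = p;
		pool_count++;
	}
	return pool_count;
}

/*
 * Fast path: pop a page from the pool.
 * Slow path: the pool is empty, so fall back to the allocator.
 * *from_pool reports which path was taken.
 */
static void *pool_get_page(int *from_pool)
{
	struct pool_page *p = pool_head;

	if (p) {
		pool_head = p->next;
		pool_count--;
		*from_pool = 1;
		return p;
	}
	*from_pool = 0;
	return malloc(SKETCH_PAGE_SIZE);
}

/* Refill the pool while it is below its limit; otherwise free the page. */
static void pool_put_page(void *page)
{
	struct pool_page *p = page;

	assert(page != NULL);
	if (pool_count < pool_limit) {
		p->next = pool_head;
		pool_head = p;
		pool_count++;
	} else {
		free(p);
	}
}
```

With pool_init(64), the first 64 concurrent requests are served without
touching the allocator; only requests beyond that take the slow path. With
a limit of 0 the pool stays empty and every request goes to the allocator.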

