[PATCH v2] erofs: relaxed temporary buffers allocation on readahead

Fri Jan 26 13:47:18 AEDT 2024

On 2024/1/26 10:41, Chunhai Guo wrote:
> On 2024/1/22 15:42, Chunhai Guo wrote:
>> On 2024/1/22 12:37, Gao Xiang wrote:
>>> [You don't often get email from hsiangkao at linux.alibaba.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification]
>>>
>>> On 2024/1/22 11:49, Chunhai Guo wrote:
>>>> On 2024/1/22 10:07, Gao Xiang wrote:
>>>>>
>>>>> On 2024/1/20 22:55, Chunhai Guo wrote:
>>>>>> Even with inplace decompression, sometimes extra temporary buffers are
>>>>>> still needed for decompression.  In low-memory scenarios, it would be
>>>>>> better to try to allocate with GFP_NOWAIT on readahead first. That can
>>>>>> help reduce the time spent on page allocation under memory pressure.
>>>>>>
>>>>>> There is an average reduction of 21% in page allocation time under
>>>>> It would be better to add a table to show the absolute numbers too
>>>>> (like what you did in the global pool commit).  If possible, there
>>>>> is no need to send an updated version for this; just reply with the
>>>>> updated commit message and I will update the commit manually.
>>>> The table below shows detailed numbers. The reduction I mentioned before
>>>> was not accurate enough. Please help correct the improvement from 21% to
>>>> 20.21%.
>>>>
>>>>
>>>> +--------------+----------------+---------------+---------+
>>>> |              | w/o GFP_NOWAIT | w/ GFP_NOWAIT |  diff   |
>>>> +--------------+----------------+---------------+---------+
>>>> | Average (ms) |     3364       |      2684     | -20.21% |
>>>> +--------------+----------------+---------------+---------+
>>> Was it tested without the 16k sliding window change?
>>> https://lore.kernel.org/linux-erofs/69711d55-f7a2-420b-9ba8-fa2921f66a4c@vivo.com
>> The result was tested with the 64k sliding window.
>>
>>> Could you benchmark these two optimizations together to
>>> show the extreme optimized case without a global pool?
>>> With a new table if possible? I will add this to
>>> the commit message too.
>>
>> OK. I will reply to this email when the benchmark is finished.
> 
> The benchmark has been completed and the table below shows that there is
> an average 52.14% reduction in page allocation time with these two
> optimizations.
> 
> +--------------+----------------+---------------+---------+
> |              |   64k window   |   16k window  |         |
> |              | w/o GFP_NOWAIT | w/ GFP_NOWAIT |  diff   |
> +--------------+----------------+---------------+---------+
> | Average (ms) |     3364       |      1610     | -52.14% |
> +--------------+----------------+---------------+---------+
> 
> The table below summarizes the results of these three benchmarks.
> 
> +--------------+----------------+----------------+---------------+---------------+
> |              |   64k window   |   16k window   |   64k window  |   16k window  |
> |              | w/o GFP_NOWAIT | w/o GFP_NOWAIT | w/ GFP_NOWAIT | w/ GFP_NOWAIT |
> +--------------+----------------+----------------+---------------+---------------+
> | Average (ms) |     3364       |      2079      |      2684     |      1610     |
> +--------------+----------------+----------------+---------------+---------------+
> |     diff     |                |     -38.19%    |     -20.21%   |     -52.14%   |
> +--------------+----------------+----------------+---------------+---------------+

The tables show up as a mess; could you just list the
numbers so I can refine this?

Thanks,
Gao Xiang