[PATCH 00/16] Remove hash page table slot tracking from linux PTE
Aneesh Kumar K.V
aneesh.kumar at linux.vnet.ibm.com
Fri Oct 27 16:27:13 AEDT 2017
On 10/27/2017 10:04 AM, Paul Mackerras wrote:
> On Fri, Oct 27, 2017 at 09:38:17AM +0530, Aneesh Kumar K.V wrote:
>> Hi,
>>
>> With hash translation mode we have always tracked the hash PTE slot details in the Linux page table.
>> This occupies space in the Linux page table and also limits our ability to support
>> Linux features that require additional PTE bits. This series attempts to lift this
>> limitation by not tracking the slot number in the Linux page table. We still track slot details
>> for Transparent Hugepage (THP) entries because an invalidate there requires us to go through
>> all 256 hash PTE slots, so tracking whether a hash page table entry is valid helps us
>> avoid a lot of hcalls there. With THP entries we don't keep slot details in the primary
>> Linux page table entry but in the second half of the page table. Hence tracking slot details
>> for THP doesn't take up space in the PTE.
>>
>> Even though we no longer track the slot, the PAPR hcalls for removing/updating a hash page table
>> entry expect hash page table slot details. On pseries we find the slot with the H_READ hcall, using the H_READ_4 flag.
>> This implies an additional 2 hcalls in the updatepp and remove paths. The patch series also
>> attempts to limit the impact of this by adding new hcalls that remove/update a hash page table
>> entry using the hash value instead of the hash page table slot.
>>
>> Below are the performance numbers observed when running a workload that performs the following sequence:
>>
>> for (5000 iterations) {
>>     mmap(128M)
>>     touch each of the 2048 pages
>>     munmap()
>> }
>>
>> The test is run with address randomization off and swap disabled in both host and guest.
>>
>>
>> |------------+----------+---------------+--------------------------+-----------------------|
>> | iterations | platform | without patch | With series and no hcall | With series and hcall |
>> |------------+----------+---------------+--------------------------+-----------------------|
>> |          1 | powernv  |               |                50.818343 |                       |
>> |          2 | powernv  |               |                50.744123 |                       |
>> |          3 | powernv  |               |                50.721603 |                       |
>> |          4 | powernv  |               |                50.739922 |                       |
>> |          5 | powernv  |               |                50.638555 |                       |
>> |          1 | powernv  |     51.388249 |                          |                       |
>> |          2 | powernv  |     51.789701 |                          |                       |
>> |          3 | powernv  |     52.240394 |                          |                       |
>> |          4 | powernv  |     51.432255 |                          |                       |
>> |          5 | powernv  |     51.392947 |                          |                       |
>> |------------+----------+---------------+--------------------------+-----------------------|
>> |          1 | pseries  |               |                          |            123.154394 |
>> |          2 | pseries  |               |                          |            122.253956 |
>> |          3 | pseries  |               |                          |            117.666344 |
>> |          4 | pseries  |               |                          |            117.681479 |
>> |          5 | pseries  |               |                          |            117.735808 |
>> |          1 | pseries  |               |               119.424940 |                       |
>> |          2 | pseries  |               |               117.663078 |                       |
>> |          3 | pseries  |               |               118.345584 |                       |
>> |          4 | pseries  |               |               119.620934 |                       |
>> |          5 | pseries  |               |               119.463185 |                       |
>> |          1 | pseries  |    122.810867 |                          |                       |
>> |          2 | pseries  |    115.760801 |                          |                       |
>> |          3 | pseries  |    115.257030 |                          |                       |
>> |          4 | pseries  |    116.617884 |                          |                       |
>> |          5 | pseries  |    117.247036 |                          |                       |
>> |------------+----------+---------------+--------------------------+-----------------------|
>>
>
> How do we interpret these numbers? Are they times, or speed? Is
> larger better or worse?
Sorry for not including the details. They are times in seconds, so lower
is better. The test case is a modified mmap_bench from the powerpc selftests.
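
For readers who want to see the shape of the workload, here is a minimal sketch in Python of the map/touch/unmap loop described in the quoted text. It is not the modified mmap_bench itself: the function name run_mmap_workload, its parameters, and the use of Python's default anonymous mapping are all assumptions made for illustration. Note that 128M split into 2048 pages corresponds to a 64K page size.

```python
import mmap
import time


def run_mmap_workload(iterations=5000, size=128 * 1024 * 1024,
                      page_size=mmap.PAGESIZE):
    """Map, touch and unmap an anonymous region `iterations` times.

    Returns (elapsed_seconds, pages_touched).
    """
    pages_touched = 0
    start = time.perf_counter()
    for _ in range(iterations):
        # Anonymous mapping (fileno -1). The mmap flags used by the real
        # benchmark are not given in this thread, so this is an assumption.
        region = mmap.mmap(-1, size)
        # Write one byte per page so that every page is faulted in.
        for offset in range(0, size, page_size):
            region[offset] = 1
            pages_touched += 1
        # Closing the mmap object unmaps the region.
        region.close()
    elapsed = time.perf_counter() - start
    return elapsed, pages_touched
```

Calling run_mmap_workload() with its defaults mirrors the 5000 x 128M sequence; passing page_size=64 * 1024 touches 2048 pages per iteration regardless of the host page size.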
>
> Can you give us the mean and standard deviation for each set of 5
> please?
>
powernv, without patch
    median = 51.432255
    stdev  = 0.370835
powernv, with patch
    median = 50.739922
    stdev  = 0.06419662
pseries, without patch
    median = 116.617884
    stdev  = 3.04531023
pseries, with patch, no hcall
    median = 119.42494
    stdev  = 0.85874552
pseries, with patch and hcall
    median = 117.735808
    stdev  = 2.7624151
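
The request was for the mean, while the figures above give the median. Both can be recomputed from the table with a short script. The sketch below uses Python's statistics module; the set labels, RUNS, SUMMARY and summarize are names chosen here for illustration. It assumes the sample (n - 1) standard deviation, which appears consistent with the stdev values reported above.

```python
import statistics

# Elapsed times in seconds, copied from the table earlier in the thread.
RUNS = {
    "powernv, without patch": [51.388249, 51.789701, 52.240394, 51.432255, 51.392947],
    "powernv, with patch": [50.818343, 50.744123, 50.721603, 50.739922, 50.638555],
    "pseries, without patch": [122.810867, 115.760801, 115.257030, 116.617884, 117.247036],
    "pseries, with patch, no hcall": [119.424940, 117.663078, 118.345584, 119.620934, 119.463185],
    "pseries, with patch and hcall": [123.154394, 122.253956, 117.666344, 117.681479, 117.735808],
}


def summarize(samples):
    """Return mean, median and sample standard deviation of one set of runs."""
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "stdev": statistics.stdev(samples),  # sample stdev, divides by n - 1
    }


SUMMARY = {name: summarize(samples) for name, samples in RUNS.items()}

for name, stats in SUMMARY.items():
    print(f"{name}: mean={stats['mean']:.6f} "
          f"median={stats['median']:.6f} stdev={stats['stdev']:.6f}")
```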
-aneesh