[PATCH] powerpc/mm: use _raw variant of page table accessors
Ram Pai
linuxram at us.ibm.com
Thu Jun 2 08:13:11 AEST 2016
On Tue, May 31, 2016 at 04:29:42PM +0530, Aneesh Kumar K.V wrote:
> This switches a few of the page table accessors to use the _raw variants
> and does the cpu to big endian conversion on the constants instead. This
> helps in generating better code.
>
> For example, a pgd_none(pgd) check with and without the fix is listed below:
>
> Without fix:
> ------------
> 2240: 20 00 61 eb ld r27,32(r1)
> /* PGD level */
> typedef struct { __be64 pgd; } pgd_t;
> static inline unsigned long pgd_val(pgd_t x)
> {
> return be64_to_cpu(x.pgd);
>
> 2244: 22 00 66 78 rldicl r6,r3,32,32
> 2248: 3e 40 7d 54 rotlwi r29,r3,8
> 224c: 0e c0 7d 50 rlwimi r29,r3,24,0,7
> 2250: 3e 40 c5 54 rotlwi r5,r6,8
> 2254: 2e c4 7d 50 rlwimi r29,r3,24,16,23
> 2258: 0e c0 c5 50 rlwimi r5,r6,24,0,7
> 225c: 2e c4 c5 50 rlwimi r5,r6,24,16,23
> 2260: c6 07 bd 7b rldicr r29,r29,32,31
> 2264: 78 2b bd 7f or r29,r29,r5
> if (pgd_none(pgd))
> 2268: 00 00 bd 2f cmpdi cr7,r29,0
> 226c: 54 03 9e 41 beq cr7,25c0 <__get_user_pages_fast+0x500>
>
> With fix:
> ---------
> 2370: 20 00 61 eb ld r27,32(r1)
> if (pgd_none(pgd))
> 2374: 00 00 bd 2f cmpdi cr7,r29,0
> 2378: a8 03 9e 41 beq cr7,2720 <__get_user_pages_fast+0x530>
> break;
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.vnet.ibm.com>
> ---
> arch/powerpc/include/asm/book3s/64/pgtable-4k.h | 6 +-
> arch/powerpc/include/asm/book3s/64/pgtable-64k.h | 6 +-
> arch/powerpc/include/asm/book3s/64/pgtable.h | 99 +++++++++++++++++-------
> arch/powerpc/include/asm/pgtable-be-types.h | 15 ++++
> 4 files changed, 91 insertions(+), 35 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable-4k.h b/arch/powerpc/include/asm/book3s/64/pgtable-4k.h
> index 71e9abced493..9db83b4e017d 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable-4k.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable-4k.h
> @@ -11,7 +11,7 @@ static inline int pmd_huge(pmd_t pmd)
> * leaf pte for huge page
> */
> if (radix_enabled())
> - return !!(pmd_val(pmd) & _PAGE_PTE);
> + return !!(pmd_raw(pmd) & cpu_to_be64(_PAGE_PTE));
> return 0;
> }
>
> @@ -21,7 +21,7 @@ static inline int pud_huge(pud_t pud)
> * leaf pte for huge page
> */
> if (radix_enabled())
> - return !!(pud_val(pud) & _PAGE_PTE);
> + return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE));
> return 0;
> }
>
> @@ -31,7 +31,7 @@ static inline int pgd_huge(pgd_t pgd)
> * leaf pte for huge page
> */
> if (radix_enabled())
> - return !!(pgd_val(pgd) & _PAGE_PTE);
> + return !!(pgd_raw(pgd) & cpu_to_be64(_PAGE_PTE));
pgd_raw() will not do the endian swapping, but cpu_to_be64(_PAGE_PTE) will
now do it instead. So does this really optimize anything? I tend to think it
just moves the endian swapping overhead from one place to another, no?

Is cpu_to_be64(constant) faster than cpu_to_be64(variable)?
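
To make the question concrete, here is a minimal userspace sketch of the two
forms being compared. It is not the kernel code: EXAMPLE_PAGE_PTE, swap64() and
the three helpers are hypothetical names, the flag value is a placeholder, and
__builtin_bswap64() (a GCC/Clang builtin) stands in for what cpu_to_be64() and
be64_to_cpu() would do on a little-endian CPU. The usual expectation is that a
byte swap of a compile-time constant is folded by the compiler, while a byte
swap of a loaded value has to be done at run time.

```c
#include <stdint.h>

/* Hypothetical placeholder for _PAGE_PTE; the real value is not shown in this thread. */
#define EXAMPLE_PAGE_PTE 0x4000000000000000ULL

/* Stand-in for be64_to_cpu()/cpu_to_be64() on a little-endian CPU (assumption). */
static inline uint64_t swap64(uint64_t v)
{
	return __builtin_bswap64(v);
}

/* Old form: byte-swap the loaded entry at run time, then test the flag. */
static inline int huge_via_val(uint64_t raw_entry)
{
	return !!(swap64(raw_entry) & EXAMPLE_PAGE_PTE);
}

/* New form: byte-swap the constant (foldable at compile time), test the raw entry. */
static inline int huge_via_raw(uint64_t raw_entry)
{
	return !!(raw_entry & swap64(EXAMPLE_PAGE_PTE));
}

/* Zero is zero in either byte order, so a "none" check needs no swap at all. */
static inline int none_via_raw(uint64_t raw_entry)
{
	return raw_entry == 0;
}
```

Both huge_* helpers return the same answer for any input, because a byte swap
only permutes bytes and so commutes with a bitwise AND. Comparing the
disassembly of the two (for example with objdump -d) is one way to check
whether the swap really disappears in the constant case.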
RP