[RFC PATCH 1/2] powerpc: foundation code to handle CR5 for local_t
Madhavan Srinivasan
maddy at linux.vnet.ibm.com
Thu Dec 4 01:59:37 AEDT 2014
On Tuesday 02 December 2014 03:05 AM, Gabriel Paubert wrote:
> On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
>> This patch creates the infrastructure to handle CR-based
>> local_* atomic operations. Local atomic operations are fast
>> and highly reentrant per-CPU counters, used for percpu
>> variable updates. Local atomic operations only guarantee
>> variable modification atomicity wrt the CPU which owns the
>> data, and they need to be executed in a preemption-safe way.
>>
>> Here is the design of this patch. Since local_* operations
>> only need to be atomic with respect to interrupts (IIUC), the
>> patch uses one of the Condition Register (CR) fields as a flag
>> variable. When entering a local_* operation, a specific bit in
>> the CR5 field is set, and on exit the bit is cleared. The CR
>> bit check is done in the interrupt return path: if the CR5[EQ]
>> bit is set and we are returning to the kernel, we rewind the
>> NIP to the start of the local_* operation.
>>
>> The reason for this approach is that local_* operations are
>> currently implemented with the l[w/d]arx/st[w/d]cx. instruction
>> pair, which is heavy on cycle count and has no local variant.
>> So, to see whether the new implementation helps, I used a
>> modified version of Rusty's benchmark code on local_t:
>>
>> https://lkml.org/lkml/2008/12/16/450
>>
>> Modifications:
>> - increased the working set size from 1MB to 8MB,
>> - removed cpu_local_inc test.
>>
>> Tests ran
>> - on a POWER8 1S Scale-out System at 2.0GHz
>> - on OPAL v3 with a patched v3.18-rc4 kernel as host
>>
>> Here are the values with the patch.
>>
>> Time in ns per iteration
>>
>>               inc   add   read   add_return
>> atomic_long    67    67     18           69
>> irqsave/rest   39    39     23           39
>> trivalue       39    39     29           49
>> local_t        26    26     24           26
>>
>> Since CR5 is used as a flag, I have added CFLAGS to keep the
>> compiler from using CR5 during kernel compilation, and CR5 is
>> zeroed at kernel entry.
>>
>> Tested the patch on:
>> - a pSeries LPAR,
>> - a host with patched/unmodified guest kernels
>>
>> To check whether userspace sees any CR5 corruption, I ran a
>> simple test which does:
>> - set a CR5 field bit,
>> - while(1)
>> - sleep or gettimeofday
>> - check that the bit is still set
>>
>> Signed-off-by: Madhavan Srinivasan <maddy at linux.vnet.ibm.com>
>> ---
>> - I really appreciate feedback on the patchset.
>> - Kindly comment if I should try any other benchmark or
>> workload to check the numbers.
>> - Also, kindly recommend any known stress test for the CR
>>
>> Makefile | 6 ++
>> arch/powerpc/include/asm/exception-64s.h | 21 +++++-
>> arch/powerpc/kernel/entry_64.S | 106 ++++++++++++++++++++++++++++++-
>> arch/powerpc/kernel/exceptions-64s.S | 2 +-
>> arch/powerpc/kernel/head_64.S | 8 +++
>> 5 files changed, 138 insertions(+), 5 deletions(-)
>>
>> diff --git a/Makefile b/Makefile
>> index 00d618b..2e271ad 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -706,6 +706,12 @@ endif
>>
>> KBUILD_CFLAGS += $(call cc-option, -fno-var-tracking-assignments)
>>
>> +ifdef CONFIG_PPC64
>> +# We need this flag to force compiler not to use CR5, since
>> +# local_t type code is based on this.
>> +KBUILD_CFLAGS += -ffixed-cr5
>> +endif
>> +
>> ifdef CONFIG_DEBUG_INFO
>> ifdef CONFIG_DEBUG_INFO_SPLIT
>> KBUILD_CFLAGS += $(call cc-option, -gsplit-dwarf, -g)
>> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
>> index 77f52b2..c42919a 100644
>> --- a/arch/powerpc/include/asm/exception-64s.h
>> +++ b/arch/powerpc/include/asm/exception-64s.h
>> @@ -306,7 +306,26 @@ do_kvm_##n: \
>> std r10,0(r1); /* make stack chain pointer */ \
>> std r0,GPR0(r1); /* save r0 in stackframe */ \
>> std r10,GPR1(r1); /* save r1 in stackframe */ \
>> - beq 4f; /* if from kernel mode */ \
>> +BEGIN_FTR_SECTION; \
>> + lis r9,4096; /* Create a mask with HV and PR */ \
>> + rldicr r9,r9,32,31; /* bits, AND with the MSR */ \
>> + mr r10,r9; /* to check for Hyp state */ \
>> + ori r9,r9,16384; \
>> + and r9,r12,r9; \
>> + cmpd cr3,r10,r9; \
>> + beq cr3,66f; /* Jump if we come from Hyp mode*/ \
>> + mtcrf 0x04,r10; /* Clear CR5 if coming from usr */ \
>
> I think you can do better than this, powerpc has a fantastic set
> of rotate and mask instructions. If I understand correctly your
> code you can replace it with the following:
>
> rldicl r10,r12,4,63 /* Extract HV bit to LSB of r10*/
> rlwinm r9,r12,19,0x02 /* Extract PR bit to 2nd to last bit of r9 */
> or r9,r9,r10
> cmplwi cr3,r9,1 /* Check for HV=1 and PR=0 */
> beq cr3,66f
> mtcrf 0x04,r10 /* Bits going to cr5 bits are 0 in r10 */
>
From the previous comment, at this point CR0 already has MSR[PR],
and only HV needs to be checked, so I guess there is still room for
optimization. But good to learn this sequence.
> and obviously similarly for the other instances of this code sequence.
>
> This said, mtcrf is not necessarily the fastest way of setting cr5 if
> you only want to clear the EQ bit. For example comparing the stack pointer
> with 0 should have the same effect.
>
Yes. Will try this out.
>> +FTR_SECTION_ELSE; \
>> + beq 4f; /* if kernel mode branch */ \
>> + li r10,0; /* Clear CR5 incase of coming */ \
>> + mtcrf 0x04,r10; /* from user. */ \
>> + nop; /* This part of code is for */ \
>> + nop; /* kernel with MSR[HV]=0, */ \
>> + nop; /* MSR[PR]=0, so just chk for */ \
>> + nop; /* MSR[PR] */ \
>> + nop; \
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE); \
>> +66: beq 4f; /* if from kernel mode */ \
>> ACCOUNT_CPU_USER_ENTRY(r9, r10); \
>> SAVE_PPR(area, r9, r10); \
>> 4: EXCEPTION_PROLOG_COMMON_2(area) \
>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>> index 0905c8d..e42bb99 100644
>> --- a/arch/powerpc/kernel/entry_64.S
>> +++ b/arch/powerpc/kernel/entry_64.S
>> @@ -68,7 +68,26 @@ system_call_common:
>> 2: std r2,GPR2(r1)
>> std r3,GPR3(r1)
>> mfcr r2
>> - std r4,GPR4(r1)
>> +BEGIN_FTR_SECTION
>> + lis r10,4096
>> + rldicr r10,r10,32,31
>> + mr r11,r10
>> + ori r10,r10,16384
>> + and r10,r12,r10
>> + cmpd r11,r10
>> + beq 67f
>> + mtcrf 0x04,r11
>> +FTR_SECTION_ELSE
>> + beq 67f
>> + li r11,0
>> + mtcrf 0x04,r11
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +67: std r4,GPR4(r1)
>> std r5,GPR5(r1)
>> std r6,GPR6(r1)
>> std r7,GPR7(r1)
>> @@ -224,8 +243,26 @@ syscall_exit:
>> BEGIN_FTR_SECTION
>> stdcx. r0,0,r1 /* to clear the reservation */
>> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> +BEGIN_FTR_SECTION
>> + lis r4,4096
>> + rldicr r4,r4,32,31
>> + mr r6,r4
>> + ori r4,r4,16384
>> + and r4,r8,r4
>> + cmpd cr3,r6,r4
>> + beq cr3,65f
>> + mtcr r5
>> +FTR_SECTION_ELSE
>> andi. r6,r8,MSR_PR
>> - ld r4,_LINK(r1)
>> + beq 65f
>> + mtcr r5
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +65: ld r4,_LINK(r1)
>>
>> beq- 1f
>> ACCOUNT_CPU_USER_EXIT(r11, r12)
>> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> 1: ld r2,GPR2(r1)
>> ld r1,GPR1(r1)
>> mtlr r4
>> +#ifdef CONFIG_PPC64
>> + mtcrf 0xFB,r5
>> +#else
>> mtcr r5
>> +#endif
>> mtspr SPRN_SRR0,r7
>> mtspr SPRN_SRR1,r8
>> RFI
>> @@ -804,7 +845,66 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>> */
>> .globl fast_exception_return
>> fast_exception_return:
>> - ld r3,_MSR(r1)
>> +
>> + /*
>> + * Now that we are about to exit from interrupt, lets check for
>> + * cr5 eq bit. If it is set, then we may be in the middle of
>> + * local_t update. In this case, we should rewind the NIP
>> + * accordingly.
>> + */
>> + mfcr r3
>> + andi. r4,r3,0x200
>> + beq 63f
>
> I believe that someone has already mentioned that this is a very
> convoluted way of testing a cr bit. This can be optimized to bne cr5,63f
>
This. Will change it.
Regards
Maddy
>> +
>> + /*
>> + * Now that the bit is set, lets check for return to User
>> + */
>> + ld r4,_MSR(r1)
>> +BEGIN_FTR_SECTION
>> + li r3,4096
>> + rldicr r3,r3,32,31
>> + mr r5,r3
>> + ori r3,r3,16384
>> + and r3,r4,r3
>> + cmpd r5,r3
>> + bne 63f
>> +FTR_SECTION_ELSE
>> + andi. r3,r4,MSR_PR
>> + bne 63f
>> + nop
>> + nop
>> + nop
>> + nop
>> + nop
>> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
>> +
>> + /*
>> + * Looks like we are returning to Kernel, so
>> + * lets get the NIP and search the ex_table.
>> + * Change the NIP based on the return value
>> + */
>> +lookup_ex_table:
>> + ld r3,_NIP(r1)
>> + bl search_exception_tables
>> + cmpli 0,1,r3,0
>> + bne 62f
>> +
>> + /*
>> + * This is a panic case. Reason is that, we
>> + * have the CR5 bit set, but we are not in
>> + * local_* code and we are returning to Kernel.
>> + */
>> + ld r3,_NIP(r1)
>> + mfcr r4
>> + EMIT_BUG_ENTRY lookup_ex_table, __FILE__,__LINE__,BUGFLAG_WARNING
>> +
>> + /*
>> + * Now save the return fixup address as NIP
>> + */
>> +62: ld r4,8(r3)
>> + std r4,_NIP(r1)
>> + crclr 22
>> +63: ld r3,_MSR(r1)
>> ld r4,_CTR(r1)
>> ld r0,_LINK(r1)
>> mtctr r4
>> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
>> index 72e783e..edb75a9 100644
>> --- a/arch/powerpc/kernel/exceptions-64s.S
>> +++ b/arch/powerpc/kernel/exceptions-64s.S
>> @@ -637,7 +637,7 @@ masked_##_H##interrupt: \
>> rldicl r10,r10,48,1; /* clear MSR_EE */ \
>> rotldi r10,r10,16; \
>> mtspr SPRN_##_H##SRR1,r10; \
>> -2: mtcrf 0x80,r9; \
>> +2: mtcrf 0x90,r9; \
>> ld r9,PACA_EXGEN+EX_R9(r13); \
>> ld r10,PACA_EXGEN+EX_R10(r13); \
>> ld r11,PACA_EXGEN+EX_R11(r13); \
>> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
>> index d48125d..02e49b3 100644
>> --- a/arch/powerpc/kernel/head_64.S
>> +++ b/arch/powerpc/kernel/head_64.S
>> @@ -347,6 +347,14 @@ __mmu_off:
>> *
>> */
>> __start_initialization_multiplatform:
>> +
>> + /*
>> + * Before we do anything, lets clear CR5 field,
>> + * so that we will have a clean start at entry
>> + */
>> + li r11,0
>> + mtcrf 0x04,r11
>> +
>> /* Make sure we are running in 64 bits mode */
>> bl enable_64b_mode
>>
>
>
> Regards,
> Gabriel
>