[RFC][PATCH] powerpc/64s: rewriting interrupt entry code
Nicholas Piggin
npiggin@gmail.com
Fri Mar 23 00:05:49 AEDT 2018
Long long post ahead...
I've been playing with rewriting the interrupt entry code. This is a
really rough patch so far, but it boots in mambo. I'll just post it now
to get opinions on the approach.
This implements a new set of exception macros and converts the
decrementer to use them (it's maskable, so it covers more cases).
There are two main points to this work. The first is to make the code
easier to understand and hack on; the second is to improve performance
of the end result.
For the former case, gas macros are used rather than cpp macros as the
main building block. IMO this really turns out a lot nicer for a few
reasons -- we can conditionally include code by testing args rather
than passing in other macros that define our conditional bits, and we
can use cpp conditional compilation easily inside the gas macros. These
two properties mean we don't have bits of asm code scattered through
various macros which call each other and are passed into other macros
etc. Everything is pretty linear and flat. Not having to use big
backslash-split lines makes things nicer to rejig too.
I tried to make the syntax for conditional asm a bit nicer, but
couldn't find a great way. It's not *horrible*:
#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
.ifgt \kvm
lbz r25,HSTATE_IN_GUEST(r13)
cmpwi r25,0
bne 1f
.endif
#endif
We could improve it a bit maybe. You could put a cpp wrapper over it:
#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
IF(kvm)
lbz r25,HSTATE_IN_GUEST(r13)
cmpwi r25,0
bne 1f
ENDIF
#endif
Also, if anyone actually reads the code, the macro invocations are
bare:
INT_ENTRY decrementer,0x80,0,1,PACA_EXGEN,1,0,1,1,1
Again this could be wrapped:
INT_ENTRY(decrementer, 0x80, INT_SRR, INT_REAL, INT_KVMTEST,
INT_NO_CFAR, INT_PPR, INT_TB)
I think this approach will allow the number of open-coded and randomly
used macros to be reduced too. I'd like to really standardize entry a
lot, even if it means e.g. some less performance-critical interrupts
like MCE and HMI end up saving slightly more regs than they need to.
Second thing is performance. The biggest concern in entry code is SPR
accesses, then probably loads and stores (assuming we've minimised
branches already). SPR reads should all be done first before any SPR
writes, to avoid scoreboard stalls. SPR writes should be minimised and
so should serialising reads (CFAR, PPR, TB).
So my thinking is:
- Avoid some of these SPR reads if possible. We can avoid saving and
setting PPR if we don't go to general C code (e.g. SLB miss). We
could avoid CFAR for some async interrupts: if we could rely on 0x100
for debug IPIs, then the important external and doorbell interrupts
could avoid CFAR.
- Start with a bunch of stores to free up GPRs, then do the serializing
SPR reads as soon as possible before the pipeline fills (these reads
have to wait for all previous OPs to complete before they can begin).
- Don't store these SPR reads immediately into the PACA, but keep them
in the GPRs we've just freed. This should make it simpler to keep all
stores close in cache, and importantly it avoids involving the LSU in
this dependency. Stores interact with barriers, and store queue
resources remain allocated while the store waits on this dependency.
- In some cases (e.g., SLB miss) the CFAR may never be used. If we avoid
storing the value anywhere, the data doesn't end up in a critical
execution path (though it still pushes completion out).
- SPRs can be passed via GPRs through to C interrupt handlers. In this
case we read TB right up front and pass it into timer_interrupt to
avoid a mftb there.
- A number of HSRR interrupts do not clear MSR[RI], so setting it
should be avoided for those. But might as well go one further and
avoid setting MSR[RI]=1 until we're ready to set MSR[EE]=1 so they
can be done at once. It does increase the RI=0 window a bit, but we
don't take SLB misses on the kernel stack, and we already deal with
IR=DR=1 && RI=0 case for virt interrupts so we're already exposed
to machine check in translation there.
- Use non-volatile GPRs for scratch registers. This means we can save
non-volatiles before calling a C function just by storing them
immediately to the stack (rather than loading from paca first). It
allows us to call C functions without blowing our scratch registers.
- Load the stack early from the paca so register saving stores to stack
get their dependency as soon as possible.
- Not in this patch and not entirely depending on it, but I would like
to convert kvm interrupt entry over to using this same convention of
PACA_EX save areas and register layout. Existing KVM calls are slower
than they could be because they switch to using HSTATE_SCRATCH etc
and this gets even worse now with more registers saved before the
KVM test. The other benefit is that KVM entry at the moment is not
reentrant-safe (e.g., machine check interrupting a hypervisor doorbell
while KVM is in guest will corrupt scratch space despite MSR[RI]=1).
Using the different paca save areas would solve that.
That's about all I can think of at the moment.
Thanks,
Nick
diff --git a/arch/powerpc/include/asm/exception-64s-new.h b/arch/powerpc/include/asm/exception-64s-new.h
new file mode 100644
index 000000000000..f5fdc49d14c5
--- /dev/null
+++ b/arch/powerpc/include/asm/exception-64s-new.h
@@ -0,0 +1,291 @@
+#ifndef _ASM_POWERPC_EXCEPTION_NEW_H
+#define _ASM_POWERPC_EXCEPTION_NEW_H
+/*
+ * The following macros define the code that appears as
+ * the prologue to each of the exception handlers. They
+ * are split into two parts to allow a single kernel binary
+ * to be used for pSeries and iSeries.
+ *
+ * We make as much of the exception code common between native
+ * exception handlers (including pSeries LPAR) and iSeries LPAR
+ * implementations as possible.
+ */
+#include <asm/head-64.h>
+#include <asm/exception-64s.h>
+
+#define EX_R16 0x00
+#define EX_R17 0x08
+#define EX_R18 0x10
+#define EX_R19 0x18
+#define EX_R20 0x20
+#define EX_R21 0x28
+#define EX_R22 0x30
+#define EX_R23 0x38
+#define EX_R24 0x40
+#define EX_R25 0x48
+#define EX_R26 0x50
+#define EX_R1 0x58
+
+.macro INT_ENTRY name size hsrr virt area kvm cfar ppr tb stack
+ SET_SCRATCH0(r13) /* save r13 */
+ GET_PACA(r13)
+ .ifgt \cfar
+ std r16,\area+EX_R16(r13)
+ .endif
+ .ifgt \ppr
+ std r17,\area+EX_R17(r13)
+ .endif
+ .ifgt \tb
+ std r18,\area+EX_R18(r13)
+ .endif
+ .ifgt \stack
+ std r19,\area+EX_R19(r13)
+ .endif
+ .ifgt \cfar
+ OPT_GET_SPR(r16, SPRN_CFAR, CPU_FTR_CFAR)
+ .endif
+ .if (\size == 0x20)
+ b \name\()_tramp
+ .ifgt \virt
+ .pushsection "virt_trampolines"
+ .else
+ .pushsection "real_trampolines"
+ .endif
+\name\()_tramp:
+ .endif
+
+ .ifgt \ppr
+ OPT_GET_SPR(r17, SPRN_PPR, CPU_FTR_HAS_PPR)
+ .endif
+ .ifgt \tb
+ mftb r18
+ .endif
+ .ifgt \stack
+ ld r19,PACAKSAVE(r13) /* kernel stack to use */
+ .endif
+ std r20,\area+EX_R20(r13)
+ std r21,\area+EX_R21(r13)
+ std r22,\area+EX_R22(r13)
+ std r23,\area+EX_R23(r13)
+ .ifgt \hsrr
+ mfspr r20,SPRN_HSRR0
+ mfspr r21,SPRN_HSRR1
+ .else
+ mfspr r20,SPRN_SRR0
+ mfspr r21,SPRN_SRR1
+ .endif
+ mfcr r22
+ mfctr r23
+ std r24,\area+EX_R24(r13)
+ std r25,\area+EX_R25(r13)
+ .ifgt \stack
+ mr r24,r1
+ .endif
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+ .ifgt \kvm
+ lbz r25,HSTATE_IN_GUEST(r13)
+ cmpwi r25,0
+ bne 1f
+ .endif
+#endif
+#ifdef CONFIG_RELOCATABLE
+ .ifgt \virt
+ LOAD_HANDLER(r25,\name\()_virt)
+ .else
+ LOAD_HANDLER(r25,\name\()_real)
+ .endif
+ mtctr r25
+ bctr
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+ .ifgt \kvm
+1: LOAD_HANDLER(r25,\name\()_kvm)
+ mtctr r25
+ bctr
+ .endif
+#endif
+#else /* CONFIG_RELOCATABLE */
+ .ifgt \virt
+ b \name\()_virt
+ .else
+ b \name\()_real
+ .endif
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+ .ifgt \kvm
+1: b \name\()_kvm
+ .endif
+#endif
+#endif /* CONFIG_RELOCATABLE */
+ .if (\size == 0x20)
+ .popsection
+ .endif
+.endm
+
+.macro INT_ENTRY_RESTORE area cfar ppr tb
+ mtcr r22
+ mtctr r23
+ mr r1,r24
+ .ifgt \cfar
+ ld r16,\area+EX_R16(r13)
+ .endif
+ .ifgt \ppr
+ ld r17,\area+EX_R17(r13)
+ .endif
+ .ifgt \tb
+ ld r18,\area+EX_R18(r13)
+ .endif
+ ld r19,\area+EX_R19(r13)
+ ld r20,\area+EX_R20(r13)
+ ld r21,\area+EX_R21(r13)
+ ld r22,\area+EX_R22(r13)
+ ld r23,\area+EX_R23(r13)
+ ld r24,\area+EX_R24(r13)
+ ld r25,\area+EX_R25(r13)
+.endm
+
+/*
+ * After INT_ENTRY, with r1 set to a valid stack pointer, this macro sets up
+ * the stack frame, saves state into it, restores the NVGPR registers, and
+ * loads the TOC into r2.
+ */
+.macro INT_SETUP_C_CALL area cfar ppr tb
+ std r24,0(r1) /* make stack chain pointer */
+ std r0,GPR0(r1) /* save r0 in stackframe */
+ std r24,GPR1(r1) /* save r1 in stackframe */
+ std r2,GPR2(r1) /* save r2 in stackframe */
+ ld r2,PACATOC(r13) /* get kernel TOC into r2 */
+ GET_SCRATCH0(r0)
+ SAVE_4GPRS(3, r1) /* save r3 - r6 in stackframe */
+ mflr r3
+ mfspr r4,SPRN_XER
+ ld r5,PACACURRENT(r13)
+ ld r6,exception_marker@toc(r2)
+ SAVE_4GPRS(7, r1) /* save r7 - r10 in stackframe */
+ SAVE_2GPRS(11, r1) /* save r11 - r12 in stackframe */
+ std r0,GPR13(r1)
+ std r20,_NIP(r1) /* save SRR0 in stackframe */
+ std r21,_MSR(r1) /* save SRR1 in stackframe */
+ std r22,_CCR(r1) /* save CR in stackframe */
+ std r23,_CTR(r1) /* save CTR in stackframe */
+ std r3,_LINK(r1)
+ std r4,_XER(r1)
+ std r25,_TRAP(r1) /* set trap number */
+ li r3,0
+ std r3,RESULT(r1) /* clear regs->result */
+ std r19,SOFTE(r1)
+ std r6,STACK_FRAME_OVERHEAD-16(r1) /* mark the frame */
+
+ HMT_MEDIUM /* XXX: where to put this? It is NTC SPR write, should go after all SPR reads, late but before NTC SPR read stores?? (cfar, tb, ppr) */
+
+#ifdef CONFIG_TRACE_IRQFLAGS
+ andi. r0,r19,IRQS_DISABLED
+ bne 1f
+ TRACE_DISABLE_INTS /* clobbers volatile registers */
+1:
+#endif
+
+ /* XXX: async calls */
+ FINISH_NAP
+ RUNLATCH_ON
+
+ addi r3,r1,STACK_FRAME_OVERHEAD
+ .ifgt \cfar
+ std r16,ORIG_GPR3(r1)
+ ld r16,\area+EX_R16(r13)
+ .endif
+ .ifgt \ppr
+ std r17,TASKTHREADPPR(r5)
+ ld r17,\area+EX_R17(r13)
+ .endif
+ .ifgt \tb
+ mr r4,r18
+ ld r18,\area+EX_R18(r13)
+ .endif
+ ld r19,\area+EX_R19(r13)
+ ld r20,\area+EX_R20(r13)
+ ld r21,\area+EX_R21(r13)
+ ld r22,\area+EX_R22(r13)
+ ld r23,\area+EX_R23(r13)
+ ld r24,\area+EX_R24(r13)
+ ld r25,\area+EX_R25(r13)
+.endm
+
+.macro INT_COMMON name vec area mask cfar ppr tb
+\name\()_real:
+ ld r25,PACAKMSR(r13) /* MSR value for kernel */
+ xori r25,r25,MSR_RI /* clear MSR_RI */
+ mtmsrd r25,0
+ nop /* Quadword align the virt entry */
+\name\()_virt:
+ andi. r25,r21,MSR_PR
+ mr r1,r19
+ li r19,IRQS_ENABLED
+ li r25,PACA_IRQ_HARD_DIS
+ bne 1f
+ subi r1,r24,INT_FRAME_SIZE
+ .ifgt \mask
+ lbz r19,PACAIRQSOFTMASK(r13)
+ andi. r25,r19,\mask
+ lbz r25,PACAIRQHAPPENED(r13)
+ bne- \name\()_masked_interrupt
+ .else
+ lbz r25,PACAIRQHAPPENED(r13)
+ .endif
+ ori r25,r25,PACA_IRQ_HARD_DIS
+1:
+ stb r25,PACAIRQHAPPENED(r13)
+ li r25,IRQS_ALL_DISABLED
+ stb r25,PACAIRQSOFTMASK(r13)
+ li r25,\vec + 1
+ cmpdi r1,-INT_FRAME_SIZE /* check if r1 is in userspace */
+ bge- bad_stack_common /* abort if it is */
+ INT_SETUP_C_CALL \area \cfar \ppr \tb
+.endm
+
+.macro INT_KVM name hsrr vec area skip cfar ppr tb
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+ .ifgt \skip
+ cmpwi r25,KVM_GUEST_MODE_SKIP
+ beq 1f
+ HMT_MEDIUM /* XXX: where to put this? (see above) */
+ .endif
+ .ifgt \cfar
+ mr r25,r16
+ .else
+ li r25,0 /* No CFAR, set it to 0 */
+ .endif
+ std r25,HSTATE_CFAR(r13)
+ .ifgt \ppr
+ mr r25,r17
+ .else
+ li r25,0 /* No PPR, set it to 0 */
+ .endif
+ std r25,HSTATE_PPR(r13)
+ INT_ENTRY_RESTORE \area \cfar \ppr \tb
+ std r12,HSTATE_SCRATCH0(r13)
+ mfcr r12
+ sldi r12,r12,32
+ .ifgt \hsrr
+ ori r12,r12,\vec + 0x2
+ .else
+ ori r12,r12,\vec
+ .endif
+ b kvmppc_interrupt
+
+ .ifgt \skip
+1: addi r20,r20,4
+ .ifgt \hsrr
+ mtspr SPRN_HSRR0,r20
+ INT_ENTRY_RESTORE \area \cfar \ppr \tb
+ GET_SCRATCH0(r13)
+ HRFI_TO_KERNEL
+ .else
+ mtspr SPRN_SRR0,r20
+ INT_ENTRY_RESTORE \area \cfar \ppr \tb
+ GET_SCRATCH0(r13)
+ RFI_TO_KERNEL
+ .endif
+ .endif
+#endif
+.endm
+
+#endif /* _ASM_POWERPC_EXCEPTION_NEW_H */
diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index 471b2274fbeb..a4d501947097 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -49,11 +49,12 @@
#define EX_PPR 64
#if defined(CONFIG_RELOCATABLE)
#define EX_CTR 72
-#define EX_SIZE 10 /* size in u64 units */
#else
-#define EX_SIZE 9 /* size in u64 units */
#endif
+/* exception-64s-new.h uses 10 */
+#define EX_SIZE 10 /* size in u64 units */
+
/*
* maximum recursive depth of MCE exceptions
*/
diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
index 855e17d158b1..49fb156aa93a 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -54,7 +54,8 @@
extern void replay_system_reset(void);
extern void __replay_interrupt(unsigned int vector);
-extern void timer_interrupt(struct pt_regs *);
+extern void timer_interrupt(struct pt_regs *regs);
+extern void timer_interrupt_new(struct pt_regs *regs, u64 tb);
extern void performance_monitor_exception(struct pt_regs *regs);
extern void WatchdogException(struct pt_regs *regs);
extern void unknown_exception(struct pt_regs *regs);
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 2cb5109a7ea3..db934d29069c 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -995,7 +995,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
1: cmpwi cr0,r3,0x900
bne 1f
addi r3,r1,STACK_FRAME_OVERHEAD;
- bl timer_interrupt
+ mftb r4
+ bl timer_interrupt_new
b ret_from_except
#ifdef CONFIG_PPC_DOORBELL
1:
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index b6d1baecfbff..c700a9d7e17a 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -820,11 +820,42 @@ END_FTR_SECTION_IFSET(CPU_FTR_TM)
#endif
-EXC_REAL_MASKABLE(decrementer, 0x900, 0x80, IRQS_DISABLED)
-EXC_VIRT_MASKABLE(decrementer, 0x4900, 0x80, 0x900, IRQS_DISABLED)
-TRAMP_KVM(PACA_EXGEN, 0x900)
-EXC_COMMON_ASYNC(decrementer_common, 0x900, timer_interrupt)
+#include <asm/exception-64s-new.h>
+
+EXC_REAL_BEGIN(decrementer, 0x900, 0x80)
+ /*
+ * decrementer handler:
+ * SRR[01], real, exgen, kvm, !cfar, ppr, tb, stack
+ */
+ INT_ENTRY decrementer,0x80,0,1,PACA_EXGEN,1,0,1,1,1
+EXC_REAL_END(decrementer, 0x900, 0x80)
+
+EXC_VIRT_BEGIN(decrementer, 0x4900, 0x80)
+ /*
+ * decrementer handler:
+ * SRR[01], virt, exgen, kvm, !cfar, ppr, tb, stack
+ */
+ INT_ENTRY decrementer,0x80,0,1,PACA_EXGEN,1,0,1,1,1
+EXC_VIRT_END(decrementer, 0x4900, 0x80)
+
+EXC_COMMON_BEGIN(decrementer_kvm)
+ INT_KVM decrementer,0,0x900,PACA_EXGEN,0,0,1,1
+
+EXC_COMMON_BEGIN(decrementer)
+ INT_COMMON decrementer,0x900,PACA_EXGEN,IRQS_DISABLED,0,1,1
+ bl timer_interrupt_new
+ b ret_from_except_lite
+
+decrementer_masked_interrupt:
+ ori r25,r25,SOFTEN_VALUE_0x900
+ stb r25,PACAIRQHAPPENED(r13)
+ lis r25,0x7fff
+ ori r25,r25,0xffff
+ mtspr SPRN_DEC,r25
+ INT_ENTRY_RESTORE PACA_EXGEN,0,1,1
+ RFI_TO_KERNEL
+EXC_COMMON_ASYNC(decrementer_common, 0x900, timer_interrupt)
EXC_REAL_HV(hdecrementer, 0x980, 0x80)
EXC_VIRT_HV(hdecrementer, 0x4980, 0x80, 0x980)
@@ -842,6 +873,7 @@ EXC_COMMON_ASYNC(doorbell_super_common, 0xa00, unknown_exception)
#endif
+
EXC_REAL(trap_0b, 0xb00, 0x100)
EXC_VIRT(trap_0b, 0x4b00, 0x100, 0xb00)
TRAMP_KVM(PACA_EXGEN, 0xb00)
@@ -1767,6 +1799,26 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
b 1b
_ASM_NOKPROBE_SYMBOL(bad_stack);
+/*
+ * Here we have detected that the kernel stack pointer is bad.
+ * R9 contains the saved CR, r13 points to the paca,
+ * r10 contains the (bad) kernel stack pointer,
+ * r11 and r12 contain the saved SRR0 and SRR1.
+ * We switch to using an emergency stack, save the registers there,
+ * and call kernel_bad_stack(), which panics.
+ */
+bad_stack_common:
+ ld r1,PACAEMERGSP(r13)
+ subi r1,r1,64+INT_FRAME_SIZE
+ /*
+ * This clobbers r16-r18 for interrupts that use them, but we
+ * never return to userspace.
+ */
+ INT_SETUP_C_CALL PACA_EXGEN,0,0,0
+ bl kernel_bad_stack
+ b .
+_ASM_NOKPROBE_SYMBOL(bad_stack_common);
+
/*
* When doorbell is triggered from system reset wakeup, the message is
* not cleared, so it would fire again when EE is enabled.
@@ -1786,6 +1838,29 @@ doorbell_super_common_msgclr:
PPC_MSGCLRP(3)
b doorbell_super_common
+replay_decrementer:
+ /* XXX: crashes */
+ subi r1,r1,INT_FRAME_SIZE
+ std r1,INT_FRAME_SIZE(r1)
+ std r1,GPR1(r1)
+ std r2,GPR2(r1)
+ ld r5,PACACURRENT(r13)
+ ld r6,exception_marker@toc(r2)
+ std r11,_NIP(r1)
+ std r12,_MSR(r1)
+ std r9,_CCR(r1)
+ std r3,_TRAP(r1)
+ li r3,0
+ std r3,RESULT(r1)
+ lbz r3,PACAIRQSOFTMASK(r13)
+ std r3,SOFTE(r1)
+ std r6,STACK_FRAME_OVERHEAD-16(r1)
+ /* XXX: ppr? */
+ addi r3,r1,STACK_FRAME_OVERHEAD
+ mftb r4
+ bl timer_interrupt_new
+ b ret_from_except_lite
+
/*
* Called from arch_local_irq_enable when an interrupt needs
* to be resent. r3 contains 0x500, 0x900, 0xa00 or 0xe80 to indicate
@@ -1811,6 +1886,7 @@ _GLOBAL(__replay_interrupt)
ori r12,r12,MSR_EE
cmpwi r3,0x900
beq decrementer_common
+// beq replay_decrementer
cmpwi r3,0x500
BEGIN_FTR_SECTION
beq h_virt_irq_common
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index a32823dcd9a4..72b38917fd77 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -100,7 +100,7 @@ static struct clocksource clocksource_timebase = {
};
#define DECREMENTER_DEFAULT_MAX 0x7FFFFFFF
-u64 decrementer_max = DECREMENTER_DEFAULT_MAX;
+u64 decrementer_max __read_mostly = DECREMENTER_DEFAULT_MAX;
static int decrementer_set_next_event(unsigned long evt,
struct clock_event_device *dev);
@@ -535,12 +535,11 @@ void arch_irq_work_raise(void)
#endif /* CONFIG_IRQ_WORK */
-static void __timer_interrupt(void)
+static void __timer_interrupt(u64 now)
{
struct pt_regs *regs = get_irq_regs();
u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
struct clock_event_device *evt = this_cpu_ptr(&decrementers);
- u64 now;
trace_timer_interrupt_entry(regs);
@@ -549,7 +548,10 @@ static void __timer_interrupt(void)
irq_work_run();
}
+#ifndef CONFIG_PPC_BOOK3S_64
now = get_tb_or_rtc();
+#endif
+
if (now >= *next_tb) {
*next_tb = ~(u64)0;
if (evt->event_handler)
@@ -557,8 +559,9 @@ static void __timer_interrupt(void)
__this_cpu_inc(irq_stat.timer_irqs_event);
} else {
now = *next_tb - now;
- if (now <= decrementer_max)
- set_dec(now);
+ if (now > decrementer_max)
+ now = decrementer_max;
+ set_dec(now);
/* We may have raced with new irq work */
if (test_irq_work_pending())
set_dec(1);
@@ -576,19 +579,18 @@ static void __timer_interrupt(void)
trace_timer_interrupt_exit(regs);
}
+void timer_interrupt(struct pt_regs * regs)
+{
+ timer_interrupt_new(regs, get_tb_or_rtc());
+}
+
/*
* timer_interrupt - gets called when the decrementer overflows,
* with interrupts disabled.
*/
-void timer_interrupt(struct pt_regs * regs)
+void timer_interrupt_new(struct pt_regs * regs, u64 tb)
{
struct pt_regs *old_regs;
- u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
-
- /* Ensure a positive value is written to the decrementer, or else
- * some CPUs will continue to take decrementer exceptions.
- */
- set_dec(decrementer_max);
/* Some implementations of hotplug will get timer interrupts while
* offline, just ignore these and we also need to set
@@ -596,15 +598,21 @@ void timer_interrupt(struct pt_regs * regs)
* don't replay timer interrupt when return, otherwise we'll trap
* here infinitely :(
*/
- if (!cpu_online(smp_processor_id())) {
+ if (unlikely(!cpu_online(smp_processor_id()))) {
+ u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
*next_tb = ~(u64)0;
+ set_dec(decrementer_max);
return;
}
/* Conditionally hard-enable interrupts now that the DEC has been
* bumped to its maximum value
*/
- may_hard_irq_enable();
+ if (may_hard_irq_enable()) {
+ set_dec(decrementer_max);
+ get_paca()->irq_happened &= ~PACA_IRQ_HARD_DIS;
+ __hard_irq_enable();
+ }
#if defined(CONFIG_PPC32) && defined(CONFIG_PPC_PMAC)
@@ -615,7 +623,7 @@ void timer_interrupt(struct pt_regs * regs)
old_regs = set_irq_regs(regs);
irq_enter();
- __timer_interrupt();
+ __timer_interrupt(tb);
irq_exit();
set_irq_regs(old_regs);
}
@@ -971,10 +979,11 @@ static int decrementer_shutdown(struct clock_event_device *dev)
/* Interrupt handler for the timer broadcast IPI */
void tick_broadcast_ipi_handler(void)
{
+ u64 now = get_tb_or_rtc();
u64 *next_tb = this_cpu_ptr(&decrementers_next_tb);
- *next_tb = get_tb_or_rtc();
- __timer_interrupt();
+ *next_tb = now;
+ __timer_interrupt(now);
}
static void register_decrementer_clockevent(int cpu)