[Skiboot] [RFC PATCH] Use two VSX registers to do initial copy of skiboot

Stewart Smith stewart at linux.vnet.ibm.com
Thu May 21 18:40:35 AEST 2015


Note my awful fiddling of MSR so that I can use VSX registers :)
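
For anyone checking the MSR fiddling, here is a small Python sketch of the arithmetic behind the three `oris` instructions in the diff. It is only a model: the helper names (`at_h`, `oris`, `patched_msr`) are made up for illustration, and the bit meanings (FP = 1<<13, VSX = 1<<23, VEC = 1<<25) are assumed to be the usual MSR layout. One thing it appears to show is that `(1<<13)@h` evaluates to zero, so that first `oris` would leave the MSR unchanged; only the VSX and VEC bits end up set.

```python
# Bit values as written in the patch (1 << n, counted from the LSB).
MSR_FP = 1 << 13   # assumed: floating point available
MSR_VSX = 1 << 23  # assumed: VSX available
MSR_VEC = 1 << 25  # assumed: vector (VMX) available


def at_h(value):
    """Model of the assembler's @h operator: bits 16..31 of value."""
    return (value >> 16) & 0xFFFF


def oris(reg, imm16):
    """Model of oris: OR the 16-bit immediate, shifted left by 16, into reg."""
    return reg | ((imm16 & 0xFFFF) << 16)


def patched_msr(msr):
    """Apply the three oris instructions from the patch to an MSR value."""
    for bit in (MSR_FP, MSR_VSX, MSR_VEC):
        msr = oris(msr, at_h(bit))
    return msr


# Bits that the sequence adds to an otherwise empty MSR.
added = patched_msr(0)
```

If the FP bit is wanted as well, it sits in the low halfword, so it would have to be set with `ori` rather than `oris`.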

In the hello_world test running in mambo, we get the following reduction
in instruction/cycle count:
Before:
20284943: ** finished running 20284942 instructions **

Single VSX register:
19687022: ** finished running 19641315 instructions **

Using two VSX registers:
19621488: ** finished running 19575781 instructions **

65534 fewer cycles & instructions than with just one VSX register
709161 fewer instructions than the base implementation

Using three VSX registers:
19883634: ** finished running 19837927 instructions **
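
The differences quoted above can be re-derived from the four mambo lines. This is a minimal sketch that assumes the number before the colon is the cycle count and the number inside the message is the instruction count; the `runs` table and the `saved` helper are names invented here.

```python
# (cycles, instructions) for each run, copied from the mambo output above.
runs = {
    "base": (20284943, 20284942),
    "1 vsx": (19687022, 19641315),
    "2 vsx": (19621488, 19575781),
    "3 vsx": (19883634, 19837927),
}


def saved(slower, faster):
    """Return (cycles, instructions) saved going from one run to another."""
    return tuple(a - b for a, b in zip(runs[slower], runs[faster]))


one_to_two = saved("1 vsx", "2 vsx")
base_to_two = saved("base", "2 vsx")
two_to_three = saved("2 vsx", "3 vsx")  # negative values mean a regression
```

On these numbers the three-register variant comes out slower than both the one- and two-register variants, although still ahead of the base implementation.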

Signed-off-by: Stewart Smith <stewart at linux.vnet.ibm.com>
---
 asm/head.S |   24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/asm/head.S b/asm/head.S
index fd6e3fb..9b3d7bb 100644
--- a/asm/head.S
+++ b/asm/head.S
@@ -314,16 +314,30 @@ boot_entry:
 	cmpd	%r29,%r30
 	beq	2f
 	LOAD_IMM32(%r3, _sbss - __head)
-	srdi	%r3,%r3,3
+	srdi	%r3,%r3,5
 	mtctr	%r3
+	mfmsr	%r24
+	mfmsr	%r4
+	oris	%r4,%r4, (1<<13)@h
+	oris	%r4,%r4, (1<<23)@h
+	oris	%r4,%r4, (1<<25)@h
+	mtmsr	%r4
 	mr	%r4,%r30
 	mr	%r15,%r30
 	mr	%r30,%r29
-1:	ld	%r0,0(%r4)
-	std	%r0,0(%r29)
-	addi	%r29,%r29,8
-	addi	%r4,%r4,8
+	addi	%r5,%r4,16
+	addi	%r6,%r29,16
+	
+1:	lxvd2x	%vs1,%r0,%r4
+	lxvd2x	%vs2,%r0,%r5
+	stxvd2x	%vs1,%r0,%r29
+	stxvd2x	%vs2,%r0,%r6
+	addi	%r29,%r29,16
+	addi	%r4,%r4,16
+	addi	%r5,%r5,16
+	addi	%r6,%r6,16
 	bdnz	1b
+	mtmsr	%r24
 	sync
 	icbi	0,%r29
 	sync
-- 
1.7.10.4
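
As a sanity check on the loop bounds, here is a rough Python model of the old and new copy loops from the hunk above. It is not skiboot code: the function names and the `stride` parameter are hypothetical, the image size is assumed to be a multiple of 32, and each `lxvd2x`/`stxvd2x` pair is modelled as a plain 16-byte move. With the count taken as `size >> 5` and every pointer advanced by 16 per iteration, as posted, the model reaches only the first half of the image plus 16 bytes; advancing by 32 covers all of it.

```python
def make_image(size):
    """Source image with no zero bytes, so copied bytes are easy to spot."""
    return bytes((i % 255) + 1 for i in range(size))


def copy_doublewords(src, size):
    """Model of the old loop: size >> 3 iterations of one 8-byte ld/std."""
    dst = bytearray(size)
    off = 0
    for _ in range(size >> 3):
        dst[off:off + 8] = src[off:off + 8]
        off += 8
    return dst


def copy_vsx_pair(src, size, stride=16):
    """Model of the new loop: size >> 5 iterations, two 16-byte moves each.

    stride is how far all four pointers advance per iteration
    (16 in the posted patch).
    """
    dst = bytearray(size)
    lo, hi = 0, 16  # offsets held in r4/r29 and r5/r6 respectively
    for _ in range(size >> 5):
        v1 = src[lo:lo + 16]    # lxvd2x %vs1
        v2 = src[hi:hi + 16]    # lxvd2x %vs2
        dst[lo:lo + 16] = v1    # stxvd2x %vs1
        dst[hi:hi + 16] = v2    # stxvd2x %vs2
        lo += stride
        hi += stride
    return dst


def bytes_copied(dst):
    """Count destination bytes that were written (source has no zeros)."""
    return sum(1 for b in dst if b != 0)


size = 4096
src = make_image(size)
old_total = bytes_copied(copy_doublewords(src, size))
posted_total = bytes_copied(copy_vsx_pair(src, size, stride=16))
stride32_total = bytes_copied(copy_vsx_pair(src, size, stride=32))
```

If the model is faithful, part of the instruction-count reduction may come from copying less data, so it is worth confirming the pointer increments before comparing the numbers.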


