Dave Engebretsen engebret at vnet.ibm.com
Thu Jan 8 02:39:07 EST 2004

Anton Blanchard wrote:

>> BenH:
>> I tend to think that our spinlocks are so big nowadays that it would
>> probably be worth un-inlining them....

If we uninline them, the advantage of leaf-function optimizations is
lost -- that seems like it would be a pretty big hit, right?  We don't
have any good data, but it may well be about a wash vs. the 1/2 cache
line of extra instructions introduced for shared processors.

> I prefer the out-of-line slowpath directly below the function rather than
> one single out-of-line spinlock. It makes profiling much easier: while we
> can backtrace out of the spinlock when doing readprofile profiling, for
> hardware performance monitor profiling we get an address that happened
> somewhere in time and can't do a backtrace.

Isn't this going to result in shared processor locks always stacking the
"mini-frame"?  That is a pretty big hit for what is likely to be a very
common customer configuration.

> static inline void _raw_spin_lock(spinlock_t *lock)
> {


> SPLPAR_spinlock_r##REG :\
> 	stdu	r1,-STACKFRAMESIZE(r1); \
> 	std	r4,SAVE_R4(r1); \
> 	std	r5,SAVE_R5(r1); \
> 	lwz	r5,0x280(REG);	/* load dispatch counter */ \
> 	andi.	r4,5,1; 	/* if even then go back and spin */ \
> 	beq	1f; \
> 	std	r3,SAVE_R3(r1); \
> 	li	3,0xE4;		/* give up the cycles H_CONFER */ \
> 	lhz	4,0x18(REG);	/* processor number */ \
> 	HVSC; \
> 	ld	r3,SAVE_R3(r1); \
> 1:	ld	r4,SAVE_R4(r1); \
> 	ld	r5,SAVE_R5(r1); \
> 	addi	r1,r1,STACKFRAMESIZE; \
> 	blr

What magic results in this ending up at the end of each function?

When Peter & I were just looking at this, he pointed out that lwz
r5,0x2580(0) may not quite have the intended results :)

Also, where in this are cr0, cr1, and xer marked as clobbered?  They are
all volatile over the hcall.


** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/
