spinlocks
Anton Blanchard
anton at samba.org
Sun Jan 11 12:52:30 EST 2004
> If we uninline them, the advantage of leaf function optimizations is
> lost -- it seems like that would be a pretty big hit, right? We don't
> have any good data, but it may well be about a wash vs. the 1/2 cache
> line of extra instructions introduced for shared processors.
We can execute a large number of instructions in the time it takes to
satisfy one cache miss from memory. Half a cache line is an awfully
large thing to inline for something as common as a spinlock.
The only data we have so far is from Joel, and his results show a small
but noticeable improvement from removing 2 spinlock instructions in our
fast path.
> Isn't this going to result in shared processor locks always stacking the
> "mini-frame"? That is a pretty big hit for what is likely to be a very
> common customer configuration.
Perhaps. I don't see why it would be such a big hit, considering phyp
will often end up swapping contexts in that code path. I'm guessing that
will take a long time to complete.
> What magic results in this ending up at the end of each function?
There is only 1 copy of it in the kernel.
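To make the "only 1 copy" point concrete, here is a minimal sketch in
portable C11, not the actual ppc64 code. The fast path is a small inline
compare-and-swap and everything else lives in a single out-of-line
function. All names (sketch_spinlock_t, sketch_spin_lock_slow,
slow_path_calls) are made up for illustration, and the comment marking
where a shared-processor kernel might yield to the hypervisor is an
assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: 0 = free, 1 = held. */
typedef struct { atomic_int locked; } sketch_spinlock_t;
#define SKETCH_SPINLOCK_INIT { 0 }

/* Counts entries into the slow path; only here so the sketch can be checked. */
static unsigned long slow_path_calls;

/* Contended path: one copy in the whole program, never inlined. */
__attribute__((noinline))
void sketch_spin_lock_slow(sketch_spinlock_t *l)
{
    slow_path_calls++;
    for (;;) {
        /* Spin with plain loads until the lock looks free. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ; /* assumption: a shared-processor kernel would yield here */

        int expected = 0;
        if (atomic_compare_exchange_weak_explicit(&l->locked, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return;
    }
}

/* Single attempt to take the lock; returns true on success. */
static inline bool sketch_spin_trylock(sketch_spinlock_t *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(&l->locked, &expected, 1,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

/* Fast path: one inline attempt, then fall back to the shared slow path. */
static inline void sketch_spin_lock(sketch_spinlock_t *l)
{
    if (!sketch_spin_trylock(l))
        sketch_spin_lock_slow(l);
}

static inline void sketch_spin_unlock(sketch_spinlock_t *l)
{
    /* Debug check: unlocking a lock that is not held is a caller bug. */
    assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

Each call site only pays for the inline compare-and-swap when the lock
is free. The contended loop is emitted once, however many call sites
there are.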
> When Peter & I were just looking at this, he pointed out that lwz
> r5,0x2580(0) may not quite have the intended results :)
Thanks, it needs some work still :)
> Also, where in this are cr0, cr1, and xer marked as clobbered? They are
> all volatile over the hcall.
We'll have to add them.
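For reference, clobbers go in the fourth section of a GCC extended asm
statement. A rough sketch, assuming GCC-style inline asm: the
addic./mcrf pair is only a stand-in that happens to modify cr0, cr1 and
xer, it is not the real hcall sequence, and add_one_clobbering is a
hypothetical name. On non-PowerPC targets the sketch falls back to plain
C so it still builds.

```c
/* Returns x + 1. On PowerPC this goes through inline asm that modifies
 * cr0, cr1 and xer, so all three are listed as clobbers. */
static inline long add_one_clobbering(long x)
{
    long out;
#if defined(__powerpc__)
    __asm__ __volatile__(
        "addic. %0,%1,1\n\t"    /* record form: sets cr0 and the carry bit in xer */
        "mcrf 1,0"              /* copies cr0 into cr1 */
        : "=r" (out)            /* outputs */
        : "r" (x)               /* inputs */
        : "cr0", "cr1", "xer"); /* clobbers: tell the compiler these change */
#else
    out = x + 1;                /* fallback so the sketch builds elsewhere */
#endif
    return out;
}
```

Without the clobber list the compiler is free to keep a live condition
result in cr0 or cr1 across the asm statement.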
Anton