[linux-fbdev] Re: readl() and friends and eieio on PPC
Paul Mackerras
paulus at cs.anu.edu.au
Wed Aug 11 10:23:46 EST 1999
Jes Sorensen <Jes.Sorensen at cern.ch> wrote:
> This is quite easily solved by putting in mb()'s in the right
> places. This is how it is done for other drivers that are supposed to
> work on the Alpha.
No, this is not an acceptable solution.

On UltraSPARC at least, there is a "side-effect" bit in each PTE. If
that bit is set, it tells the CPU not to reorder accesses to that
page. I don't know whether Alpha has the same facility; do you?

Anyway, it's hard enough educating device driver writers about the
need for byte-swapping on data in memory that is accessed by DMA.
Trying to get people to scatter mb()'s around their drivers would be a
Herculean task (a bit like cleaning out the Augean stables, actually
:-).

Finally, mb() is actually a much stronger constraint than we need in a
device driver, and will slow things down unnecessarily. mb() implies
a strong ordering on all loads and stores to all memory. On the PPC,
mb() translates into the sync instruction, which is much slower than
eieio. For a sync, the CPU actually has to stop and wait for all bus
activity to complete, whereas for an eieio, it just puts a special
kind of entry in the stream of accesses going out to the memory bus.
> Having mb()'s explicitly put into the driver in the right places also
> makes sure that a driver will work on other architectures. Right now a
> driver that is written for the PPC is likely not to work on the Alpha
> if the author expects readl/writel to guarantee write ordering.
Well, if Alpha is actually like that, then IMO it is broken.

I did some experiments this morning to test whether having eieio in
readl/writel is actually going to slow you down. The bottom line is
that the eieio introduces *no* measurable reduction in performance. I
used the little program that I have appended below (mtest.c and
mtm.S).

I ran it on my 7600 like this:

    mtest 94000000 b420 e1480 200 400 2304 100
    mtestn 94000000 b420 e1480 200 400 2304 100

This was with the screen at 1152x870, 16bpp. mtestn is just a symlink
to mtest. The results for 10 runs were:

    with eieio:     mean 2.825s, s.d. 0.007s
    without eieio:  mean 2.824s, s.d. 0.027s

I also tried it on my iMac (81000000 a000 b8350 200 400 2048 100) and
got 4.76s both with and without eieio.
So, unless and until you can show me some numbers that show an actual
performance degradation from having the eieio in readl/writel, the
eieio stays.
Paul.
mtest.c:
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

extern void move_eieio(int *src, int *dst, int nx, int ny, int pitch);
extern void move_no_eieio(int *src, int *dst, int nx, int ny, int pitch);

int main(int ac, char **av)
{
	int fd;
	unsigned long base, sof, dof;
	int nx, ny, pitch;
	char *ptr;
	int nrpt;
	int use_eieio;

	if (ac < 7) {
		fprintf(stderr, "Usage: %s base sof dof nx ny pitch [nrpt]\n", av[0]);
		exit(1);
	}
	/* base, sof and dof are hex; nx, ny, pitch and nrpt are decimal */
	base = strtoul(av[1], 0, 16);
	sof = strtoul(av[2], 0, 16);
	dof = strtoul(av[3], 0, 16);
	nx = atoi(av[4]);
	ny = atoi(av[5]);
	pitch = atoi(av[6]);
	nrpt = (ac > 7)? atoi(av[7]): 1;

	if ((fd = open("/dev/mem", O_RDWR)) < 0) {
		perror("/dev/mem");
		exit(1);
	}

	/* an 'n' in the program name (e.g. the mtestn symlink) selects the loop without eieio */
	use_eieio = strchr(av[0], 'n') == 0;
	printf("%seieio\n", use_eieio? "": "no ");

	/* map 2MB of the frame buffer starting at physical address base */
	ptr = mmap(0, 0x200000, PROT_READ|PROT_WRITE, MAP_SHARED, fd, base);
	if (ptr == MAP_FAILED) {
		perror("mmap");
		exit(1);
	}

	/* copy the rectangle nrpt times, shifting the destination by one word each time */
	if (use_eieio) {
		do {
			move_eieio((int *)(ptr + sof), (int *)(ptr + dof),
				   nx, ny, pitch);
			dof += 4;
		} while (--nrpt > 0);
	} else {
		do {
			move_no_eieio((int *)(ptr + sof), (int *)(ptr + dof),
				      nx, ny, pitch);
			dof += 4;
		} while (--nrpt > 0);
	}
	exit(0);
}
mtm.S:
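/* move_eieio(int *src, int *dst, int nx, int ny, int pitch) */
	.globl	move_eieio
move_eieio:
	mtctr	5		/* ctr = nx, words per row */
	li	8,0		/* byte offset within the row */
2:	lwbrx	0,3,8		/* byte-reversed load from src + offset */
	eieio
	stwbrx	0,4,8		/* byte-reversed store to dst + offset */
	eieio
	addi	8,8,4
	bdnz	2b		/* loop until the row is done */
	addic.	6,6,-1		/* one fewer row to go */
	blelr			/* return when no rows remain */
	add	3,3,7		/* advance src and dst by pitch bytes */
	add	4,4,7
	/* As posted, this branches to move_no_eieio, so rows after the */
	/* first run without eieio; "b move_eieio" would keep the eieio */
	/* on every row. */
	b	move_no_eieio

/* move_no_eieio(int *src, int *dst, int nx, int ny, int pitch) */
	.globl	move_no_eieio
move_no_eieio:
	mtctr	5		/* ctr = nx, words per row */
	li	8,0		/* byte offset within the row */
2:	lwbrx	0,3,8		/* byte-reversed load from src + offset */
	stwbrx	0,4,8		/* byte-reversed store to dst + offset */
	addi	8,8,4
	bdnz	2b		/* loop until the row is done */
	addic.	6,6,-1		/* one fewer row to go */
	blelr			/* return when no rows remain */
	add	3,3,7		/* advance src and dst by pitch bytes */
	add	4,4,7
	b	move_no_eieio	/* next row */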