[patch 05/18] PS3: Fix sparse warnings
Geoff Levand geoffrey.levand at am.sony.com
Fri Jun  8 00:34:21 EST 2007

Arnd Bergmann wrote:
> On Wednesday 06 June 2007, Geoff Levand wrote:
>> -	spu->local_store = ioremap(spu->local_store_phys, LS_SIZE);
>> +	spu->local_store = (__force void *)ioremap(spu->local_store_phys,
>> +						   LS_SIZE);
> 
> I haven't noticed this before, but it seems to be a preexisting bug:
> You map the local_store with the guarded page table bit set, which
> causes a performance degradation when accessing the memory from kernel
> space.
> 
> If you're lucky, your hypervisor knows this and will fix it up for
> you, but I would replace the ioremap call with an
> ioremap_flags(..., _PAGE_NO_CACHE); to be on the safe side.
> 
> If you want to measure the impact, I'd suggest timing a user space
> read() on the mem file of a running SPU context.
Hi Arnd,
I asked Noguchi-san to check the performance and below is his
report and test program.  I'll add the change into my patch set.
-Geoff
-------- Original Message --------
Subject: RE: [patch 05/18] PS3: Fix sparse warnings
Date: Thu, 7 Jun 2007 05:39:43 -0700
From: Noguchi, Masato <Masato.Noguchi at jp.sony.com>
To: Levand, Geoff <Geoffrey.Levand at am.sony.com>
 << Time to read the whole LS with the read system call >>
not patched: avg. 21053.7800 ticks ( 263.831830 microseconds )
patched:     avg. 20809.2412 ticks ( 260.767434 microseconds )
about 1% faster. 
I think it's a valid difference. (not a measurement error.)
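
The figures above can be cross-checked with a little arithmetic. The sketch below is not part of the original report; the helper names `ticks_per_us` and `speedup_percent` are made up for illustration. Dividing the reported ticks by the reported microseconds gives roughly 79.8 ticks per microsecond for both rows, and the relative difference comes out near 1.16%, consistent with "about 1% faster".

```c
#include <assert.h>

/* Ticks per microsecond implied by one (ticks, microseconds) pair. */
static double ticks_per_us(double ticks, double us)
{
	assert(us > 0.0);
	return ticks / us;
}

/* Relative improvement of `after` over `before`, in percent. */
static double speedup_percent(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}
```

For example, `speedup_percent(21053.7800, 20809.2412)` uses the two reported averages, and `ticks_per_us(21053.7800, 263.831830)` recovers the tick rate the report appears to have used for its conversion.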
FYI, 
The attached file is source code to measure it.
I ran it 10000 times and calculated the average.
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <signal.h>
#include <string.h>
#include <stdint.h>
#include <stdlib.h>
#include <pthread.h>
#define __NR_spe_run 278
#define __NR_spe_create 279
#define LS_SIZE 0x40000
#define SPENODE "/spu/stoplooptest"
#define MFTB(RA) __asm__ volatile("mftb %0":"=r"(RA))
long long do_test(void)
{
	int spefd = -1, lsfd = -1;
	int npc, status;
	long long ret = -1;
	char buf[LS_SIZE];
	int n;
	uint32_t t1, t2;
	/* create context */
	spefd = syscall(__NR_spe_create, SPENODE, 0,
			S_IRUSR | S_IWUSR | S_IXUSR);
	if (spefd < 0) goto out;
	/* run once to assign physical spe */
	npc = 0;
	syscall(__NR_spe_run, spefd, &npc, &status);
	/* get /mem file descriptor */
	lsfd = open(SPENODE "/mem", O_RDWR,
		    S_IRUSR | S_IWUSR);
	if (lsfd < 0) goto out;
	/* read mem */
	MFTB(t1);
	if (read(lsfd, buf, LS_SIZE) != LS_SIZE) {
		goto out;
	}
	MFTB(t2);
	ret = t2 - t1;
 out:
	if ( lsfd >= 0 ) close(lsfd);
	if ( spefd >= 0 ) close(spefd);
	return ret;
}
int main(int argc, char *argv[])
{
	long long r;
	r = do_test();
	printf("%lld\n", r);
	return 0;
}
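
The program above prints a single sample per invocation, and the report does not say how the 10000 runs were averaged. One possible way to do it in-process is sketched below, as an assumption rather than the method actually used. `average_ticks` and `fake_sample` are hypothetical names; `fake_sample` is a deterministic stand-in so the sketch can be built without spufs, and on real hardware `do_test` would be passed instead. Negative samples are skipped because `do_test()` returns -1 on failure.

```c
#include <stddef.h>

typedef long long (*sample_fn)(void);

/*
 * Call `sample` `runs` times and average the non-negative results.
 * Negative results are treated as failed runs and skipped.
 * Returns -1.0 if no run succeeded.
 */
static double average_ticks(sample_fn sample, size_t runs)
{
	long long sum = 0;
	size_t ok = 0;

	for (size_t i = 0; i < runs; i++) {
		long long t = sample();
		if (t < 0)
			continue;	/* failed run, do not count it */
		sum += t;
		ok++;
	}
	return ok ? (double)sum / (double)ok : -1.0;
}

/* Hypothetical stand-in for do_test(): every third call "fails". */
static long long fake_sample(void)
{
	static int n;

	n++;
	return (n % 3 == 0) ? -1 : 100 + n;
}
```

Calling `average_ticks(do_test, 10000)` from `main()` would then replace the single `do_test()` call, with the result printed as a floating-point value.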
    
    