[Cbe-oss-dev] [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell
Gunnar von Boehn
VONBOEHN at de.ibm.com
Fri Jun 20 21:36:45 EST 2008
Hi Sanjay,
> I suppose it would still function correctly via the handler, but horribly
> slowly.
How important is peak performance for unaligned copies to/from
uncacheable memory?
The challenge on the Cell chip is that the X-form shift instructions
(the ones that take the shift amount from a register) are microcoded.
The shifts are needed to implement a copy whose loads and stores are
always aligned.
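
As a rough illustration of the shift-based copy described above, here is a
minimal sketch in C. It is not the kernel routine under discussion: the names
merge and shift_copy are hypothetical, the source is modelled as an array of
aligned 64-bit words plus a byte offset, and the byte-order check assumes the
GCC/Clang predefined macros. The shift amount is a run-time value, which a
compiler would typically turn into register-amount shifts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Combine two adjacent aligned words into the 8 bytes that start
 * `off` bytes into `a` (1 <= off <= 7). The shift direction depends
 * on byte order; PowerPC as used on Cell is big-endian. */
static uint64_t merge(uint64_t a, uint64_t b, unsigned off)
{
    unsigned s = off * 8;
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return (a << s) | (b >> (64 - s));
#else
    return (a >> s) | (b << (64 - s));
#endif
}

/* Copy nwords 8-byte words to dst from the byte stream that starts
 * `off` bytes into src[0]. Every load and store is word-aligned.
 * When off != 0 this reads src[0] .. src[nwords], one word extra. */
static void shift_copy(uint64_t *dst, const uint64_t *src,
                       unsigned off, size_t nwords)
{
    assert(off < 8);
    if (off == 0) {
        memcpy(dst, src, nwords * sizeof(uint64_t));
        return;
    }
    uint64_t prev = src[0];
    for (size_t i = 0; i < nwords; i++) {
        uint64_t next = src[i + 1];
        dst[i] = merge(prev, next, off); /* variable shift amount */
        prev = next;
    }
}
```

Each output word costs two shifts and an OR, so the cost of the shift
instruction itself sits directly in the inner loop.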
There is of course the option of avoiding the X-form shifts by writing
several copy routines that use immediate shift instructions, one per
misalignment, and picking the matching routine at run time.
This option would of course greatly increase the code size of the memcopy
routine.
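
To make the code-size trade-off concrete, here is a hedged sketch in C of
what "one routine per shift amount" could look like. It is only an
illustration, not proposed kernel code: DEFINE_COPY, copy_offN and
copy_dispatch are hypothetical names, the source is modelled as aligned
64-bit words plus a byte offset, and the byte-order check assumes the
GCC/Clang predefined macros. Because each shift count is a compile-time
constant, a compiler can use immediate-form shifts in every loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Merge two adjacent aligned words; `s` is a constant bit count. */
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define MERGE(a, b, s) (((a) << (s)) | ((b) >> (64 - (s))))
#else
#define MERGE(a, b, s) (((a) >> (s)) | ((b) << (64 - (s))))
#endif

/* Generate one copy loop per byte offset, each with constant shifts.
 * Each loop reads src[0] .. src[n], one word more than it writes. */
#define DEFINE_COPY(OFF)                                          \
static void copy_off##OFF(uint64_t *dst, const uint64_t *src,     \
                          size_t n)                               \
{                                                                 \
    uint64_t prev = src[0];                                       \
    for (size_t i = 0; i < n; i++) {                              \
        uint64_t next = src[i + 1];                               \
        dst[i] = MERGE(prev, next, OFF * 8);                      \
        prev = next;                                              \
    }                                                             \
}

DEFINE_COPY(1) DEFINE_COPY(2) DEFINE_COPY(3) DEFINE_COPY(4)
DEFINE_COPY(5) DEFINE_COPY(6) DEFINE_COPY(7)

/* Pick the loop that matches the source misalignment (0..7 bytes). */
static void copy_dispatch(uint64_t *dst, const uint64_t *src,
                          unsigned off, size_t n)
{
    switch (off) {
    case 0: for (size_t i = 0; i < n; i++) dst[i] = src[i]; break;
    case 1: copy_off1(dst, src, n); break;
    case 2: copy_off2(dst, src, n); break;
    case 3: copy_off3(dst, src, n); break;
    case 4: copy_off4(dst, src, n); break;
    case 5: copy_off5(dst, src, n); break;
    case 6: copy_off6(dst, src, n); break;
    case 7: copy_off7(dst, src, n); break;
    default: assert(!"offset must be 0..7");
    }
}
```

The seven near-identical loops are exactly the code-size cost mentioned
above: the inner loop is duplicated once per possible misalignment.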
Kind regards
Gunnar

From: Sanjay Patel <sanjay3000 at yahoo.com>
Date: 19/06/2008 18:13
To: Arnd Bergmann <arnd at arndb.de>,
    Gunnar von Boehn/Germany/Contr/IBM at IBMDE
cc: Mark Nelson <markn at au1.ibm.com>, linuxppc-dev at ozlabs.org,
    Michael Ellerman <ellerman at au1.ibm.com>, cbe-oss-dev at ozlabs.org
Please respond to: sanjay3000 at yahoo.com
Subject: Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

--- On Thu, 6/19/08, Gunnar von Boehn <VONBOEHN at de.ibm.com> wrote:
> You are right: the main copy2user requires that the SRC is
> cacheable.
> IMHO, because of the exception on load, the routine should
> fall back to the
> byte copy loop.
>
> Arnd, could you verify that it works on localstore?
Since the main loops use 'dcbz', the destination must also be cacheable.
IIRC, if the destination is write-through or cache-inhibited, the 'dcbz'
will cause an alignment exception. I suppose it would still function
correctly via the handler, but horribly slowly.
--Sanjay