[SLOF PATCH 1/2] fbuffer: Improve invert-region helper
Segher Boessenkool
segher at kernel.crashing.org
Wed Jul 29 03:04:16 AEST 2015
On Tue, Jul 28, 2015 at 12:19:54PM +0200, Thomas Huth wrote:
> : invert-region ( addr len -- )
> - 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP drop
> -;
> -
> -: invert-region-x ( addr len -- )
> - /x / 0 ?DO dup dup rx@ -1 xor swap rx! xa1+ LOOP drop
> + 2dup or 7 and CASE
> + 0 OF 3 rshift 0 ?DO dup dup rx@ -1 xor swap rx! xa1+ LOOP ENDOF
> + 2 OF 1 rshift 0 ?DO dup dup rw@ -1 xor swap rw! wa1+ LOOP ENDOF
> + 4 OF 2 rshift 0 ?DO dup dup rl@ -1 xor swap rl! la1+ LOOP ENDOF
> + 6 OF 1 rshift 0 ?DO dup dup rw@ -1 xor swap rw! wa1+ LOOP ENDOF
> + dup OF 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP ENDOF
> + ENDCASE
> + drop
> ;
Can you access device memory as 64 bits for all supported devices?
You can get a bigger speedup by writing some of the core blitting
functions in C, btw.
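A rough sketch of what such a C helper could look like, mirroring the alignment dispatch in the patch above. The name invert_region_c is hypothetical and nothing here uses an actual SLOF C interface. It assumes plain volatile loads and stores are acceptable for the target memory, which may not hold for every device.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from SLOF: invert len bytes at addr,
 * using the widest access size that both addr and len are aligned to.
 * Assumes plain volatile accesses are valid for the target memory.
 */
void invert_region_c(void *addr, size_t len)
{
    size_t i;

    switch (((uintptr_t)addr | len) & 7) {
    case 0: {                       /* 8-byte aligned: 64-bit accesses */
        volatile uint64_t *p = addr;
        for (i = 0; i < len / 8; i++)
            p[i] = ~p[i];
        break;
    }
    case 4: {                       /* 4-byte aligned: 32-bit accesses */
        volatile uint32_t *p = addr;
        for (i = 0; i < len / 4; i++)
            p[i] = ~p[i];
        break;
    }
    case 2:
    case 6: {                       /* 2-byte aligned: 16-bit accesses */
        volatile uint16_t *p = addr;
        for (i = 0; i < len / 2; i++)
            p[i] = (uint16_t)~p[i];
        break;
    }
    default: {                      /* odd address or length: bytes */
        volatile uint8_t *p = addr;
        for (i = 0; i < len; i++)
            p[i] = (uint8_t)~p[i];
        break;
    }
    }
}
```

Like the Forth version, this falls back to byte accesses for the whole region as soon as either the address or the length is odd.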
A small simplification:
  2dup or 7 and CASE
    0 OF 3 rshift 0 ?DO dup dup rx@ -1 xor swap rx! xa1+ LOOP ENDOF
    4 OF 2 rshift 0 ?DO dup dup rl@ -1 xor swap rl! la1+ LOOP ENDOF
    3 and
    2 OF 1 rshift 0 ?DO dup dup rw@ -1 xor swap rw! wa1+ LOOP ENDOF
    dup OF 0 ?DO dup dup rb@ -1 xor swap rb! 1+ LOOP ENDOF
  ENDCASE
If this code is often called with an unaligned address or length, it
probably makes more sense to special-case the beginning and the end.
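One way to read that suggestion, sketched in C under the same assumptions as any plain-memory helper: invert single bytes until the address is 8-byte aligned, run the bulk in 64-bit accesses, then finish the remainder in bytes. The name invert_region_split is hypothetical, and whether 64-bit accesses are valid for a given device is still an open question.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from SLOF: handle an unaligned head and
 * tail with byte accesses and the aligned middle with 64-bit accesses.
 * Assumes plain volatile accesses are valid for the target memory.
 */
void invert_region_split(void *addr, size_t len)
{
    volatile uint8_t *p = addr;
    volatile uint64_t *q;

    /* Head: bytes until p is 8-byte aligned or the region is exhausted. */
    while (len > 0 && ((uintptr_t)p & 7) != 0) {
        *p = (uint8_t)~*p;
        p++;
        len--;
    }

    /* Middle: whole 64-bit words. */
    q = (volatile uint64_t *)p;
    while (len >= 8) {
        *q = ~*q;
        q++;
        len -= 8;
    }

    /* Tail: whatever is left, at most 7 bytes. */
    p = (volatile uint8_t *)q;
    while (len > 0) {
        *p = (uint8_t)~*p;
        p++;
        len--;
    }
}
```

With this split, an odd address or length only costs up to 7 byte accesses at each end rather than forcing the whole region down to bytes.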
Segher