Copy prefetch optimizations and non-coherent caches
Paul Mackerras
paulus at samba.org
Sat Sep 6 23:20:11 EST 2003
Eugene Surovegin writes:
> There are read prefetch optimizations in several PPC-specific functions
> responsible for copying memory (copy_page, __copy_tofrom_user). Current
> implementations will try to prefetch up to 4 (MAX_COPY_PREFETCH) cache
> lines _after_ the end of the source buffer.
>
> Unfortunately, it's not a good idea on non-coherent cache CPUs. This
> prefetching may establish cache lines for memory ranges that require
> exactly the opposite (e.g. DMA read buffer).
You are right.
> I think we should disable prefetch if CONFIG_NONCOHERENT_CACHE is defined.
> Other more complex solutions are possible, e.g. we can still prefetch our
> own buffer but don't touch anything outside (I'll try to do some
> performance testing to determine whether it's worth the effort :).
The measurements I did on a ppc64 kernel indicated that most
copy_tofrom_user calls were either for relatively small buffers
(i.e. less than 256 bytes) or were page-sized and page-aligned.
Therefore I wrote two routines: one optimized for small copies, which
doesn't use any prefetching or dcbz instructions, and one optimized for
page-sized copies.
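
For illustration only, the split described above could be sketched in C
roughly as follows. This is not the ppc64 code: the names pick_copy_path,
SMALL_COPY_MAX and ASSUMED_PAGE_SIZE are hypothetical, the 4096-byte page
size is an assumption, and the COPY_GENERIC fallback for sizes that fit
neither case is also an assumption, since the message does not say how
those are handled.

```c
#include <stddef.h>
#include <stdint.h>

#define SMALL_COPY_MAX    256u   /* "small" threshold taken from the text */
#define ASSUMED_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Which copy routine a given request would be routed to. */
enum copy_path {
    COPY_SMALL,    /* no prefetching, no dcbz */
    COPY_PAGE,     /* exactly one page, page-aligned: prefetching variant */
    COPY_GENERIC   /* assumed fallback for everything else */
};

/* Classify a copy by size and alignment only; no memory is touched. */
static enum copy_path pick_copy_path(const void *dst, const void *src, size_t n)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    if (n < SMALL_COPY_MAX)
        return COPY_SMALL;

    if (n == ASSUMED_PAGE_SIZE && ((d | s) % ASSUMED_PAGE_SIZE) == 0)
        return COPY_PAGE;

    return COPY_GENERIC;
}
```

Only the page path would ever prefetch, and there the end of the source
buffer is known in advance.
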
We could do something similar on ppc32 - we could do the small copy
case with no prefetching (or maybe prefetch only the first cache
line), plus a page-copy case that does prefetching. If you know
you are copying exactly one page, it shouldn't be hard to set up the
prefetching so you don't prefetch past the end of the source buffer.
In fact it should be possible to code up a relatively simple optimized
copy loop that avoids prefetching outside the source region if we just
assume that the source and destination addresses are
cacheline-aligned, and the size is a multiple of the cacheline size
and is at least 8 (say) cache lines.
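
A minimal C sketch of such a loop, under the assumptions just listed
(cacheline-aligned source and destination, length a whole number of
cache lines). It is illustrative only: the real routines are assembly,
and here the GCC/Clang builtin __builtin_prefetch stands in for whatever
cache-touch instruction they would use. CACHE_LINE_SIZE = 32,
PREFETCH_AHEAD = 4 and the name copy_lines_bounded are assumptions. The
function returns the address of the last line it prefetched purely so
the bound can be checked.

```c
#include <stddef.h>
#include <string.h>

#define CACHE_LINE_SIZE 32  /* assumed; the real value depends on the CPU */
#define PREFETCH_AHEAD  4   /* lines to run ahead of the copy */

/*
 * Copy 'lines' cache lines from src to dst without ever prefetching
 * past the last source line. Returns the address of the furthest line
 * prefetched, or NULL if the buffer was too short to prefetch at all.
 */
static const void *copy_lines_bounded(void *dst, const void *src, size_t lines)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    const unsigned char *furthest = NULL;
    size_t i;

    if (lines < PREFETCH_AHEAD) {
        /* Too short to be worth prefetching: plain copy. */
        memcpy(d, s, lines * CACHE_LINE_SIZE);
        return NULL;
    }

    /* Prime: touch the first PREFETCH_AHEAD lines, all inside the buffer. */
    for (i = 0; i < PREFETCH_AHEAD; i++) {
        furthest = s + i * CACHE_LINE_SIZE;
        __builtin_prefetch(furthest);
    }

    /* Main loop: copy line i while prefetching line i + PREFETCH_AHEAD. */
    for (i = 0; i + PREFETCH_AHEAD < lines; i++) {
        furthest = s + (i + PREFETCH_AHEAD) * CACHE_LINE_SIZE;
        __builtin_prefetch(furthest);
        memcpy(d + i * CACHE_LINE_SIZE, s + i * CACHE_LINE_SIZE,
               CACHE_LINE_SIZE);
    }

    /* Tail: the last PREFETCH_AHEAD lines are copied with no prefetch. */
    for (; i < lines; i++)
        memcpy(d + i * CACHE_LINE_SIZE, s + i * CACHE_LINE_SIZE,
               CACHE_LINE_SIZE);

    return furthest;
}
```

Splitting the loop into a prefetching body and a non-prefetching tail
keeps the hot loop free of a per-iteration bounds test, which is why a
minimum length of several cache lines is convenient.
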
Paul.
** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/