[PATCH v1 1/3] powerpc: Align bytes before fall back to .Lshort in powerpc memcmp

David Laight David.Laight at ACULAB.COM
Wed Sep 20 20:05:49 AEST 2017


From: Simon Guo
> Sent: 20 September 2017 10:57
> On Tue, Sep 19, 2017 at 10:12:50AM +0000, David Laight wrote:
> > From: wei.guo.simon at gmail.com
> > > Sent: 19 September 2017 11:04
> > > Currently memcmp() in powerpc will fall back to .Lshort (compare per byte
> > > mode) if either src or dst address is not 8 bytes aligned. It can be
> > > opmitized if both addresses are with the same offset with 8 bytes boundary.
> > >
> > > memcmp() can align the src/dst address with 8 bytes firstly and then
> > > compare with .Llong mode.
> >
> > Why not mask both addresses with ~7 and mask/shift the read value to ignore
> > the unwanted high (BE) or low (LE) bits.
> >
> > The same can be done at the end of the compare with any final, partial word.
> 
> Yes. That will be better. A prototyping shows ~5% improvement on 32 bytes
> size comparison with v1. I will rework on v2.

Clearly you have to be careful to return the correct +1/-1 on mismatch.

For systems that can do misaligned transfers you can compare the first
word, then compare aligned words and finally the last word.
Rather like a memcpy() function I wrote (for NetBDSD) that copied
the last word first, then a whole number of words aligned at the start.
(Hope no one expected anything special for overlapping copies.)

	David



More information about the Linuxppc-dev mailing list