[PATCH v2 2/3] powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision

Michael Ellerman mpe at ellerman.id.au
Tue Sep 26 15:34:36 AEST 2017


Cyril Bur <cyrilbur at gmail.com> writes:

> On Sun, 2017-09-24 at 05:18 +0800, Simon Guo wrote:
>> Hi Cyril,
>> On Sat, Sep 23, 2017 at 12:06:48AM +1000, Cyril Bur wrote:
>> > On Thu, 2017-09-21 at 07:34 +0800, wei.guo.simon at gmail.com wrote:
>> > > From: Simon Guo <wei.guo.simon at gmail.com>
>> > > 
>> > > This patch adds VMX primitives to do memcmp() when the compare size
>> > > exceeds 4K bytes.
>> > 
>> > Sorry I didn't see this sooner. I've actually been working on a kernel
>> > version of glibc commit dec4a7105e (powerpc: Improve memcmp performance
>> > for POWER8). Unfortunately I've been distracted and it still isn't done.
>> 
>> Thanks for syncing with me. Let's consolidate our efforts :)
>> 
>> I had a quick look at glibc commit dec4a7105e.
>> It looks like the aligned-case comparison with VSX is launched without any
>> rN size limit, which means it will incur a VSX register load penalty even
>> when the length is 9 bytes.
>> 
>
> This was written for userspace which doesn't have to explicitly enable
> VMX in order to use it - we need to be smarter in the kernel.

Well, the kernel has to enable it for them after a trap, which is actually
even more expensive. So arguably the glibc code should be smarter too,
and its threshold before using VMX should probably be higher than the
kernel's (to cover the cost of the trap).
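
To make the threshold idea concrete, here is a minimal sketch in plain C,
not actual kernel or glibc code. The 4096 figure is taken from the patch
description ("exceeds 4K bytes"); USER_VMX_THRESHOLD, choose_path(),
cmp_vector_stub() and sketch_memcmp() are hypothetical names, and the
user-space value is an assumption picked only to show "higher than the
kernel", since no number is given in this thread. The vector path is a
stand-in that just calls memcmp().

```c
#include <stddef.h>
#include <string.h>

/* From the patch description: VMX is used when the size exceeds 4K bytes. */
#define KERNEL_VMX_THRESHOLD 4096u
/* Hypothetical value: assumed to be higher to amortize the first-use trap. */
#define USER_VMX_THRESHOLD 8192u

enum cmp_path { PATH_SCALAR, PATH_VECTOR };

/* Pick the vector path only when the length exceeds the given threshold. */
static enum cmp_path choose_path(size_t n, size_t threshold)
{
    return n > threshold ? PATH_VECTOR : PATH_SCALAR;
}

/* Byte-by-byte comparison; no vector state needs to be enabled. */
static int cmp_scalar(const unsigned char *a, const unsigned char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}

/* Stand-in for a VMX loop; a real one would use vector loads and compares. */
static int cmp_vector_stub(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n);
}

/* Dispatch on length so short compares never pay the vector setup cost. */
static int sketch_memcmp(const void *a, const void *b, size_t n,
                         size_t threshold)
{
    if (choose_path(n, threshold) == PATH_VECTOR)
        return cmp_vector_stub(a, b, n);
    return cmp_scalar(a, b, n);
}
```

With this shape, the 9-byte case mentioned above stays on the scalar path
under either threshold, and only the cutoff differs between the kernel and
user-space variants.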

But I digress :)

cheers

