[PATCH v2 2/3] powerpc/64: enhance memcmp() with VMX instruction for long bytes comparision

Simon Guo wei.guo.simon at gmail.com
Thu Sep 28 04:33:46 AEST 2017


On Wed, Sep 27, 2017 at 09:43:44AM +0000, David Laight wrote:
> From: Segher Boessenkool
> > Sent: 27 September 2017 10:28
> ...
> > You also need nasty code to deal with the start and end of strings, with
> > conditional branches and whatnot, which quickly overwhelms the benefit
> > of using vector registers at all.  This tradeoff also changes with newer
> > ISA versions.
> 
> The goal posts keep moving.
> For instance with modern intel x86 cpus 'rep movsb' is by far the fastest
> way to copy data (from cached memory).
> 
> > Things have to become *really* cheap before it will be good to often use
> > vector registers in the kernel though.
> 
> I've had thoughts about this in the past.
> If the vector registers belong to the current process then you might
> get away with just saving the ones you want to use.
> If they belong to a different process then you also need to tell the
> FPU save code where you've saved the registers.
> Then the IPI code can recover all the correct values.
> 
> On X86 all the AVX registers are caller saved, the system call
> entry could issue the instruction that invalidates them all.
> Kernel code running in the context of a user process could then
> use the registers without saving them.
> It would only need to set a mark to ensure they are invalidated
> again on return to user (might be cheap enough to do anyway).
> Dunno about PPC though.

I am not aware of any ppc instruction that can set such a "mark", or that
provides a fine-grained validity flag for a single vec reg or a subgroup
of vec regs.
But ppc experts may want to correct me.
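
To make the x86 idea quoted above a bit more concrete, here is a minimal
user-space sketch of the bookkeeping only. It models the registers as plain
memory and uses no real x86 or ppc instruction. All names (vec_state,
syscall_entry, kernel_use_vec, return_to_user) and the register count and
width are hypothetical, chosen for illustration, and are not taken from any
kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_VEC_REGS   16   /* hypothetical register count */
#define VEC_REG_BYTES 32   /* hypothetical register width */

/* Modelled per-task vector state (assumption: not a real kernel struct). */
struct vec_state {
	unsigned char regs[NR_VEC_REGS][VEC_REG_BYTES];
	bool valid;        /* do regs still hold live user values? */
	bool kernel_used;  /* the "mark": kernel code clobbered the regs */
};

/*
 * Syscall entry: the registers are caller saved, so user space cannot
 * expect them to survive the call. Treat them as invalid from here on.
 */
static void syscall_entry(struct vec_state *vs)
{
	vs->valid = false;
	vs->kernel_used = false;
}

/* Kernel code borrows a register without saving it first, and sets the mark. */
static void kernel_use_vec(struct vec_state *vs, int reg, unsigned char fill)
{
	assert(reg >= 0 && reg < NR_VEC_REGS);
	memset(vs->regs[reg], fill, VEC_REG_BYTES);
	vs->kernel_used = true;
}

/*
 * Return to user: if the mark is set, invalidate (here: zero) everything
 * again so that no kernel data is visible to user space.
 */
static void return_to_user(struct vec_state *vs)
{
	if (vs->kernel_used) {
		memset(vs->regs, 0, sizeof(vs->regs));
		vs->kernel_used = false;
	}
}
```

The point of the sketch is that the only extra state is one flag per task;
whether an equivalent cheap invalidate exists on ppc is exactly the open
question.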

Thanks,
- Simon



