AW: AW: SPE & Interrupt context (was how to make use of SPE instructions)

Markus Stockhausen stockhausen at collogia.de
Fri Jan 30 20:39:41 AEDT 2015


> From: Gabriel Paubert [paubert at iram.es]
> Sent: Friday, 30 January 2015 09:49
> To: Markus Stockhausen
> Cc: Scott Wood; linuxppc-dev at lists.ozlabs.org; Herbert Xu
> Subject: Re: AW: SPE & Interrupt context (was how to make use of SPE instructions)
>
> > ...
> > - I must already save several non-volatile registers. Putting the 64 bit values
> > into them would require me to save their contents with evstdd instead of
> > stw. Of course this requires the stack to be aligned to 8 bytes, so only a
> > few additional alignment instructions are needed during initialization.
> 
> On most PPC ABIs the stack is guaranteed to be aligned to a 16 byte
> boundary. In some it may be only 8, but I can't remember any with
> only 4 byte alignment.
> 
> I checked my 32 bit kernel images with:
> 
> objdump -d vmlinux |awk '/stwu.*r1,/{print $6,$7}'|sort -u
> 
> and the stack seems to always be 16 byte aligned.
> For 64 bit, use stdu instead of stwu.
> 
> I've also found a few stwux/stdux which are hopefully known
> to be harmless.
>
> Gabriel
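
The check quoted above can be taken one step further by extracting the frame
size from each stwu and testing it against 16. This is only a rough sketch: the
sample lines are hypothetical stand-ins for real `objdump -d vmlinux` output
(the addresses are made up), and the helper name frame_report is an assumption
for illustration, not anything the kernel tree provides.

```shell
# Hypothetical sample in the usual objdump -d layout:
# address, four opcode bytes, mnemonic, operands.
sample=$(printf 'c0001000:\t94 21 ff 80 \tstwu    r1,-128(r1)\nc0002000:\t94 21 ff f0 \tstwu    r1,-16(r1)\nc0003000:\t94 21 ff e8 \tstwu    r1,-24(r1)')

# frame_report (hypothetical helper): read disassembly on stdin, pull the
# frame size out of every "stwu r1,-N(r1)", and report whether N is a
# multiple of 16.
frame_report() {
  sed -n 's/.*stwu[[:space:]]*r1,-\([0-9]*\)(r1).*/\1/p' |
    sort -un |
    awk '{ print $1, ($1 % 16 == 0 ? "aligned" : "UNALIGNED") }'
}

printf '%s\n' "$sample" | frame_report
```

On a real image the sample would be replaced by `objdump -d vmlinux | frame_report`.
Like the original one-liner, this does not cover the stwux/stdux forms, whose
frame size lives in a register.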

A helpful annotation. But now I'm unsure how to set up the function itself. SPE
seems to be 32-bit only, and I would use its evxxx instructions. Do you think the
following sequence is the right way to do it?

_GLOBAL(ppc_spe_sha256_transform)
  stwu            r1,-128(r1);    /* create stack frame           */
  stw             r24,8(r1);      /* save normal registers        */
  stw             r25,12(r1);
  evstdw          r14,16(r1);     /* We must save non volatile    */
  evstdw          r15,24(r1);     /* registers. Take the chance   */
  evstdw          r16,32(r1);     /* and save the SPE part too    */
  ...
  lwz             r24,8(r1);      /* restore normal registers     */
  lwz             r25,12(r1);
  evldw           r14,16(r1);     /* restore non-v. + SPE regs    */
  evldw           r15,24(r1);
  evldw           r16,32(r1);
  addi            r1,r1,128;      /* cleanup stack frame          */

Or must I use the kernel provided defines with PPC_STLU r1,-INT_FRAME_SIZE(r1) 
plus SAVE_GPR/SAVE_EVR/REST_GPR/REST_EVR?

Markus