[PATCH 1/2] powerpc: merge 32-bit and 64-bit _switch implementation

Christophe Leroy christophe.leroy at csgroup.eu
Tue Mar 28 04:46:28 AEDT 2023



On 25/03/2023 at 14:06, Nicholas Piggin wrote:
> The _switch stack frame setup is substantially the same, and so are the
> comments. The differences in how the stack and current are switched,
> and in how other hardware and software housekeeping is done, are moved
> into macros.
> 
> Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
> ---
> These patches are mostly just shuffling code around. Better? Worse?

I find it nice, at least for the PPC32 part.

For PPC32, the generated code is almost the same; only a few instructions
are reordered at the start of the function.

Before the change I have:

00000238 <_switch>:
  238:	94 21 ff 30 	stwu    r1,-208(r1)
  23c:	7c 08 02 a6 	mflr    r0
  240:	90 01 00 d4 	stw     r0,212(r1)
  244:	91 a1 00 44 	stw     r13,68(r1)
...
  28c:	93 e1 00 8c 	stw     r31,140(r1)
  290:	90 01 00 90 	stw     r0,144(r1)
  294:	7d 40 00 26 	mfcr    r10
  298:	91 41 00 a8 	stw     r10,168(r1)
  29c:	90 23 00 00 	stw     r1,0(r3)
  2a0:	3c 04 40 00 	addis   r0,r4,16384
  2a4:	7c 13 43 a6 	mtsprg  3,r0
...

After the change I have:

00000000 <_switch>:
    0:	7c 08 02 a6 	mflr    r0
    4:	90 01 00 04 	stw     r0,4(r1)
    8:	94 21 ff 30 	stwu    r1,-208(r1)
    c:	90 23 00 00 	stw     r1,0(r3)
   10:	91 a1 00 44 	stw     r13,68(r1)
...
   58:	93 e1 00 8c 	stw     r31,140(r1)
   5c:	90 01 00 90 	stw     r0,144(r1)
   60:	7c 00 00 26 	mfcr    r0
   64:	90 01 00 a8 	stw     r0,168(r1)
   68:	3c 04 40 00 	addis   r0,r4,16384
   6c:	7c 13 43 a6 	mtsprg  3,r0
...

Everything else is identical.
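
As a sanity check of the reordering, here is a minimal Python sketch (not
part of the patch; the function names and the sample addresses are made up
for illustration). It shows that the LR save slot is the same address in
both listings, since 212(r1) after the stwu equals 4(r1) before it, and that
the addis immediate 16384 adds 0x40000000 to r4. Reading that addis as a
virtual-to-physical conversion assumes the usual PAGE_OFFSET of 0xc0000000,
which the listings themselves do not state.

```python
FRAME_SIZE = 208  # taken from "stwu r1,-208(r1)" in both listings


def lr_slot_before(old_sp):
    """Old ordering: stwu first, then 'stw r0,212(r1)' against the new r1."""
    new_sp = old_sp - FRAME_SIZE
    return new_sp + 212


def lr_slot_after(old_sp):
    """New ordering: 'stw r0,4(r1)' against the caller's r1, then stwu."""
    return old_sp + 4


def addis(ra_value, simm):
    """Model 'addis rD,rA,SIMM' on a 32-bit core: rA + (SIMM << 16), mod 2**32."""
    return (ra_value + (simm << 16)) & 0xFFFFFFFF


if __name__ == "__main__":
    sp = 0xC1234000  # hypothetical stack pointer on entry to _switch
    print(hex(lr_slot_before(sp)), hex(lr_slot_after(sp)))
    # Hypothetical kernel virtual address in r4 (assumes PAGE_OFFSET 0xc0000000)
    print(hex(addis(0xC0001000, 16384)))
```

Both orderings therefore store LR into the caller's frame at the same place;
only the instruction scheduling differs.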

One thing I am not sure about: re-using r1 immediately after the stwu might
introduce some latency.

Christophe


