[PATCH 1/2] powerpc: merge 32-bit and 64-bit _switch implementation
Christophe Leroy
christophe.leroy at csgroup.eu
Tue Mar 28 04:46:28 AEDT 2023
On 25/03/2023 at 14:06, Nicholas Piggin wrote:
> The _switch stack frame setup are substantially the same, so are the
> comments. The difference in how the stack and current are switched,
> and other hardware and software housekeeping is done is moved into
> macros.
>
> Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
> ---
> These patches are mostly just shuffling code around. Better? Worse?
I find it nice, at least for the PPC32 part.
For PPC32 the generated code is almost the same, with only a few
instructions reordered at the start of the function.
Before the change I have:
00000238 <_switch>:
238: 94 21 ff 30 stwu r1,-208(r1)
23c: 7c 08 02 a6 mflr r0
240: 90 01 00 d4 stw r0,212(r1)
244: 91 a1 00 44 stw r13,68(r1)
...
28c: 93 e1 00 8c stw r31,140(r1)
290: 90 01 00 90 stw r0,144(r1)
294: 7d 40 00 26 mfcr r10
298: 91 41 00 a8 stw r10,168(r1)
29c: 90 23 00 00 stw r1,0(r3)
2a0: 3c 04 40 00 addis r0,r4,16384
2a4: 7c 13 43 a6 mtsprg 3,r0
...
After the change I have:
00000000 <_switch>:
0: 7c 08 02 a6 mflr r0
4: 90 01 00 04 stw r0,4(r1)
8: 94 21 ff 30 stwu r1,-208(r1)
c: 90 23 00 00 stw r1,0(r3)
10: 91 a1 00 44 stw r13,68(r1)
...
58: 93 e1 00 8c stw r31,140(r1)
5c: 90 01 00 90 stw r0,144(r1)
60: 7c 00 00 26 mfcr r0
64: 90 01 00 a8 stw r0,168(r1)
68: 3c 04 40 00 addis r0,r4,16384
6c: 7c 13 43 a6 mtsprg 3,r0
...
Everything else is identical.
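For anyone wanting to repeat the comparison, here is a minimal sketch in Python that strips addresses and opcode bytes from two objdump listings and reports the instructions present in only one of them. The helper name `instructions` and the two short excerpts are just for illustration (they are the first lines of the listings above, not the full function), and the regular expression assumes the 4-byte-per-line layout shown here.

```python
import re
from collections import Counter

def instructions(listing: str) -> list[str]:
    """Return 'mnemonic operands' per objdump line, dropping address and opcode bytes."""
    out = []
    for line in listing.splitlines():
        # Expected layout: "<addr>: <4 hex bytes> <mnemonic> <operands>"
        m = re.match(r"\s*[0-9a-f]+:\s+(?:[0-9a-f]{2}\s+){4}(\S+)\s*(\S*)", line)
        if m:
            out.append(f"{m.group(1)} {m.group(2)}".strip())
    return out

# Excerpts only: the first instructions of each listing quoted in this mail.
before = """\
238: 94 21 ff 30 stwu r1,-208(r1)
23c: 7c 08 02 a6 mflr r0
240: 90 01 00 d4 stw r0,212(r1)
29c: 90 23 00 00 stw r1,0(r3)
"""
after = """\
0: 7c 08 02 a6 mflr r0
4: 90 01 00 04 stw r0,4(r1)
8: 94 21 ff 30 stwu r1,-208(r1)
c: 90 23 00 00 stw r1,0(r3)
"""

# Compare as multisets so that pure reordering does not show up as a difference.
b, a = Counter(instructions(before)), Counter(instructions(after))
only_before = sorted((b - a).elements())
only_after = sorted((a - b).elements())

print("only before:", only_before)
print("only after: ", only_after)
```

On these excerpts the only mismatch is the LR save, `stw r0,212(r1)` versus `stw r0,4(r1)`. Both address the same slot, since the new code stores LR before the 208-byte stwu adjusts r1.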
I'm not sure, but using r1 in the instruction immediately after the stwu
(the stw r1,0(r3) at offset c) may introduce some latency.
Christophe