objtool: Seeking help for improving switch table processing

Peter Zijlstra peterz at infradead.org
Tue Jun 27 18:53:31 AEST 2023


On Sat, Jun 24, 2023 at 10:06:23AM +0000, Christophe Leroy wrote:
> Hello Josh and Peter,
> 
> As mentioned in the cover letter of my series "powerpc/objtool: uaccess 
> validation for PPC32 (v3)" [1], a few switch table lookups fail, and it 
> would help if you had ideas on how to handle them.
> 
> First one is as follows. First switch is properly detected, second is not.
> 
> 0000 00003818 <vsnprintf>:
> ...
> 0054     386c:	3f 40 00 00 	lis     r26,0	386e: R_PPC_ADDR16_HA	.rodata+0x6c
> 0058     3870:	3f 20 00 00 	lis     r25,0	3872: R_PPC_ADDR16_HA	.rodata+0x4c
> 005c     3874:	7f be eb 78 	mr      r30,r29
> 0060     3878:	3b 5a 00 00 	addi    r26,r26,0	387a: R_PPC_ADDR16_LO	.rodata+0x6c		<== First switch table address loaded in r26 register
> 0064     387c:	3b 39 00 00 	addi    r25,r25,0	387e: R_PPC_ADDR16_LO	.rodata+0x4c		<== Second switch table address loaded in r25 register
> ...
> 009c     38b4:	41 9d 02 64 	bgt     cr7,3b18 <vsnprintf+0x300>	<== Conditional jump to where second switch is
> 00a0     38b8:	55 29 10 3a 	slwi    r9,r9,2
> 00a4     38bc:	7d 39 48 2e 	lwzx    r9,r25,r9
> 00a8     38c0:	7d 29 ca 14 	add     r9,r9,r25
> 00ac     38c4:	7d 29 03 a6 	mtctr   r9
> 00b0     38c8:	4e 80 04 20 	bctr		<== Dynamic switch branch based on r25 register
> ...
> 0300     3b18:	39 29 ff f8 	addi    r9,r9,-8
> 0304     3b1c:	55 2a 06 3e 	clrlwi  r10,r9,24
> 0308     3b20:	2b 8a 00 0a 	cmplwi  cr7,r10,10
> 030c     3b24:	89 3f 00 00 	lbz     r9,0(r31)
> 0310     3b28:	41 9d 01 88 	bgt     cr7,3cb0 <vsnprintf+0x498>
> 0314     3b2c:	55 4a 10 3a 	slwi    r10,r10,2
> 0318     3b30:	7d 5a 50 2e 	lwzx    r10,r26,r10
> 031c     3b34:	7d 4a d2 14 	add     r10,r10,r26
> 0320     3b38:	7d 49 03 a6 	mtctr   r10
> 0324     3b3c:	4e 80 04 20 	bctr		<== Dynamic switch branch based on r26 register
> ...

Josh is the one that knows most about the jump table stuff, but I think
he's traveling or something like that atm so he might be a little slow.

Is the problem above that both the .rodata references are before the
conditional jump, such that objtool fails to correlate the indirect jump
and .rodata?

Looking at mark_func_jump_table(), it only seems to consider
unconditional jumps wrt jump-tables, and the above doesn't match that
pattern.

Worse is that the two jump tables are interleaved; this means the only
way to untangle things is to actually track the register state :/

Specifically, if GCC wanted it could flip the r25 and r26 loads and then
objtool wouldn't be able to match any of them I think. Because at that
point the first jump-table would match the r26 jump-table or so (I
think, I've not fully considered the current code).
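
To make the failure mode concrete, here is a toy model in Python. It is
not objtool's code. It assumes a matcher that pairs an indirect jump with
the nearest preceding .rodata reference in the same straight-line block and
does not look back across a conditional jump. The names nearest_rodata_ref
and make_blocks are made up for this sketch, and the blocks contain only a
few of the instructions from the listing above.

```python
# Toy model only; this is NOT objtool's real matching code.
# A block is a list of (address, mnemonic, reloc) tuples in program order,
# where reloc is the R_PPC_ADDR16_LO target or None.

def nearest_rodata_ref(block, jump_addr):
    """Closest .rodata reference before jump_addr within one block."""
    for addr, _op, reloc in reversed(block):
        if addr < jump_addr and reloc is not None:
            return reloc
    return None  # no reference in straight-line reach of the jump


def make_blocks(first_load, second_load):
    """Build the two vsnprintf blocks with the table loads in a given order."""
    entry = [
        (0x3878, "addi", first_load),
        (0x387c, "addi", second_load),
        (0x38b4, "bgt", None),   # conditional jump to 0x3b18
        (0x38c8, "bctr", None),  # really dispatches through .rodata+0x4c
    ]
    second = [                   # reached only through the bgt above
        (0x3b18, "addi", None),
        (0x3b3c, "bctr", None),  # really dispatches through .rodata+0x6c
    ]
    return entry, second


for order in ((".rodata+0x6c", ".rodata+0x4c"),   # as compiled
              (".rodata+0x4c", ".rodata+0x6c")):  # loads flipped
    entry, second = make_blocks(*order)
    print(order,
          nearest_rodata_ref(entry, 0x38c8),
          nearest_rodata_ref(second, 0x3b3c))
```

Under these assumptions, the as-compiled order gives the first bctr the
right table (.rodata+0x4c) and the second none at all. With the loads
flipped, the first bctr gets .rodata+0x6c, which is the wrong table, and
the second still gets nothing.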

Ho-humm... what a tangle.

So for AARGH64 we also had trouble with jump-tables, but LLVM-BOLT
managed to get that working:

  https://github.com/llvm/llvm-project/blob/main/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp#L458

perhaps we can glean a clue there, but I don't immediately see the same
patterns there.

I can't seem to come up with anything better than tracking the register
state, and effectively working back from 'ctr' to a .rodata. That's
going to be a bit of effort though...
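
A minimal sketch of that backward walk, in Python rather than objtool's C.
The Insn tuple and the resolve() and find_table() helpers are hypothetical,
and the listing holds only the instructions quoted above. It also walks the
listing linearly by address, which is only safe here because r25 and r26
are each written once; a real implementation would have to follow
control-flow predecessors instead.

```python
from typing import NamedTuple, Optional, Tuple


class Insn(NamedTuple):
    addr: int
    op: str
    dest: Optional[str]          # register written, if any
    srcs: Tuple[str, ...]        # registers read
    reloc: Optional[str] = None  # relocation target, if any


# Only the instructions quoted in the mail; elided ones ("...") are missing.
LISTING = [
    Insn(0x386c, "lis", "r26", (), ".rodata+0x6c"),
    Insn(0x3870, "lis", "r25", (), ".rodata+0x4c"),
    Insn(0x3874, "mr", "r30", ("r29",)),
    Insn(0x3878, "addi", "r26", ("r26",), ".rodata+0x6c"),
    Insn(0x387c, "addi", "r25", ("r25",), ".rodata+0x4c"),
    Insn(0x38b4, "bgt", None, ("cr7",)),
    Insn(0x38b8, "slwi", "r9", ("r9",)),
    Insn(0x38bc, "lwzx", "r9", ("r25", "r9")),
    Insn(0x38c0, "add", "r9", ("r9", "r25")),
    Insn(0x38c4, "mtctr", "ctr", ("r9",)),
    Insn(0x38c8, "bctr", None, ("ctr",)),
    Insn(0x3b18, "addi", "r9", ("r9",)),
    Insn(0x3b1c, "clrlwi", "r10", ("r9",)),
    Insn(0x3b20, "cmplwi", "cr7", ("r10",)),
    Insn(0x3b24, "lbz", "r9", ("r31",)),
    Insn(0x3b28, "bgt", None, ("cr7",)),
    Insn(0x3b2c, "slwi", "r10", ("r10",)),
    Insn(0x3b30, "lwzx", "r10", ("r26", "r10")),
    Insn(0x3b34, "add", "r10", ("r10", "r26")),
    Insn(0x3b38, "mtctr", "ctr", ("r10",)),
    Insn(0x3b3c, "bctr", None, ("ctr",)),
]


def resolve(insns, idx, reg):
    """Walk backward from insns[idx] to the .rodata address held in reg."""
    for i in range(idx - 1, -1, -1):
        insn = insns[i]
        if insn.dest != reg:
            continue
        if insn.op in ("mtctr", "mr"):      # plain copy: follow the source
            return resolve(insns, i, insn.srcs[0])
        if insn.op == "add":                # offset + base: try both operands
            for src in insn.srcs:
                found = resolve(insns, i, src)
                if found is not None:
                    return found
            return None
        if insn.op == "addi" and insn.reloc is not None:
            return insn.reloc               # low half of the table address
        return None                         # loads, shifts: not a table base
    return None


def find_table(insns, bctr_addr):
    """Map the bctr at bctr_addr to the .rodata table it dispatches through."""
    idx = next(i for i, insn in enumerate(insns) if insn.addr == bctr_addr)
    return resolve(insns, idx, "ctr")


print(find_table(LISTING, 0x38c8))  # first switch
print(find_table(LISTING, 0x3b3c))  # second switch
```

Because the walk follows registers rather than instruction order, swapping
the two addi loads in LISTING does not change which table each bctr
resolves to.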

