[PATCH] powerpc/bug: Remove specific powerpc BUG_ON()

Christophe Leroy christophe.leroy at csgroup.eu
Thu Feb 11 23:26:12 AEDT 2021



On 11/02/2021 at 12:49, Segher Boessenkool wrote:
> On Thu, Feb 11, 2021 at 07:41:52AM +0000, Christophe Leroy wrote:
>> powerpc BUG_ON() is based on using twnei or tdnei instruction,
>> which obliges gcc to format the condition into a 0 or 1 value
>> in a register.
> 
> Huh?  Why is that?
> 
> Will it work better if this used __builtin_trap?  Or does the kernel only
> detect very specific forms of trap instructions?
> 
>> By using a generic implementation, gcc will generate a branch
>> to the unconditional trap generated by BUG().
> 
> That is many more instructions than ideal.
> 
>> As modern powerpc implement branch folding, that's even more efficient.
> 
> What PowerPC cpus implement branch folding?  I know none.

Extract from the PowerPC MPC8323 reference manual:

High instruction and data throughput
— Zero-cycle branch capability (branch folding)
— Programmable static branch prediction on unresolved conditional branches
— Two integer units with enhanced multipliers in the e300c2 for increased integer instruction
throughput and a maximum two-cycle latency for multiply instructions
— Instruction fetch unit capable of fetching two instructions per clock from the instruction cache
— A six-entry instruction queue (IQ) that provides lookahead capability
— Independent pipelines with feed-forwarding that reduces data dependencies in hardware
— 16-Kbyte, four-way set-associative instruction and data caches on the e300c2.
— Cache write-back or write-through operation programmable on a per-page or per-block basis
— Features for instruction and data cache locking and protection
— BPU that performs CR lookahead operations
— Address translation facilities for 4-Kbyte page size, variable block size, and 256-Mbyte
segment size
— A 64-entry, two-way, set-associative ITLB and DTLB
— Eight-entry data and instruction BAT arrays providing 128-Kbyte to 256-Mbyte blocks
— Software table search operations and updates supported through fast trap mechanism
— 52-bit virtual address; 32-bit physical address
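
For reference, the generic form discussed in the quoted patch description (a conditional branch around an unconditional trap) can be sketched in userspace roughly as below. This is only an illustration, not the kernel code: SKETCH_BUG(), SKETCH_BUG_ON() and checked_div() are hypothetical names, and the real BUG() also records a bug table entry, which is left out here. It assumes GCC or clang for __builtin_expect and __builtin_trap.

```c
/* Userspace sketch only; hypothetical names, not the kernel's asm/bug.h. */

/* Hint to the compiler that the condition is rarely true. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Unconditional trap (assumption: stands in for the kernel's BUG()). */
#define SKETCH_BUG() __builtin_trap()

/*
 * Generic form: test the condition with an ordinary conditional branch
 * and only reach the trap on the (unlikely) failing path. The compiler
 * is free to branch directly on the comparison result instead of first
 * materialising a 0 or 1 value in a register for a conditional trap.
 */
#define SKETCH_BUG_ON(cond) do { if (unlikely(cond)) SKETCH_BUG(); } while (0)

/* Hypothetical caller: traps if the divisor is zero. */
int checked_div(int a, int b)
{
	SKETCH_BUG_ON(b == 0);
	return a / b;
}
```

Calling checked_div(10, 2) takes the not-taken path and returns normally; calling it with b == 0 reaches the trap and terminates the program. Comparing the generated assembly for this form against a conditional-trap form is one way to check the instruction counts being debated above.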

Christophe

