[PATCH v4 1/3] powerpc/bitops: Use immediate operand when possible

Christophe Leroy christophe.leroy at csgroup.eu
Wed Sep 22 01:15:29 AEST 2021



On 20/09/2021 at 23:23, Segher Boessenkool wrote:
> Hi!
> 
> On Mon, Sep 20, 2021 at 10:31:17AM +0200, Christophe Leroy wrote:
>> Today we get the following code generation for bitops like
>> set or clear bit:
>>
>> 	c0009fe0:	39 40 08 00 	li      r10,2048
>> 	c0009fe4:	7c e0 40 28 	lwarx   r7,0,r8
>> 	c0009fe8:	7c e7 53 78 	or      r7,r7,r10
>> 	c0009fec:	7c e0 41 2d 	stwcx.  r7,0,r8
>>
>> 	c000d568:	39 00 18 00 	li      r8,6144
>> 	c000d56c:	7c c0 38 28 	lwarx   r6,0,r7
>> 	c000d570:	7c c6 40 78 	andc    r6,r6,r8
>> 	c000d574:	7c c0 39 2d 	stwcx.  r6,0,r7
>>
>> Most set bits are constant on lower 16 bits, so it can easily
>> be replaced by the "immediate" version of the operation. Allow
>> GCC to choose between the normal or immediate form.
> 
> You can also handle the second sixteen bits (the "shifted" half), by
> using oris etc.  The "%eN" output modifier prints an "s" for this:
>    /* If the low 16 bits are 0, but some other bit is set, write 's'.  */
> But this doesn't handle non-constant arguments, so you're likely better
> off using what you have now.
> 
>> For clear bits, on 32 bits 'rlwinm' can be used instead of 'andc' for
>> when all bits to be cleared are consecutive.
> 
> Or when all you want to keep are consecutive (you do handle that now :-) )
> 
>> On 64 bits we don't have any equivalent single operation for clearing
>> single bits or a few bits; we'd need two 'rldicl', so it is not
>> worth it, as the li/andc sequence does the same.
> 
> You can use rlwinm whenever you want to clear all top 32 bits.
> 
> A sometimes nice idiom is  ori x,x,N ; xori x,x,N  to clear the bits N
> (or oris/xoris).  But it's two insns no matter what (but no spare
> register is needed).

Could be a candidate for a follow-up change if someone wants to focus on
PPC64.
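
For reference, the ori/xori idiom quoted above relies on the identity
(x | N) ^ N == x & ~N for a 16-bit immediate N. A minimal sketch in plain
C, not code from the patch; the helper names are hypothetical and only
model what the two instructions would do:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models "ori x,x,N ; xori x,x,N" in plain C. */
static uint64_t clear_bits_ori_xori(uint64_t x, uint16_t n)
{
	x |= n;		/* ori:  force every bit of N to 1 */
	x ^= n;		/* xori: flip exactly those bits back to 0 */
	return x;
}

/* Same idea for the shifted half, modelling "oris x,x,N ; xoris x,x,N". */
static uint64_t clear_bits_oris_xoris(uint64_t x, uint16_t n)
{
	uint64_t shifted = (uint64_t)n << 16;

	x |= shifted;
	x ^= shifted;
	return x;
}

/* Checks both variants against the plain "x & ~N" form. */
static void check_ori_xori_identity(uint64_t x, uint16_t n)
{
	assert(clear_bits_ori_xori(x, n) == (x & ~(uint64_t)n));
	assert(clear_bits_oris_xoris(x, n) == (x & ~((uint64_t)n << 16)));
}
```

Both variants take two instructions but need no scratch register, which
is the trade-off against li/andc mentioned above.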

> 
>> Signed-off-by: Christophe Leroy <christophe.leroy at csgroup.eu>
> 
>> +static inline unsigned long test_and_clear_bits(unsigned long mask, volatile unsigned long *_p)
>> +{
>> +	unsigned long old, t;
>> +	unsigned long *p = (unsigned long *)_p;
>> +
>> +	if (IS_ENABLED(CONFIG_PPC32) &&
>> +	    __builtin_constant_p(mask) && is_rlwinm_mask_valid(mask)) {
> 
> is_rlwinm_mask_valid(~mask)?  So that test_and_clear_bits(0, ...) will
> work with rlwinm, and test_and_clear_bits(0xffffffff, ...) will not make
> gas scream bloody murder ("illegal bitmask").  The mask you pass to the
> instruction is ~mask after all.

Ok, fixed in v5.
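
To illustrate why the check has to be done on ~mask, here is a rough
sketch in plain C. It is not the helper from the patch: rlwinm_mask_ok()
and can_clear_with_rlwinm() are hypothetical names, and the validity
rule (non-zero, set bits forming a single run that may wrap from bit 31
to bit 0) is my assumption of what a 32-bit rlwinm mask allows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of a valid 32-bit rlwinm mask: non-zero, and the
 * set bits form one run, possibly wrapping from bit 31 to bit 0.
 */
static bool rlwinm_mask_ok(uint32_t x)
{
	uint32_t prev = (x << 1) | (x >> 31);	/* bit i holds x's bit i-1 */
	uint32_t starts = x & ~prev;		/* bits where a run of ones begins */

	/* At most one run start; all-ones has none and is still accepted. */
	return x != 0 && (starts & (starts - 1)) == 0;
}

/* The AND mask handed to rlwinm is ~mask, so that is the value to validate. */
static bool can_clear_with_rlwinm(uint32_t mask)
{
	return rlwinm_mask_ok(~mask);
}

static void check_edge_cases(void)
{
	assert(can_clear_with_rlwinm(0));		/* ~mask is all ones: fine */
	assert(!can_clear_with_rlwinm(0xffffffffu));	/* ~mask == 0: rejected */
	assert(can_clear_with_rlwinm(0x1800));		/* kept bits wrap around */
	assert(!can_clear_with_rlwinm(0x101));		/* kept bits form two runs */
}
```

With the check applied to mask instead of ~mask, the first two cases
would come out the wrong way round, which is the problem pointed out
above.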

> 
> Looks great except that one nit.  Thanks :-)
> 
> Reviewed-by: Segher Boessenkool <segher at kernel.crashing.org>

Thanks

Christophe

