[PATCH v2 2/5] powerpc/lib/sstep: Add popcnt instruction emulation

Matt Brown matthew.brown.dev at gmail.com
Tue Jul 25 10:53:43 AEST 2017


On Mon, Jul 24, 2017 at 8:28 PM, Balbir Singh <bsingharora at gmail.com> wrote:
> On Mon, Jul 24, 2017 at 11:01 AM, Matt Brown
> <matthew.brown.dev at gmail.com> wrote:
>> This adds emulations for the popcntb, popcntw, and popcntd instructions.
>> Tested for correctness against the popcnt{b,w,d} instructions on ppc64le.
>>
>> Signed-off-by: Matt Brown <matthew.brown.dev at gmail.com>
>> ---
>> v2:
>>         - fixed opcodes
>>         - fixed typecasting
>>         - fixed bitshifting error for both 32 and 64bit arch
>> ---
>>  arch/powerpc/lib/sstep.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 42 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
>> index 87d277f..e6a16a3 100644
>> --- a/arch/powerpc/lib/sstep.c
>> +++ b/arch/powerpc/lib/sstep.c
>> @@ -612,6 +612,35 @@ static nokprobe_inline void do_cmpb(struct pt_regs *regs, unsigned long v1,
>>         regs->gpr[rd] = out_val;
>>  }
>>
>> +/*
>> + * The size parameter selects which popcnt instruction is emulated by giving
>> + * the element width in bits: popcntb = 8, popcntw = 32, popcntd = 64
>> + */
>> +static nokprobe_inline void do_popcnt(struct pt_regs *regs, unsigned long v1,
>> +                               int size, int ra)
>> +{
>> +       unsigned long long high, low, mask;
>> +       unsigned int n;
>> +       int i, j;
>> +
>> +       high = 0;
>> +       low = 0;
>> +
>> +       for (i = 0; i < (64 / size); i++) {
>> +               n = 0;
>> +               for (j = 0; j < size; j++) {
>> +                       mask = 1ULL << (j + (i * size));
>> +                       if (v1 & mask)
>> +                               n++;
>> +               }
>> +               if ((i * size) < 32)
>> +                       low |= n << (i * size);
>> +               else
>> +                       high |= n << ((i * size) - 32);
>> +       }
>> +       regs->gpr[ra] = (high << 32) | low;
>> +}
>
> There's a very efficient way to do this: the Gillies-Miller
> method of sideways addition.
>
> Please see
>
> http://opensourceforu.com/2012/06/power-programming-bitwise-tips-tricks/
> and lib/hweight.c; you can reuse the code from there.

Oh that's a really cool technique.
We could use that for the parity instructions too.
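
For reference, here is a rough userspace sketch of what I have in mind,
not the final patch. The function names are made up for illustration and
are not taken from lib/hweight.c. The parity helpers assume that
prtyw/prtyd only look at the least significant bit of each byte, which is
my reading of the ISA and should be double-checked.

```c
#include <stdint.h>

/* Hypothetical helper: per-byte bit counts (popcntb-style result). */
static inline uint64_t popcnt_bytes(uint64_t v)
{
	/* Sideways addition: sum adjacent 1-bit, 2-bit, then 4-bit fields. */
	v = v - ((v >> 1) & 0x5555555555555555ULL);
	v = (v & 0x3333333333333333ULL) + ((v >> 2) & 0x3333333333333333ULL);
	v = (v + (v >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
	return v;
}

/* Hypothetical helper: per-32-bit-word bit counts (popcntw-style result). */
static inline uint64_t popcnt_words(uint64_t v)
{
	uint64_t b = popcnt_bytes(v);

	/* Each byte holds at most 8, so these sums never carry between bytes. */
	b += b >> 8;
	b += b >> 16;
	return b & 0x000000ff000000ffULL;
}

/* Hypothetical helper: bit count of the whole doubleword (popcntd-style). */
static inline uint64_t popcnt_dword(uint64_t v)
{
	uint64_t w = popcnt_words(v);

	return (w + (w >> 32)) & 0xff;
}

/*
 * Hypothetical helper: parity of the low bit of each byte, per 32-bit word.
 * Assumes prtyw semantics as described above.
 */
static inline uint64_t parity_words(uint64_t v)
{
	uint64_t r = v ^ (v >> 8);

	r ^= r >> 16;
	return r & 0x0000000100000001ULL;
}

/* Hypothetical helper: the same fold extended across the doubleword. */
static inline uint64_t parity_dword(uint64_t v)
{
	uint64_t r = v ^ (v >> 8);

	r ^= r >> 16;
	r ^= r >> 32;
	return r & 1;
}
```

This avoids the nested loop entirely, and everything is done in 64-bit
types so there are no shifts wider than the operand on 32-bit.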

>
> Balbir Singh

