[PATCH] powerpc/lib/sstep: Fix count leading zeros instructions

Segher Boessenkool segher at kernel.crashing.org
Tue Oct 10 01:47:56 AEDT 2017


On Mon, Oct 09, 2017 at 02:43:45PM +0000, David Laight wrote:
> From: Segher Boessenkool
> > Sent: 09 October 2017 15:21
> > On Mon, Oct 09, 2017 at 01:49:26PM +0000, David Laight wrote:
> > > From: Sandipan Das
> > > > Sent: 09 October 2017 12:07
> > > > According to the GCC documentation, the behaviour of __builtin_clz()
> > > > and __builtin_clzl() is undefined if the value of the input argument
> > > > is zero. Without handling this special case, these builtins have been
> > > > used for emulating the following instructions:
> > > >   * Count Leading Zeros Word (cntlzw[.])
> > > >   * Count Leading Zeros Doubleword (cntlzd[.])
> > > >
> > > > This fixes the emulated behaviour of these instructions by adding an
> > > > additional check for this special case.
> > >
> > > Presumably the result is undefined because the underlying cpu
> > > instruction is used - and its return value is implementation defined.
> > 
> > It is undefined because the result is undefined, and the compiler
> > optimises based on that.  The return value of the builtin is undefined,
> > not implementation defined.
> > 
> > The patch is correct.
> 
> But the code you are emulating might be relying on the (un)defined value
> the cpu instruction gives for zero input rather than the input width.
> 
> Or, put another way, if the return value for a clz instruction with zero
> argument is undefined (as it is on x86 - Intel and AMD may differ) then the
> emulation can return any value since the code can't care.
> So the conditional is not needed.

The cntlz[wd][.] insn has defined behaviour for 0 input.  It's just the
builtin that does not.  So we shouldn't call the builtin with an input
of 0 -- exactly what this patch does -- and that is all that was wrong.
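
As a minimal sketch of that guard (the helper names below are hypothetical
and are not the ones used in sstep.c; the values 32 and 64 are what the
cntlzw and cntlzd insns return for a zero input):

```c
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only; they are not taken from
 * arch/powerpc/lib/sstep.c.
 */

/* cntlzw: count leading zeros in a 32-bit word; a zero input yields 32. */
static inline unsigned int emulate_cntlzw(uint32_t val)
{
	/* __builtin_clz(0) is undefined, so never pass it a zero. */
	return val ? (unsigned int)__builtin_clz(val) : 32;
}

/* cntlzd: count leading zeros in a 64-bit doubleword; a zero input yields 64. */
static inline unsigned int emulate_cntlzd(uint64_t val)
{
	/* __builtin_clzll(0) is undefined as well. */
	return val ? (unsigned int)__builtin_clzll(val) : 64;
}
```

The zero check happens before the builtin is reached, so the compiler
never sees a call whose result it is free to assume anything about.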


Segher