[PATCH 0/1] PPC32: fix ptrace() access to FPU registers

Tue Jun 18 22:16:25 AEST 2019

On Tue, 2019-06-18 at 16:42 +1000, Daniel Axtens wrote:
> Radu Rendec <
> radu.rendec at gmail.com
> > writes:
> 
> > On Mon, 2019-06-17 at 11:19 +1000, Daniel Axtens wrote:
> > > Radu Rendec <
> > > radu.rendec at gmail.com
> > > 
> > > > writes:
> > > > Hi Everyone,
> > > > 
> > > > I'm following up on the ptrace() problem that I reported a few days ago.
> > > > I believe my version of the code handles all cases correctly. While the
> > > > problem essentially boils down to dividing the fpidx by 2 on PPC32, it
> > > > becomes tricky when the same code must work correctly on both PPC32 and
> > > > PPC64.
> > > > 
> > > > One other thing that I believe was handled incorrectly in the previous
> > > > version is the unused half of fpscr on PPC32. Note that while PT_FPSCR
> > > > is defined as (PT_FPR0 + 2*32 + 1), making only the upper half visible,
> > > > PT_FPR0 + 2*32 still corresponds to a possible address that userspace
> > > > can pass. In that case, comparing fpidx to (PT_FPSCR - PT_FPR0) would
> > > > cause an invalid access to the FPU registers array.
> > > > 
> > > > I tested the patch on 4.9.179, but that part of the code is identical in
> > > > recent kernels so it should work just the same.
> > > > 
> > > > I wrote a simple test program than can be used to quickly test (on an
> > > > x86_64 host) that all cases are handled correctly for both PPC32/PPC64.
> > > > The code is included below.
> > > > 
> > > > I also tested with gdbserver (test patch included below) and verified
> > > > that it generates two ptrace() calls for each FPU register, with
> > > > addresses between 0xc0 and 0x1bc.
> > > 
> > > Thanks for looking in to this. I can confirm your issue. What I'm
> > > currently wondering is: what is the behaviour with a 32-bit userspace on
> > > a 64-bit kernel? Should they also be going down the 32-bit path as far
> > > as calculating offsets goes?
> > 
> > Thanks for reviewing this. I haven't thought about the 32-bit userspace
> > on a 64-bit kernel, that is a very good question. Userspace passes a
> > pointer, so in theory it could go down either path as long as the
> > pointer points to the right data type.
> > 
> > I will go back to the gdb source code and try to figure out if that case
> > is handled in a special way. If not, it's probably safe to assume that a
> > 32-bit userspace should always go down the 32-bit path regardless of the
> > kernel bitness (in which case I think I have to change my patch).
> 
> It doesn't seem to reproduce on a 64-bit kernel with 32-bit
> userspace. Couldn't tell you why - if you can figure it out from the gdb
> source code I'd love to know! I have, however, tried it - and all the fp
> registers look correct and KASAN doesn't pick up any memory corruption.

I looked at the gdb source code and all it seems to care about is the
architecture it was compiled for. In other words, 32-bit gdb always
assumes 32-bit register layout, regardless of whether it's running on a
32-bit or 64-bit kernel.

So it's no surprise that the problem didn't reproduce and KASAN didn't
pick up anything in your experiment. Since the kernel is 64-bit, it
divides addresses by 8, so all indexes are within bounds. The part that
I don't get is how the FP registers looked correct.

Since you already have a working setup, it would be nice if you could
add a printk to arch_ptrace() to print the address and confirm what I
believe happens (by reading the gdb source code).

So for 32-bit gdb on 64-bit kernel, I think gdb will generate 48 x 4-
byte aligned addresses for the CPU registers, then 32 x 2 x 4-byte
aligned addresses for the FPU registers. The indexes will not overflow,
but access to all registers (CPU and FPU) will be broken because:
 * The kernel always divides by 8. The CPU register address range that
   gdb generates will be half of what the kernel expects and "odd" (i.e.
   non 8-byte aligned) addresses will fail with EIO because the kernel
   code checks that the last 3 bits are all zero.
 * The FPU register indexes will be off by 24. For example, when gdb
   thinks it asks for FPR0, it calculates the address as 4 x 48, but the
   kernel divides by 8, so it will get index 24, which it thinks is a
   CPU register.

When it stops on a breakpoint, gdb seems to read all registers (both CPU
and FPU) in ascending index order. So if you print the address in the
kernel it's actually very easy to verify my theory. I expect addresses
from 0 to 48 x 4 in increments of 4 (for the CPU registers), then
addresses from 48 x 4 to 48 x 4 + 32 x 2 x 4 in increments of 4 (for the
FPU registers).

Best regards,
Radu