[PATCH 0/1] PPC32: fix ptrace() access to FPU registers

Wed Jun 19 22:57:07 AEST 2019

On Wed, 2019-06-19 at 10:36 +1000, Daniel Axtens wrote:
> Andreas Schwab <
> schwab at linux-m68k.org
> > writes:
> 
> > On Jun 18 2019, Radu Rendec <
> > radu.rendec at gmail.com
> > > wrote:
> > 
> > > Since you already have a working setup, it would be nice if you could
> > > add a printk to arch_ptrace() to print the address and confirm what I
> > > believe happens (by reading the gdb source code).
> > 
> > A ppc32 ptrace syscall goes through compat_arch_ptrace.

Right. I completely overlooked that part.

> Ah right, and that (in ptrace32.c) contains code that will work:
> 
> 
> 			/*
> 			 * the user space code considers the floating point
> 			 * to be an array of unsigned int (32 bits) - the
> 			 * index passed in is based on this assumption.
> 			 */
> 			tmp = ((unsigned int *)child->thread.fp_state.fpr)
> 				[FPRINDEX(index)];
> 
> FPRINDEX is defined above to deal with the various manipulations you
> need to do.

Correct. Basically it does the same that I did in my patch: it divides
the index again by 2 (it's already divided by 4 in compat_arch_ptrace()
so it ends up divided by 8), then takes the least significant bit and
adds it to the index. I take bit 2 of the original address, which is the
same thing (because in FPRHALF() the address is already divided by 4).

So we have this in ptrace32.c:

#define FPRNUMBER(i) (((i) - PT_FPR0) >> 1)
#define FPRHALF(i) (((i) - PT_FPR0) & 1)
#define FPRINDEX(i) TS_FPRWIDTH * FPRNUMBER(i) * 2 + FPRHALF(i)

index = (unsigned long) addr >> 2;
(unsigned int *)child->thread.fp_state.fpr)[FPRINDEX(index)]

And we have this in my patch:

fpidx = (addr - PT_FPR0 * sizeof(long)) / 8;
(void *)&child->thread.TS_FPR(fpidx) + (addr & 4)

> Radu: I think we want to copy that working code back into ptrace.c. 

I'm not sure that would work. There's a subtle difference: the code in
ptrace32.c is always compiled on a 64-bit kernel and the user space
calling it is always 32-bit; on the other hand, the code in ptrace.c can
be compiled on either a 64-bit kernel or a 32-bit kernel and the user
space calling it always has the same "bitness" as the kernel.

One difference is the size of the CPU registers. On 64-bit they are 8
byte long and user space knows that and generates 8-byte aligned
addresses. So you have to divide the address by 8 to calculate the CPU
register index correctly, which compat_arch_ptrace() currently doesn't.

Another difference is that on 64-bit `long` is 8 bytes, so user space
can read a whole FPU register in a single ptrace call. 

Now that we are all aware of compat_arch_ptrace() (which handles the
special case of a 32-bit process running on a 64-bit kernel) I would say
the patch is correct and does the right thing for both 32-bit and 64-bit 
kernels and processes.

> The challenge will be unpicking the awful mess of ifdefs in ptrace.c
> and making it somewhat more comprehensible.

I'm not sure what ifdefs you're thinking about. The only that are used
inside arch_ptrace() are PT_FPR0, PT_FPSCR and TS_FPR, which seem to be
correct.

But perhaps it would be useful to change my patch and add a comment just
before arch_ptrace() that explains how the math is done and that the
code must work on both 32-bit and 64-bit, the user space address
assumptions, etc.

By the way, I'm not sure the code in compat_arch_ptrace() handles
PT_FPSCR correctly. It might (just because fpscr is right next to fpr[]
in memory - and that's a hack), but I can't figure out if it accesses
the right half.

Radu