[PATCH v3 2/2] selftests/powerpc: Add a test of the switch_endian() syscall

Sat Mar 28 13:34:33 AEDT 2015

On Thu, 2015-03-26 at 11:54 +0530, Anshuman Khandual wrote:
> On 03/26/2015 06:06 AM, Michael Ellerman wrote:
> > On Wed, 2015-03-25 at 17:02 +0530, Anshuman Khandual wrote:
> >> On 03/25/2015 10:58 AM, Michael Ellerman wrote:
> >>> On Wed, 2015-03-18 at 16:04 +1100, Michael Ellerman wrote:
> >>>> On Tue, 2015-03-17 at 11:35 +0530, Anshuman Khandual wrote:
> >>>>> On 03/17/2015 04:34 AM, Michael Ellerman wrote:
> >>>>>> What are you seeing exactly?
> >>>>>
> >>>>> I am running on a BE PKVM guest but compiling the test case on
> >>>>> a different BE machine which has newer version of the compiler.
> >>>>>
> >>>>> cc (GCC) 4.8.3 20140624
> >>>>>
> >>>>> cc -O2 -Wall -g -nostdlib -m64   -c -o check.o check.S
> >>>>> objcopy -j .text --reverse-bytes=4 -O binary check.o check-reversed.o
> >>>>> hexdump -v -e '/1 ".byte 0x%02X\n"' check-reversed.o > check-reversed.S
> >>>>> cc -O2 -Wall -g -nostdlib -m64    switch_endian_test.S check-reversed.S   -o switch_endian_test
> >>>>>
> >>>>> which looks very similar to the details you have provided above.
> >>>>> Running on guest or host should not make any difference.
> >>>>
> >>>> No it shouldn't.
> >>>>
> >>>> Can you try strace, that should give you the full result code.
> >>>>
> >>>> Also can you try gdb. You can't breakpoint in the wrong-endian region, but it
> >>>> looks like you're getting through that anyway.
> >>>>
> >>>> So try setting a breakpoint at line ~77, and you should be back in BE. Then you
> >>>> can single step and see where it errors out.
> >>>
> >>> Did you try these?
> >>
> >> Yeah. The test program is showing some strange behavior.
> >>
> >> (1) Without strace: It just fails with 176 return code as before
> >> (2) With strace: It works with return code 0 and prints everything !!
> >>
> >> strace ./switch_endian_test
> >> execve("./switch_endian_test", ["./switch_endian_test"], [/* 50 vars */]) = 0
> >> SYS_363(0x5555aaaa5555aaaa, 0x5555aaaa5555aaae, 0x5555aaaa5555aaaf,
> >> 0x5555aaaa5555aab0, 0x5555aaaa5555aab1) = 6149008514797120170
> >> write(1, "Hello wrong-endian world\n", 25Hello wrong-endian world
> >> ) = 25
> >> SYS_363(0x19, 0x10010638, 0x19, 0x5555aaaa5555aab0, 0x5555aaaa5555aab1) = 25
> >> write(1, "Hello right-endian world\n", 25Hello right-endian world
> >> ) = 25
> >> write(1, "success: switch_endian_test\n", 28success: switch_endian_test
> >> ) = 28
> >> exit(0)                                 = ?
> >>
> >> With GDB and breaking at line 77, it exits with a different exit code this time
> > 
> > No that's the same code, 176 == 0260 (octal).
> > 
> >> 30		cmpd    r3,r5
> >> (gdb) 
> >> 31		bne     1f
> >> (gdb) 
> >> 32		addi    r3,r15,6
> >> (gdb) 
> >> 33		cmpd    r3,r6
> >> (gdb) 
> >> 34		bne     1f
> >> (gdb) 
> >> 98	1:	li	r0, __NR_exit
> >> (gdb) 
> >> 99		sc
> >> (gdb) 
> >> [Inferior 1 (process 6456) exited with code 0260]
> > 
> > And that makes sense, it's bailing because r6 doesn't match. In the setup we do:
> > 
> > 	addi	r6, r15, 6
> > 
> > Where r15 is 0x5555aaaa5555aaaa, so:
> > 
> > 	0x5555aaaa5555aaaa + 6 = 0x5555aaaa5555aab0
> > 
> > And when we exit the kernel masks the exit code in r3 with 0xff, so:
> > 
> > 	0x5555aaaa5555aab0 & 0xff = 0xb0 = 176
> > 
> > 
> > So for some reason r6 does not contain our pattern.
> > 
> > Can you do an "info registers" and see what's in r6?
> 
> Sure, here are the details.
> 
> (gdb)
> 98	1:	li	r0, __NR_exit
> (gdb)
> 99		sc
> (gdb) info registers
> r0             0x1	1
> r1             0x3ffffffff360	70368744174432
> r2             0x10018670	268535408
> r3             0x5555aaaa5555aab0	6149008514797120176
> r4             0x5555aaaa5555aaca	6149008514797120202
> r5             0x5555aaaa5555aaaf	6149008514797120175
> 
> r6             0x4000	16384   <<=========================
> 
> r7             0x100002e4	268436196
> r8             0x800000010000d033	9223372041149796403

Sigh. This is just a ■■■■■■ ■■■■■■■ ■■■■ ■■ on my part.

At the end of the checking code we call write(), which is a syscall, and it
clobbers the register state (r6 in your case)! Duh.

I think the reason you were seeing it and I wasn't is that on my system I have
audit enabled, so we *always* go through the syscall path that restores the
registers.
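
For anyone following along, the exit code arithmetic quoted above can be
double-checked with a few lines of Python. This is only a sketch: the names
PATTERN, expected_reg() and exit_status() are made up for illustration and are
not part of the test. It assumes, as described above, that r15 holds
0x5555aaaa5555aaaa and that only the low 8 bits of r3 survive as the exit
status.

```python
# Hypothetical helper names; only the constants come from the thread.
PATTERN = 0x5555AAAA5555AAAA  # value the test keeps in r15
MASK64 = 0xFFFFFFFFFFFFFFFF   # registers are 64 bits wide


def expected_reg(offset):
    """Value the check computes with 'addi r3,r15,<offset>'."""
    return (PATTERN + offset) & MASK64


def exit_status(r3):
    """Exit status as seen by the parent: low 8 bits of r3."""
    return r3 & 0xFF


r3 = expected_reg(6)          # the comparison against r6 that failed
status = exit_status(r3)
print(hex(r3), status, oct(status))
# prints: 0x5555aaaa5555aab0 176 0o260
```

So the 176 seen without strace and the 0260 reported by gdb are the same
status, and both point at the r6 comparison.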

New patch sent.

cheers