oops trying to execute sh
John Charles Tyner
jtyner at cs.ucr.edu
Thu Nov 22 06:54:01 EST 2007
I'm trying to boot Linux 2.6.22.9 on an mpc860c rev d4.
When init tries to spawn sh, the kernel oopses during the exec, as shown
below:
## Starting application at 0x00400000 ...
loaded at: 00400000 004EF15C
board data at: 03F9FBC0 03F9FBFC
relocated to: 00404044 00404080
zimage at: 00404E74 004EC662
avail ram: 004F0000 04000000
Linux/PPC load: console=ttyCPM,38400
Uncompressing Linux...done.
Now booting the kernel
Linux version 2.6.22.9 (jtyner at johnnyedge) (gcc version 4.2.1) #113 Wed Nov 21 10:49:36 PST 2007
Zone PFN ranges:
DMA 0 -> 16384
Normal 16384 -> 16384
early_node_map[1] active PFN ranges
0: 0 -> 16384
Built 1 zonelists. Total pages: 16256
Kernel command line: console=ttyCPM,38400
PID hash table entries: 256 (order: 8, 1024 bytes)
Decrementer Frequency = 183750000/60
Console: colour dummy device 80x25
cpm_uart: console: compat mode
Dentry cache hash table entries: 8192 (order: 3, 32768 bytes)
Inode-cache hash table entries: 4096 (order: 2, 16384 bytes)
Memory: 63244k available (880k kernel code, 268k data, 444k init, 0k highmem)
Mount-cache hash table entries: 512
ADDSI: Init
io scheduler noop registered (default)
Serial: CPM driver $Revision: 0.02 $
ttyCPM0 at MMIO 0xc5000a80 (irq = 20) is a CPM UART
mice: PS/2 mouse device common for all mice
Freeing unused kernel memory: 444k init
init started: BusyBox v1.8.0 (2007-11-16 14:24:51 PST)
starting pid 103, tty '': '/bin/sh'
Oops: kernel access of bad area, sig: 11 [#1]
NIP: c0044ed0 LR: c0044ff0 CTR: 00000001
REGS: c3c0bd00 TRAP: 0300 Not tainted (2.6.22.9)
MSR: 00009032 <EE,ME,IR,DR> CR: 30099099 XER: a0008c7f
DAR: ff80103f, DSISR: c0000000
TASK = c0288070[103] 'init' THREAD: c3c0a000
GPR00: c0044ff0 c3c0bdb0 c0288070 ff800fff 00000000 7faf8000 00000000 00000000
GPR08: c01a8f58 c017d91c 00000002 c0179cd0 30099093 1007687c 00000002 c00f8744
GPR16: 00000000 c00f0a64 c011d1ac c00f0aa4 c00f0a90 c0120000 00000001 00000003
GPR24: c3c1ce00 00000000 c0180000 c0247550 00000000 c3c0bdc8 c0179cd0 ff800fff
NIP [c0044ed0] remove_vma+0x14/0x70
LR [c0044ff0] exit_mmap+0xc4/0xf0
Call Trace:
[c3c0bdb0] [c3c0bdc8] 0xc3c0bdc8 (unreliable)
[c3c0bdc0] [c0044ff0] exit_mmap+0xc4/0xf0
[c3c0bdf0] [c000f74c] mmput+0x50/0xd4
[c3c0be00] [c00591f4] flush_old_exec+0x3b8/0x7a8
[c3c0be50] [c0086cc0] load_elf_binary+0x2e8/0x1454
[c3c0bee0] [c005892c] search_binary_handler+0x58/0x12c
[c3c0bf00] [c0059d64] do_execve+0x13c/0x1f0
[c3c0bf20] [c00089b4] sys_execve+0x50/0x90
[c3c0bf40] [c0002a40] ret_from_syscall+0x0/0x38
Instruction dump:
7d808120 38210040 4e800020 83c30000 4bffff18 38a00000 4bffff9c 7c0802a6
9421fff0 bfc10008 90010014 7c7f1b78 <81230040> 83c3000c 2f890000 419e0018
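As a cross-check of the dump above, here is a minimal sketch that decodes the faulting word (the one marked with <> in the instruction dump) as a PowerPC D-form load and recomputes the address it touched. The helper names decode_dform and effective_address are hypothetical, made up for illustration; the field layout assumed is the standard 32-bit D-form (6-bit opcode, 5-bit rD, 5-bit rA, 16-bit signed displacement).

```c
#include <stdint.h>

/* Fields of a PowerPC D-form load/store instruction.
 * Hypothetical helper type, not taken from the kernel sources. */
struct dform {
	unsigned opcode;	/* primary opcode, 32 means lwz */
	unsigned rd;		/* destination register number */
	unsigned ra;		/* base register number */
	int16_t  disp;		/* signed 16-bit displacement */
};

/* Split a 32-bit instruction word into its D-form fields. */
struct dform decode_dform(uint32_t insn)
{
	struct dform d;

	d.opcode = insn >> 26;
	d.rd     = (insn >> 21) & 0x1f;
	d.ra     = (insn >> 16) & 0x1f;
	d.disp   = (int16_t)(insn & 0xffff);
	return d;
}

/* Effective address = base register value + sign-extended displacement,
 * wrapping modulo 2^32. (The rA == 0 special case is ignored here.) */
uint32_t effective_address(uint32_t base, int16_t disp)
{
	return base + (uint32_t)(int32_t)disp;
}
```

Decoding 81230040 this way gives opcode 32 (lwz) with rD = r9, rA = r3 and a displacement of 0x40, and adding 0x40 to the r3 value from the register dump (ff800fff) gives ff80103f, which matches DAR.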
The interesting thing is that r3 holds a bogus pointer (ff800fff; the
faulting address in DAR is that value plus 0x40). While tracing
this problem down, I replaced the remove_vma function with the following:
/*
 * Close a vm structure and free it, returning the next.
 */
static struct vm_area_struct * __attribute__((__noinline__)) __remove_vma(struct vm_area_struct *vma)
{
	struct vm_area_struct *next = vma->vm_next;

	might_sleep();
	if (vma->vm_ops && vma->vm_ops->close)
		vma->vm_ops->close(vma);
	if (vma->vm_file)
		fput(vma->vm_file);
	mpol_free(vma_policy(vma));
	kmem_cache_free(vm_area_cachep, vma);
	return next;
}

static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
{
	asm volatile (
		"lis 4,-128\n"
		"ori 4,4,4095\n"
		"tweq 3,4\n"
		"lwz 5,0(1)\n"
		"tweq 3,4\n"
	);
	return __remove_vma( vma );
}
With this code, the kernel oopses on the *second* tweq instruction:
Kernel BUG at c0045fd4 [verbose debug info unavailable]
Oops: Exception in kernel mode, sig: 5 [#1]
NIP: c0045fd4 LR: c00460a0 CTR: 00000001
REGS: c3c0bd10 TRAP: 0700 Not tainted (2.6.22.9)
MSR: 00029032 <EE,ME,IR,DR> CR: 30099099 XER: a0008c7f
TASK = c0292b40[103] 'init' THREAD: c3c0a000
GPR00: 00000001 c3c0bdc0 c0292b40 ff800fff ff800fff c3c0bdf0 00000000 00000000
GPR08: c0219398 c017d91c 00000002 c0179cd0 30099093 1007687c 00000002 c00f8744
GPR16: 00000000 c00f0a64 c011d1ac c00f0aa4 c00f0a90 c0120000 00000001 00000003
GPR24: c3c32e00 00000000 c0180000 c0247080 00000000 c3c0bdc8 c0179cd0 c017641c
NIP [c0045fd4] remove_vma+0x10/0x18
LR [c00460a0] exit_mmap+0xc4/0xf0
Call Trace:
[c3c0bdc0] [c0046074] exit_mmap+0x98/0xf0 (unreliable)
[c3c0bdf0] [c000f74c] mmput+0x50/0xd4
[c3c0be00] [c005920c] flush_old_exec+0x3b8/0x7a8
[c3c0be50] [c0086cd8] load_elf_binary+0x2e8/0x1454
[c3c0bee0] [c0058944] search_binary_handler+0x58/0x12c
[c3c0bf00] [c0059d7c] do_execve+0x13c/0x1f0
[c3c0bf20] [c00089b4] sys_execve+0x50/0x90
[c3c0bf40] [c0002a40] ret_from_syscall+0x0/0x38
Instruction dump:
7fe4fb78 4800a0ed 80010014 7fc3f378 7c0803a6 bbc10008 38210010 4e800020
3c80ff80 60840fff 7c832008 80a10000 <7c832008> 4bffff7c 7c0802a6 9421ffd0
The memory access through r1 (the lwz between the two tweq instructions)
seems to corrupt r3, and always with the same value. The problem isn't
necessarily here, though. If I modify my remove_vma function to cause and
correct the problem (by saving r3 prior to the memory access and restoring
it afterwards), I just get the same problem in some other part of the code,
and the oops is always caused by a memory access whose base register is
set to ff800fff.
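To spell out what the test sequence in remove_vma checks, here is a minimal host-side sketch that models the three instructions involved. It is only a model under stated assumptions: the function names (ppc_lis, ppc_ori, ppc_tweq_traps, first_trapping_tweq) are hypothetical, and the r3 values before and after the load are passed in as parameters rather than read from hardware.

```c
#include <stdint.h>

/* lis rD,SIMM: place the 16-bit immediate in the upper half, clear the lower half. */
uint32_t ppc_lis(int16_t simm)
{
	return (uint32_t)(uint16_t)simm << 16;
}

/* ori rA,rS,UIMM: OR in a zero-extended 16-bit immediate. */
uint32_t ppc_ori(uint32_t rs, uint16_t uimm)
{
	return rs | uimm;
}

/* tweq rA,rB: trap when the two registers are equal. */
int ppc_tweq_traps(uint32_t ra, uint32_t rb)
{
	return ra == rb;
}

/* Model of the probe: returns 1 if the first tweq would trap, 2 if only
 * the second would (r3 changed across the lwz), 0 if neither would. */
int first_trapping_tweq(uint32_t r3_before_load, uint32_t r3_after_load)
{
	uint32_t r4 = ppc_ori(ppc_lis(-128), 4095);	/* the lis/ori pair above */

	if (ppc_tweq_traps(r3_before_load, r4))
		return 1;
	if (ppc_tweq_traps(r3_after_load, r4))
		return 2;
	return 0;
}
```

The lis/ori pair builds ff800fff in r4, so a trap on the second tweq but not the first means r3 was something else on entry and became ff800fff only after the lwz through r1, which is what the second oops shows (r3 and r4 are both ff800fff in its register dump).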
I applied a recent patch I found that corrects the address returned by
cpm_dpram_addr and its use in cpm_uart_cpm1.h, and I've created my own
platform setup file by copying enough of the mpc866ads setup to get the
console uart (SMC1) to work.
If there is any other information I can or need to provide, let me
know. Any help would be greatly appreciated.
Thanks,
John