ppc64 oops..

Tue Nov 15 16:27:20 EST 2005

On Tue, 15 Nov 2005, Paul Mackerras wrote:
> 
> How much RAM do you have?  That address is in the I/O hole (from 2G to
> 4G).

Hmm. I _thought_ I had just 2GB (possibly 4GB) in this machine, but the 
bootup says

  ...
  [boot]0100 MM Init
  IO Hole assumed to be 80000000 -> ffffffff
  [boot]0100 MM Init Done
  Linux version 2.6.15-rc1-g4060994c (torvalds at g5.osdl.org) (gcc version 4.0.1 200..
  [boot]0012 Setup Arch
  Top of RAM: 0x180000000, Total RAM: 0x100000000
  Memory hole size: 2048MB
  ...
  On node 0 totalpages: 1572864
    DMA zone: 1572864 pages, LIFO batch:64
    DMA32 zone: 0 pages, LIFO batch:2
    Normal zone: 0 pages, LIFO batch:2
    HighMem zone: 0 pages, LIFO batch:2

(I'm now running a newer kernel that has a DMA32 zone, I wasn't running 
that when the oops happened).

Which looks like it thinks I have 6GB. That's what "free" thinks too. 
Cool. I just got 4GB extra memory without even opening the machine!

Magic kernel.
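
The 6GB figure can be reproduced from the boot log with a little arithmetic. The sketch below assumes 4KB pages, which the log does not state but which is consistent with the "Top of RAM" value. The helper name and constants are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: convert a page count to bytes.
 * page_size is an assumption; the boot log does not print it. */
static uint64_t pages_to_bytes(uint64_t pages, uint64_t page_size)
{
        return pages * page_size;
}

/* Values copied from the boot log above. */
static const uint64_t TOTALPAGES = 1572864;        /* "On node 0 totalpages" */
static const uint64_t TOP_OF_RAM = 0x180000000ULL; /* "Top of RAM" */
static const uint64_t TOTAL_RAM  = 0x100000000ULL; /* "Total RAM" */
```

With a 4096-byte page, pages_to_bytes(TOTALPAGES, 4096) equals TOP_OF_RAM (6GB), not TOTAL_RAM (4GB). The difference between the two is exactly the 2048MB reported as the memory hole size.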

And I just found out how I can instantly crash the kernel again:

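	#include <stdlib.h>
	#include <string.h>
	#include <unistd.h>

	int main(int argc, char **argv)
	{
	        char * buf = malloc(1024*1024*1024);
	        memset(buf, 0, 1024*1024*1024);	/* touch every page of the 1GB */
	        sleep(100);
	        return 0;
	}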

I run two of those programs, and on the second one I get an oops again:

  Unable to handle kernel paging request for data at address 0xc0000000ff000000
  Faulting instruction address: 0xc000000000030800
  Oops: Kernel access of bad area, sig: 11 [#1]
  SMP NR_CPUS=2 NUMA POWERMAC
  Modules linked in: autofs
  NIP: C000000000030800 LR: C0000000000971F0 CTR: 0000000000000020
  REGS: c0000001023a38d0 TRAP: 0300   Not tainted  (2.6.15-rc1-g4060994c)
  MSR: 9000000000009032 <EE,ME,IR,DR>  CR: 88000448  XER: 00000000
  DAR: C0000000FF000000, DSISR: 0000000042010000
  TASK = c00000015af957c0[19554] 'a.out' THREAD: c0000001023a0000 CPU: 1
  GPR00: 0000000000000080 C0000001023A3B50 C0000000006C8EF0 C0000000FF000000
  GPR04: 00000000BADB9000 C0000000040E6000 C0000000005B6C00 9000000000009032
  GPR08: C00000017BFB8A00 C0000000006CAD30 C0000000006CDCA0 0000000000000020
  GPR12: 0000000088000442 C0000000005B6C00 0000000000000000 000000001016D918
  GPR16: 00000000100D0000 0000000000000000 00000000100D0000 0000000000000000
  GPR20: C00000007EC566B0 C00000017BC13980 C00000016F836590 00000000BADB9000
  GPR24: 0000000002000000 0000000000000000 0000000000000DC8 C0000000040E6000
  GPR28: C000000006D08DC8 C000000006D08000 C0000000005D2EB8 0000000000000000
  NIP [C000000000030800] .clear_user_page+0x10/0x60
  LR [C0000000000971F0] .__handle_mm_fault+0xda0/0xf10
  Call Trace:
  [C0000001023A3B50] [C000000000097184] .__handle_mm_fault+0xd34/0xf10 (unreliable)
  [C0000001023A3C60] [C000000000496D3C] .do_page_fault+0x4ec/0x7f0
  [C0000001023A3E30] [C000000000004760] .handle_page_fault+0x20/0x54
  Instruction dump:
  4d820020 7c0018a8 7c004878 7c0019ad 40c2fff4 4e800020 60000000 60000000
  e922a810 8169000c 80090004 7d6903a6 <7c001fec> 7c630214 4320fff8 e922a808

ie it seems to have set up the mem_map[] to point all the way down from 
6GB to 0, and then when I've used up the two high GB of memory (the _real_ 
memory in this machine) it starts allocating memory that it doesn't have, 
and that it doesn't have TLB mappings for.
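
As a rough check of the faulting address, the sketch below strips the kernel linear-mapping offset from the DAR value and tests the result against the I/O hole printed at boot. The offset 0xc000000000000000 is an assumption here (it is not printed in the log), and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the kernel linear mapping starts at this virtual address. */
#define PAGE_OFFSET_ASSUMED 0xc000000000000000ULL

/* From the boot log: "IO Hole assumed to be 80000000 -> ffffffff". */
#define IO_HOLE_START 0x80000000ULL
#define IO_HOLE_END   0xffffffffULL

/* Hypothetical helper: linear-mapped kernel address to physical address. */
static uint64_t linear_to_phys(uint64_t vaddr)
{
        return vaddr - PAGE_OFFSET_ASSUMED;
}

/* Hypothetical helper: true if paddr lies inside the assumed I/O hole. */
static bool in_io_hole(uint64_t paddr)
{
        return paddr >= IO_HOLE_START && paddr <= IO_HOLE_END;
}
```

Under that assumption, linear_to_phys(0xc0000000ff000000) is 0xff000000, and in_io_hole() returns true for it, which matches Paul's remark that the address is in the hole from 2G to 4G.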

> > (There are other reports of VM-induced problems on -rc1, this is probably 
> > not ppc64-related).
> 
> Looks that way to me...

No, looks like a ppc64 memory setup bug, although it's quite possibly 
brought on by the PageReserved() removal in the VM layer. 

Andrew, Nick, Hugh, I really think that removing that "PageReserved()" 
test from the page freeing functions was a mistake. I think I'm going to 
add it back in.

I bet this happens on all the other architectures too. The bootup has 
marked pages reserved, and then frees them all. It used to be that the VM 
just silently skipped the reserved pages, now it will add them to the free 
lists..
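
The behavioural difference described above can be illustrated with a toy model. This is not kernel code: the struct, the function and the skip_reserved switch are all invented for the sketch, and it only mimics the idea of a free path that either honours or ignores a "reserved" flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a page descriptor (hypothetical, not struct page). */
struct toy_page {
        bool reserved;      /* marked reserved during bootup */
        bool on_free_list;  /* handed to the allocator */
};

/* Toy free path: put pages on the free list and return how many were added.
 * skip_reserved == true  models the old behaviour (reserved pages skipped).
 * skip_reserved == false models the behaviour without the test. */
static size_t toy_free_all(struct toy_page *pages, size_t n, bool skip_reserved)
{
        size_t freed = 0;

        for (size_t i = 0; i < n; i++) {
                if (skip_reserved && pages[i].reserved)
                        continue;
                pages[i].on_free_list = true;
                freed++;
        }
        return freed;
}
```

With two reserved pages out of four, the skipping variant frees two pages and leaves the reserved ones alone, while the non-skipping variant puts all four on the free list, including pages that may have no memory behind them.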

			Linus