[RFC] Old paca being written by firmware after kexec

David Gibson david at gibson.dropbear.id.au
Fri Oct 7 13:39:39 EST 2005


On Fri, Oct 07, 2005 at 12:03:30PM +1000, Michael Ellerman wrote:
> Hi,
> 
> I'm seeing a bug of sorts when I kexec some kernels. It's exhibiting as page 
> structs being corrupted early during boot:
> 
> freeing bootmem node 0
> Bad page state at __free_pages_ok (in process 'swapper', page c0000000005a7408)
> flags:0x0000000000020004 mapping:0000000000000000 mapcount:0 count:0
> Backtrace:
> Call Trace:
> [c00000000257bc00] [c000000002089c54] .bad_page+0x90/0xec (unreliable)
> [c00000000257bc80] [c00000000208a26c] .__free_pages_ok+0x170/0x174
> [c00000000257bd50] [c0000000025385e8] .free_all_bootmem_core+0x3e8/0x404
> [c00000000257be30] [c000000002534188] .mem_init+0xe0/0x1d8
> [c00000000257bed0] [c00000000252082c] .start_kernel+0x1f8/0x328
> [c00000000257bf90] [c000000002008684] .hmt_init+0x0/0x7c
> Trying to fix it up, but a reboot is needed
> 
> What's happening is that firmware is writing into the lppaca of the old 
> kernel, which is now being used by the second kernel for page structs.
> 
> It seems to be writing into the word starting at paca[x].lppaca.reserved2, 
> which I guess it's allowed to do seeing as it's reserved.
> 
> For kdump this isn't an issue, as the second kernel doesn't reuse the first 
> kernel's memory.
> 
> But for regular kexec it can be a problem. I think we're getting away with it 
> most of the time because a) if you kexec the same kernel then the paca will 
> land in the same spot, b) it only seems to write into those locations for a 
> short while early during boot (presumably until we've set up pointers to the 
> new paca?).
> 
> The only solution I can see is to always allocate the paca in the
> same place.  So that a kexec from one kernel to another always
> results in the paca landing in the same spot.

Heh.. maybe a reason to write iBoot.

Hrm.  I think we should probably split the paca and lppaca in this
case; that way only the lppaca needs to live at a fixed address, at
least.
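
To make the split concrete, here is a rough userspace sketch of what I
have in mind.  It is not the real kernel layout: the struct names,
field list, sizes and NR_CPUS_SKETCH are all made up for illustration.
The only point is that the paca holds a pointer to the lppaca instead
of embedding it, so the firmware-visible part can sit in one small
table while the rest of the paca goes wherever the kernel likes.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4   /* hypothetical CPU count, for illustration only */

/* Hypothetical, heavily trimmed stand-in for the firmware-visible area. */
struct lppaca_sketch {
    uint32_t desc;
    uint32_t reserved2;    /* the word the report above saw being written */
    uint8_t  pad[120];     /* pad the entry to 128 bytes (assumed size) */
};

/* Kernel-private per-cpu data: holds a pointer instead of embedding. */
struct paca_sketch {
    struct lppaca_sketch *lppaca_ptr;
    uint16_t cpu_index;
    uint64_t kernel_private[8];
};

/* In a real kernel this table would be pinned to one address (for example
 * by the linker script); here it is just an ordinary static array. */
struct lppaca_sketch lppaca_table[NR_CPUS_SKETCH];

/* Free to move between kernels, since firmware is never told about it. */
struct paca_sketch paca_table[NR_CPUS_SKETCH];

/* Returns nonzero if addr falls inside the lppaca table. */
int addr_in_lppaca_table(const void *addr)
{
    uintptr_t a = (uintptr_t)addr;
    uintptr_t start = (uintptr_t)lppaca_table;

    return a >= start && a < start + sizeof(lppaca_table);
}

/* Wire each paca entry to its lppaca entry. */
void paca_sketch_init(void)
{
    for (int i = 0; i < NR_CPUS_SKETCH; i++) {
        paca_table[i].cpu_index = (uint16_t)i;
        paca_table[i].lppaca_ptr = &lppaca_table[i];
        assert(addr_in_lppaca_table(paca_table[i].lppaca_ptr));
    }
}
```

With this shape, a stray write through an old lppaca pointer can only
land inside lppaca_table, so that table is the only thing two kernels
would have to agree on the location of.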

How does the hypervisor find the paca address again?

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/people/dgibson