[Linuxppc-dev] boot code rework

Peter Bergner bergner at vnet.ibm.com
Wed Jun 12 06:04:12 EST 2002


Anton Blanchard wrote:

> I've been seeing a lot of kernels fail to boot on large S85 and p690
> machines lately, so Paul and I sat down today to look at the potential
> problems with early boot. There are a few things:
>
> 1. Must not touch OF at 12MB until we are done with it.
> 2. Must not touch the RTAS region at any time


Yup.



> There were a number of problem scenarios:
>
> 1. The RTAS region can get quite large; on a p690 it was over 5MB.
> Since we were allocating the RTAS region directly above the kernel, it
> was possible to overlap the OF region. If firmware zeroes this
> region then it's very likely we will die right there. (We should
> really be using claim here; it would have picked this bug up.)


Yes, this does seem to be a problem if RTAS can indeed get that large!
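
To make the failure concrete, here is a rough sketch of the overlap test
we are effectively missing.  The 12MB OF base and the "over 5MB" RTAS size
come from this thread; the helper name, the kernel end address and the OF
size used in the example are made up for illustration and are not taken
from the real boot code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MB (1024ULL * 1024ULL)

/* From the thread: OF sits at 12MB until we are done with it. */
#define OF_BASE (12 * MB)

/*
 * Hypothetical helper: true if [a_base, a_base + a_size) and
 * [b_base, b_base + b_size) share at least one byte.
 */
bool ranges_overlap(uint64_t a_base, uint64_t a_size,
                    uint64_t b_base, uint64_t b_size)
{
    assert(a_size > 0 && b_size > 0);
    return a_base < b_base + b_size && b_base < a_base + a_size;
}
```

With an assumed kernel end of 8MB and an assumed 4MB OF region,
ranges_overlap(8 * MB, 5 * MB, OF_BASE, 4 * MB) is true: a 5MB RTAS
placed directly above the kernel runs 1MB into OF.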


> 2. With a very large RTAS region any memory allocated with the klimit
> hack above the RTAS region could end up inside the RTAS region once
> we relocate!


This should _not_ happen... or at least it's not supposed to happen! :)
The kernel memory used to hold RTAS is reserved with a klimit allocation
just like the other early boot time allocations.  However, it is instantiated
at the physical address where it is supposed to end up after the kernel
is relocated.  I took great pains to make sure we didn't copy over the
RTAS region while relocating the kernel, and you'll see we skip relocating
the RTAS region since it's already located where it's supposed to live.




> For 2.4 I suggest we allocate the RTAS region above OF, this will
> hopefully fix the boot problems I've been seeing.
>
> I experimented today with a larger change in 2.5:
>
> 1. link the kernel at 0xC....4000
> 2. store the exception vectors in an __init section
> 3. load the zImage wrapper somewhere high (8MB at this stage)
> 4. load the kernel at 0x4000 physical
> 5. when finished with OF, copy the linux exception vectors in


Reworking this code would definitely be a good thing!!!
Death to RELOC!!!!

Only problem I see is that the zImage can be quite large if
we have an attached ramdisk/initrd.  We might want to think
about linking it above OF too.


> Since we jump out of all the exception vectors via an rfid, we don't
> have to do anything at all to make this work. The really nice thing
> is that we can take advantage of the fact that real mode ignores the
> top two bits.
>
> This means RELOC is no more!


Hooray!
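
For anyone following along, the "top two bits" trick can be modelled in a
few lines of C.  This is only a model of the address arithmetic, not kernel
code: the function names are invented, and the sample address is an
illustrative value of the 0xC....4000 form (the link address is not spelled
out in full above).

```c
#include <assert.h>
#include <stdint.h>

/* Mask that clears the top two bits of a 64-bit effective address. */
#define REAL_MODE_MASK (~(3ULL << 62))

/* Model of what real mode does with an address: drop the top two bits. */
uint64_t real_mode_addr(uint64_t ea)
{
    return ea & REAL_MODE_MASK;
}

/* Sanity check using an illustrative link address. */
void real_mode_selfcheck(void)
{
    uint64_t linked = 0xC000000000004000ULL; /* illustrative value only */

    /* The linked address and the physical load address name the same byte. */
    assert(real_mode_addr(linked) == 0x4000ULL);
    assert(real_mode_addr(0x4000ULL) == 0x4000ULL);
}
```

So code linked at the high address but loaded at 0x4000 physical sees the
same location either way in real mode, which is why no RELOC() adjustment
is needed.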




> In fact I've just gone through and shortened the pSeries boot sequence by a
> significant amount. There is no need to do any of the copy_down or relocate
> toc/naca etc two or more times. We should be able to simplify the iSeries
> boot too since there will be no need to do the final paca/naca/r2 etc
> relocation in the common code. It's booting on my 270; I'll clean it up
> a bit more before posting the patch.


Once this gets tested well, we're going to want to backport this to 2.4.X
as the distros will no doubt want/need this for installs on _large_ systems.



> I'm interested to know how we can handle yaboot. Will it load the zImage
> correctly? Can we get it to load the vmlinux at physical 0x4000?


I can make yaboot do whatever we need it to do! :)  Yaboot just loads the
zImage LOAD segments where the zImage program header tells it to.  If you
want it to load somewhere else, we just need to update arch/ppc64/boot/
files to put it somewhere else.  No changes to Yaboot should be required.
Since the zImage wrapper is tightly connected to the kernel, we can make
any changes we want to it and not have to worry about backward compatibility.
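
Roughly, "loading the LOAD segments where the program header says" looks
like the sketch below.  This is not yaboot's code.  It assumes an ELF64
image already in memory in host byte order, uses the standard <elf.h>
structures, treats a plain buffer as physical memory starting at 0, and
picks p_paddr as the destination; the function name is made up.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy every PT_LOAD segment of an in-memory ELF64 image to the address
 * its program header asks for.  "mem" stands in for physical memory
 * starting at address 0.  Returns the number of segments loaded, or -1
 * if a segment does not fit in "mem".
 */
int load_segments(const uint8_t *image, uint8_t *mem, uint64_t mem_size)
{
    Elf64_Ehdr ehdr;
    int loaded = 0;

    assert(image != NULL && mem != NULL);
    memcpy(&ehdr, image, sizeof(ehdr));

    for (unsigned i = 0; i < ehdr.e_phnum; i++) {
        Elf64_Phdr ph;

        memcpy(&ph, image + ehdr.e_phoff + (uint64_t)i * ehdr.e_phentsize,
               sizeof(ph));
        if (ph.p_type != PT_LOAD)
            continue;               /* only LOAD segments get placed */
        if (ph.p_paddr > mem_size || ph.p_memsz > mem_size - ph.p_paddr ||
            ph.p_filesz > ph.p_memsz)
            return -1;

        /* File-backed part first, then zero the rest (e.g. bss). */
        memcpy(mem + ph.p_paddr, image + ph.p_offset, ph.p_filesz);
        memset(mem + ph.p_paddr + ph.p_filesz, 0, ph.p_memsz - ph.p_filesz);
        loaded++;
    }
    return loaded;
}
```

The point is that the destination comes entirely from the program header,
so moving the zImage is a matter of changing how it is linked, not of
changing the loader.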

The plan is to also move to requiring booting a zImage rather than a vmlinux,
but we haven't done this yet...  I'm not sure whether you forward-ported
our latest 2.4.X zImage changes, but that would be good to have in 2.5.X.
I'm including the entire vmlinux in its own ELF section so it can now be
extracted with objcopy.  Before, we stripped the ELF header so you couldn't
objdump the extracted vmlinux.


Peter


** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/
