[patch 30/33] PS3: Bootwrapper support.
Geoff Levand
geoffrey.levand at am.sony.com
Tue Jun 19 08:47:38 EST 2007
Milton Miller wrote:
> On Sat Jun 16 08:07:06 EST 2007, Geoff Levand wrote:
>> --- a/arch/powerpc/boot/Makefile
>> +++ b/arch/powerpc/boot/Makefile
>> @@ -41,11 +41,13 @@ zliblinuxheader := zlib.h zconf.h zutil.
>> $(addprefix $(obj)/,$(zlib) gunzip_util.o main.o): \
>> $(addprefix $(obj)/,$(zliblinuxheader)) $(addprefix
>> $(obj)/,$(zlibheader))
>>
>> +src-plat-$(CONFIG_PPC_PS3) += ps3-head.S ps3-hvcall.S ps3.c
>> +
>> src-wlib := string.S crt0.S stdio.c main.c flatdevtree.c
>> flatdevtree_misc.c \
>> ns16550.c serial.c simple_alloc.c div64.S util.S \
>> gunzip_util.c elf_util.c $(zlib) devtree.c \
>> 44x.c ebony.c mv64x60.c mpsc.c mv64x60_i2c.c
>> -src-plat := of.c cuboot-83xx.c cuboot-85xx.c holly.c \
>> +src-plat := $(src-plat-y) of.c cuboot-83xx.c cuboot-85xx.c holly.c \
>> cuboot-ebony.c treeboot-ebony.c prpmc2800.c
>> src-boot := $(src-wlib) $(src-plat) empty.c
>
> The policy so far has been code for all platforms is compiled for all
> invocations of make in the boot directory. Therefore just add your
> files directly to src-plat. If needed, add a flags override to set the
> cpu type.
OK, I can do that. I don't think it is a good way, but I will do it.
>> -$(obj)/zImage.ps3: vmlinux
>> - $(STRIP) -s -R .comment $< -o $@
>> +$(obj)/zImage.ps3: vmlinux $(wrapper) $(wrapperbits)
>> $(srctree)/$(src)/dts/ps3.dts
>> + $(STRIP) -s -R .comment $< -o vmlinux.strip
>> + $(call cmd,wrap,ps3,$(srctree)/$(src)/dts/ps3.dts,,)
>>
>> -$(obj)/zImage.initrd.ps3: vmlinux
>> - @echo " WARNING zImage.initrd.ps3 not supported (yet)"
>> +$(obj)/zImage.initrd.ps3: vmlinux $(wrapper) $(wrapperbits)
>> $(srctree)/$(src)/dts/ps3.dts $(obj)/ramdisk.image.gz
>> + $(call
>> cmd,wrap,ps3,$(srctree)/$(src)/dts/ps3.dts,,$(obj)/ramdisk.image.gz)
>
> The separate rule here is to pick up ps3.dts, correct?
>
> After Marc's 3 patch series this would be handled by the config file.
Yes, these rules will need to be re-worked, but the current form
will continue to work with those changes.
>> + * __system_reset_overlay - The PS3 first stage entry.
>> + *
>> + * The bootwraper build script copies the 0x100 bytes at symbol
>> + * __system_reset_overlay to offset 0x100 of the rom image.
>> + */
>> +
>> + .globl __system_reset_overlay
>> +__system_reset_overlay:
>> +
>> + /* Switch to 32-bit mode. */
>> +
>> + mfmsr r9
>> + clrldi r9,r9,1
>> + mtmsrd r9
>> + nop
>> +
>> + /* Get thread number in r3 and branch. */
>> +
>> + mfspr r3, 0x88
>> + cntlzw. r3, r3
>> + li r4, 0
>> + li r5, 0
>> + beq 1f
>> +
>> + /* Secondary goes to __secondary_hold in kernel. */
>> +
>> + li r4, 0x60
>> + mtctr r4
>> + bctr
>> +
>> + /* Primary waits for __secondary_hold_acknowledge. */
>> +1:
>> + li r5, 0x10 /* __secondary_hold_acknowledge */
>> + or 28, 28, 28 /* db8cyc */
>> +
>> + ld r4, 0(r5)
>> + cmpdi r4, 0
>> + beq 1b
>
> (1) The address of __secondary_hold_acknowledge is not stable. It has
> changed in the past and may change in the future. (2) This works
> because you always have exactly 2 cpus, and your secondary cpu is not
> id 0.
Yes, it is simplified and PS3 specific.
> Because this is part of the kernel source tree, I'm not opposed to this
> usage, but I think some checking would be in order. A comment that
> this only works for two cpus with the slave id not being 0, and some
> check that nm on the kernel image results in the expected symbol and
> value.
I can make it work without using __secondary_hold_acknowledge by just
using a delay, but it is not as precise.
> Since I'm here, I'll mention that the li to r5 doesn't need to be in
> the loop. In fact, the load to r4 could be expressed as 0x10(0) or
> better create a symbol and use secondary_ack at l(0) eliminating the need
> for r5 and easing the above check.
Seems like a cleaner way, but I don't think it is a good idea to use
__secondary_hold_acknowledge if it is not well known, so I will avoid
its use.
>> +
>> + /* Primary goes to _zimage_start in wrapper. */
>> +
>> + lis r4, _zimage_start at ha
>> + addi r4, r4, _zimage_start at l
>> + mtctr r4
>> + bctr
>> +
>> +/*
>> + * __system_reset_kernel - Place holder for the kernel reset vector.
>> + *
>> + * The bootwrapper build script copies 0x100 bytes from offset 0x100
>> + * of the rom image to the symbol __system_reset_kernel. At runtime
>> + * the bootwrapper program copies the 0x100 bytes at
>> __system_reset_kernel
>> + * to ram address 0x100. This symbol must occupy 0x100 bytes.
>> + */
>> +
>> + .globl __system_reset_kernel
>> +__system_reset_kernel:
>> +
>> + . = __system_reset_kernel + 0x100
>
> Not bad. Since this code doesn't have to be at any particular
> location when it is built (you have no .org or similar, its all PIC), I
> don't see anything that couldn't be done from an asm block in the
> platform .c file, like the entry point in prpmc2800.c file; that would
> save a file in the boot directory.
For me that would be a significant effort, and I don't think I can
consolidate them for 2.6.23. I will consider it for 2.6.24.
>> +BSS_STACK(4096);
>> +
>> +/* A buffer that may be edited by tools operating on a zImage binary
>> so as to
>> + * edit the command line passed to vmlinux (by setting
>> /chosen/bootargs).
>> + * The buffer is put in it's own section so that tools may locate it
>> easier.
>> + */
>> +static char cmdline[COMMAND_LINE_SIZE]
>> + __attribute__((__section__("__builtin_cmdline")));
>> +
>> +static void prep_cmdline(void *chosen)
>> +{
>> + if (cmdline[0] == '\0')
>> + getprop(chosen, "bootargs", cmdline,
>> COMMAND_LINE_SIZE-1);
>> + else
>> + setprop_str(chosen, "bootargs", cmdline);
>> +
>> + printf("cmdline: '%s'\n", cmdline);
>> +}
>> +
>> +static void ps3_console_write(const char *buf, int len)
>> +{
>> +}
>
> I'm guessing your testing was done with some non-empty function that
> you are not prepared to release. Otherwise you would not have added
> all the printf code. I'm not complaining, just noting for my
> observations below.
Someone might be interested in making ps3_console_write() save the
messages to memory and print them later, so I thought it wold be
good to leave the printf's in.
>> +
>> +static void ps3_exit(void)
>> +{
>> + printf("ps3_exit\n");
>> + lv1_panic(0); /* zero = no reboot */
>> + while (1);
>> +}
>
> Does the hypervisor give some kind of indication that the code paniced?
> A comment telling us users what to expect might be nice.
The lpar will be shutdown. Non-zero will make it reboot. That is the
documented behavior. If I get a better idea of what it will do I will
put in a better comment.
>> +void platform_init(void)
>> +{
>> + extern char _end[];
>> + extern char _dtb_start[];
>> + extern char _initrd_start[];
>> + extern char _initrd_end[];
>> + const u32 heapsize = 0x1000000 - (u32)_end; /* 16MiB */
>
> Ok your heap is from (wrapper) _end to 16MiB.
>
>> + void *chosen;
>> + unsigned long ft_addr;
>> + u64 rm_size;
>> +
>> + console_ops.write = ps3_console_write;
>> + platform_ops.exit = ps3_exit;
>> +
>> + printf("\n-- PS3 bootwrapper --\n");
>> +
>> + simple_alloc_init(_end, heapsize, 32, 64);
>> + ft_init(_dtb_start, 0, 4);
>
> Ok this is the end of platform_init for most platforms.
>
>> +
>> + chosen = finddevice("/chosen");
>> +
>> + ps3_repository_read_rm_size(&rm_size);
>> + dt_fixup_memory(0, rm_size);
>
> this is dt_fixups.
>
>> +
>> + if (_initrd_end > _initrd_start) {
>> + setprop_val(chosen, "linux,initrd-start",
>> (u32)(_initrd_start));
>> + setprop_val(chosen, "linux,initrd-end",
>> (u32)(_initrd_end));
>> + }
>
> This is part of prep_initrd.
>
>> +
>> + prep_cmdline(chosen);
>> +
>> + ft_addr = dt_ops.finalize();
>> +
>> + ps3_copy_vectors();
>> +
>> + printf(" flat tree at 0x%lx\n\r", ft_addr);
>> +
>> + ((kernel_entry_t)0)(ft_addr, 0, NULL);
>> +
>> + ps3_exit();
>> +}
>
> Based on past comments, I belive that you are not calling the normal
> _start() specifically to avoid dragging in zlib. Oh, but you call
> prep_cmdline, which means you still get main.c and it drags zlib. I
> guess we should move prep_initrd and prep_cmdline. Were you just
> skipping prep_kernel?
I was hoping to eventually get rid of most of the unused stuff that
is now linked in, but for now I just need something working.
> This code and the concept in your .lds file looks like it could be used
> by other platforms in a generic way
> (except for the call to ps3_exit which should be exit, and the call to
> ps3_copy_vectors, which can be moved earlier. While its likely similar
> functionality would be needed, the details of what to patch and copy
> may vary).
I did try to make it generic in my initial RFC's but it wasn't well
received, so I just went with a platform specific approach and tried
to simplify it.
>> --- /dev/null
>> +++ b/arch/powerpc/boot/zImage.ps3.lds.S
>> @@ -0,0 +1,50 @@
>> +OUTPUT_ARCH(powerpc:common)
>> +ENTRY(_zimage_start)
>> +EXTERN(_zimage_start)
>> +SECTIONS
>> +{
>> + _vmlinux_start = .;
>> + .kernel:vmlinux.bin : { *(.kernel:vmlinux.bin) }
>> + _vmlinux_end = .;
>> +
>> + . = ALIGN(8);
>> + _dtb_start = .;
>> + .kernel:dtb : { *(.kernel:dtb) }
>> + _dtb_end = .;
>> +
>> + . = ALIGN(4096);
>> + _initrd_start = .;
>> + .kernel:initrd : { *(.kernel:initrd) }
>> + _initrd_end = .;
>> +
>> + _start = .;
>> + .text :
>> + {
>> + *(.text)
>> + *(.fixup)
>> + }
>> + _etext = .;
>> + . = ALIGN(4096);
>> + .data :
>> + {
>> + *(.rodata*)
>> + *(.data*)
>> + *(.sdata*)
>> + __got2_start = .;
>> + *(.got2)
>> + __got2_end = .;
>> + }
>> +
>> + . = ALIGN(4096);
>> + _edata = .;
>> +
>> + . = ALIGN(4096);
>> + __bss_start = .;
>> + .bss :
>> + {
>> + *(.sbss)
>> + *(.bss)
>> + }
>> + . = ALIGN(4096);
>> + _end = . ;
>> +}
>
> You don't seem to relocate the initrd. Your lds file has the initrd on
> the 4k page boundary after the dtb, which is after the kernel as copied
> by objcopy -O binary. As far as I know, objcopy only copies the load
> sections; it doesn't create space in the output binary image for alloc
> only sections. What prevents the initrd from being in the kernel bss,
> which it clears before even looking at the device tree? For that
> matter, what prevents the device tree from being allocated in this bss
> range?
Yes, a point I missed. I think it is easier to just make objcopy output
the bss, so that is what I'll do for now.
objflags="-O binary --set-section-flags=.bss=contents,alloc,load,readonly,data"
I'll consider doing a relocation later.
> I'm guessing that you tested with some additional code to display the
> console output, which made _end be above the kernel _end, or close
> enough that the finalized bss was above the kernel _end? Did you test
> an initrd? I guess an attached initrd would also solve the problem of
> the final dtb being overwritten, but not the start of the initrd.
No, I did not test initrd, so it is still todo.
> I think your approach to the image layout has promise but still needs
> some refinement. To solve the problem of where is the kernel end, I
> would create a section to hold the kernel elf header and use the
> elf_parse routines that are now conveniently in a separate file. The
> wrapper script would dd off the header and then it would be linked into
> the wrapper like the other sections. At that point the initrd could be
> moved before creating the malloc pool above it.
OK, I'll consider it.
> This approach can be used whenever there is something capable of
> installing an image at address 0 in ram before starting its execution,
> and is useful when the time to install that image is less than the time
> to decompress the kernel (and copy either it or the zImage, for the
> case where the entry point is fixed in low memory). I can think of
> other platforms where this holds true. In your case, this is true
> because the hypervisor decompresses the image.
>
> What should this type of image be called? Its not really a zImage,
> since there is nothing compressed (well, maybe the initrd/initramfs).
> uImage is already taken (u would have stood for uncompressed).
> Something like wrapped image? flat image? combined image? We'd still
> do make zImage to invoke make for it.
At this point I just want to get my platform code included so distros
can start to use it. I don't want to turn this patch into an effort to
provide a new generic kernel infrastructure. That should be a different
effort, then when that is ready we can switch this code over.
> One more point. You indicated that this image had a standard entry
> point. Were you thinking it will be used for kexec? Or were you just
> trying to state you used the library _zimage_start?
I was thinking kexec. My initial work was to add zImage support to
kexec, but I dropped that to focus on getting the bootwrapper and
kexec of vmlinux working. I have no plan to restart that work
so will drop that part of the comment.
I'll send out an updated patch.
-Geoff
More information about the Linuxppc-dev
mailing list