2.6.15 failure on power4 iSeries

will schmidt will_schmidt at vnet.ibm.com
Thu Jan 19 09:07:43 EST 2006


will schmidt wrote:
> Michael Ellerman wrote:
> 
>>On Thu, 19 Jan 2006 01:38, will schmidt wrote:
>>
>>
>>>Michael Ellerman wrote:
>>>
>>>
>>>>On Wed, 18 Jan 2006 01:42, will schmidt wrote:
>>>>
>>>>
>>>>>Michael Ellerman wrote:
>>>>>
>>>>>
>>>>>>On Tue, 17 Jan 2006 10:35, will schmidt wrote:
>>>>>>
>>>>>>
>>>>>>>attempting to boot current kernels on a power4 iSeries doesn't work.
>>>>>>>have tried both the powerpc-git tree and the torvalds-git tree.
>>>>>>>
>>>>>>>OS/400 RefCode is "C200 82FF".  (which means nothing to me :-)
>>>>>>
>>>>>>C20082FF VSP IPL complete successfully
>>>>>>
>>>>>>
>>>>>>
>>>>>>>no console output at all.
>>>>>
>>>>>This looks like the "2.6.15-mm4 failure on power5" output, but I don't
>>>>>see a *cpuc-to-mutexes.patch in this tree to back out.  (torvalds-git)
>>>>>
>>>>>Will clean my glasses and look closer in a bit.. :-)
>>>>
>>>>Yeah, looks similar. I can't reproduce that crash on my POWER5 box here
>>>>though, so I'm not sure if that patch is actually the problem. Might be
>>>>worth git bisecting.
>>>
>>>I already git bisected to get it narrowed down to that one patch.   Or are
>>>you saying that the patch is broken up into more parts in the powerpc-git
>>>tree?
>>>
>>>Same tree builds and boots OK on power5 partition here too..   this seems
>>>to be something unique to power4 iSeries.
>>
>>
>>Sorry, getting the two bugs confused. I'm not sure which of the code in 
>>question is in mm vs Linus' git. It might be worth trying one of Ingo's 
>>patches for the other bug though, just in case.
>>
>>
>>
>>>>You could try adding calls to udbg_printf() in start_kernel() to see if
>>>>we're getting in there.
>>
>>
>>Any luck with this?
> 
> 
> yup, just added some more debug..
> 
> Looks like setup_arch() calls into do_init_bootmem(), which loops around a reserve_bootmem() call. The last call into reserve_bootmem() isn't returning.
> 
> debug code:
>         DBG("-> do_init_bootmem %d\n", __LINE__);
> 
>         DBG("-> do_init_bootmem lmb.reserved.cnt %d\n", lmb.reserved.cnt);
>         /* reserve the sections we're already using */
>         for (i = 0; i < lmb.reserved.cnt; i++) {
>                 DBG("-> reserve_bootmem ( %lx %lx %d \n",
>                     lmb.reserved.region[i].base,
>                     lmb_size_bytes(&lmb.reserved, i), i);
>                 reserve_bootmem(lmb.reserved.region[i].base,
>                                 lmb_size_bytes(&lmb.reserved, i));
>                 DBG("<- reserve_bootmem \n");
>         }
>         DBG("-> do_init_bootmem %d\n", __LINE__);
> 
> 
> console output:
> -> do_init_bootmem 285
> -> do_init_bootmem lmb.reserved.cnt 5
> -> reserve_bootmem ( 0 500e80 0
> <- reserve_bootmem
> -> reserve_bootmem ( ffe5000 1b000 1
> <- reserve_bootmem
> -> reserve_bootmem ( 3c7f7000 8000 2
> <- reserve_bootmem
> -> reserve_bootmem ( 3c7ff668 994 3
> <- reserve_bootmem
> -> reserve_bootmem ( 3c800000 0 4
> 
> looks like lmb_size_bytes is returning a zero for that last lmb..
> 

I have been trying to narrow down where the zero is coming from. Some additional debug output:

->  lmb_alloc_base 0x1000 0x80 0x10000000

  ->  lmb_add_region (4 base:0x0 size:0x0

  <-  lmb_add_region (4 base:0x0 size:0x0

  ->  lmb_alloc_base 0x1000 0x80 0x10000000

  ->  lmb_add_region (4 base:0x0 size:0x0

  <-  lmb_add_region (4 base:0x0 size:0x0

  ->  lmb_alloc_base 0x1000 0x80 0x10000000

  ->  lmb_add_region (4 base:0x0 size:0x0

  <-  lmb_add_region (4 base:0x0 size:0x0
-> setup_arch 595
-> setup_arch 598
-> do_init_bootmem 248
-> do_init_bootmem 260
-> do_init_bootmem 263

  ->  lmb_alloc_base 0x8000 0x1000 0x0

  ->  lmb_add_region (4 base:0x0 size:0x0

  <-  lmb_add_region (5 base:0x0 size:0x0
-> do_init_bootmem 269
-> do_init_bootmem 285
-> do_init_bootmem lmb.reserved.cnt 5
-> reserve_bootmem ( 0 500e80 0
<- reserve_bootmem
-> reserve_bootmem ( ffe5000 1b000 1
<- reserve_bootmem
-> reserve_bootmem ( 3c7f7000 8000 2
<- reserve_bootmem
-> reserve_bootmem ( 3c7ff668 994 3
<- reserve_bootmem
-> reserve_bootmem ( 3c800000 0 4
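
One way to catch the empty entry before it reaches reserve_bootmem() is to scan the reserved table for zero-size regions first. The following is a minimal standalone sketch, not kernel code: struct toy_region, struct toy_reserved, TOY_MAX_REGIONS and toy_first_empty() are hypothetical stand-ins for the lmb structures, and the values in the usage note are the ones from the console output above.

```c
#include <assert.h>
#include <stdio.h>

#define TOY_MAX_REGIONS 8	/* hypothetical capacity, not the kernel's limit */

/* Hypothetical stand-ins for one lmb region and the lmb.reserved table. */
struct toy_region {
	unsigned long base;
	unsigned long size;
};

struct toy_reserved {
	unsigned long cnt;
	struct toy_region region[TOY_MAX_REGIONS];
};

/*
 * Return the index of the first reserved region whose size is zero,
 * or -1 if every region has a non-zero size.
 */
static long toy_first_empty(const struct toy_reserved *r)
{
	unsigned long i;

	assert(r != NULL && r->cnt <= TOY_MAX_REGIONS);

	for (i = 0; i < r->cnt; i++) {
		if (r->region[i].size == 0) {
			fprintf(stderr, "region %lu: base %lx has size 0\n",
				i, r->region[i].base);
			return (long)i;
		}
	}
	return -1;
}
```

Filled with the five base/size pairs printed above, toy_first_empty() reports index 4, the 3c800000 entry. A check like this only shows where the empty entry is; it does not explain how it got there.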


the code making the last call into lmb_alloc looks like:
         DBG("-> do_init_bootmem %d\n", __LINE__);
         bootmap_pages = bootmem_bootmap_pages(total_pages);

         DBG("-> do_init_bootmem %d\n",__LINE__);
         start = lmb_alloc(bootmap_pages << PAGE_SHIFT, PAGE_SIZE);   /* wms */
         BUG_ON(!start);

         boot_mapsize = init_bootmem(start >> PAGE_SHIFT, total_pages);


I think we call directly into lmb_alloc_base() most of the time, so the difference with this call is the LMB_ALLOC_ANYWHERE parameter, which lmb_alloc() passes as max_addr:

unsigned long __init lmb_alloc(unsigned long size, unsigned long align)
{
         return lmb_alloc_base(size, align, LMB_ALLOC_ANYWHERE);
}

LMB_ALLOC_ANYWHERE has a value of 0. From include/asm-powerpc/lmb.h:

#define LMB_ALLOC_ANYWHERE    0
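
To make the question about the 0 concrete, here is a minimal standalone sketch of how a max_addr of 0 can act as a "no limit" sentinel rather than as a size. It is not the kernel's lmb_alloc_base(): TOY_ALLOC_ANYWHERE, toy_align_down() and toy_alloc_top() are hypothetical names, and the sketch handles a single memory region with no reserved list.

```c
#include <assert.h>

#define TOY_ALLOC_ANYWHERE 0UL	/* hypothetical stand-in for LMB_ALLOC_ANYWHERE */

/* Round addr down to a multiple of align; align must be a power of two. */
static unsigned long toy_align_down(unsigned long addr, unsigned long align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & ~(align - 1);
}

/*
 * Pick the highest aligned base for `size` bytes inside the region
 * [region_base, region_base + region_size), optionally capped by max_addr.
 * A max_addr of TOY_ALLOC_ANYWHERE means "no cap", not "cap at 0".
 * Returns 0 when nothing fits, so 0 doubles as the failure value here.
 */
static unsigned long toy_alloc_top(unsigned long region_base,
				   unsigned long region_size,
				   unsigned long size, unsigned long align,
				   unsigned long max_addr)
{
	unsigned long top = region_base + region_size;
	unsigned long base;

	if (max_addr != TOY_ALLOC_ANYWHERE && max_addr < top)
		top = max_addr;

	if (top < size)
		return 0;	/* would underflow */

	base = toy_align_down(top - size, align);
	if (base < region_base)
		return 0;	/* does not fit in this region */

	return base;
}
```

With one region of 0x40000000 bytes at base 0, toy_alloc_top(0x0, 0x40000000, 0x8000, 0x1000, TOY_ALLOC_ANYWHERE) gives 0x3fff8000, while a capped call such as toy_alloc_top(0x0, 0x40000000, 0x1000, 0x80, 0x10000000) gives 0xffff000. In this sketch the sentinel only affects where the search starts; it is never used as a length.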


I am not very familiar with this code, so I don't know whether the '0' here is mistakenly being treated as a size, or whether it is a red herring.

-Will
