PROBLEM: Linux 3.6.2 fails to boot on IBM Cell
grant.likely at secretlab.ca
Fri Oct 19 09:59:28 EST 2012
On Thu, Oct 18, 2012 at 11:44 PM, Grant Likely
<grant.likely at secretlab.ca> wrote:
> On Thu, Oct 18, 2012 at 10:59 PM, Dennis Schridde <devurandom at gmx.net> wrote:
>> Am Donnerstag, 18. Oktober 2012, 22:32:55 schrieb Grant Likely:
>>> Unfortunately the debug messages don't show up in the console log by
>>> default. Can you either send the output of 'dmesg' after booting, or
>>> add "loglevel=8" to the kernel boot parameters?
>> Here you go.
>> I also see some lines like:
>> irq: no irq domain found for /axon@10000000000/plb5/pciex-utl@a00000a000004000
>> Is that also a problem?
> [cc'ing linuxppc-dev]
> Okay, so what is happening is that the function cbe_init_pm_irq() is
> trying to set up hwirq numbers 0x7e, 0x17e, 0x27e and continuing up
> every 0x100 to 0xff7e. This happens because that function
> calculates the hwirq number inside for_each_node(), shifting the node
> number up 8 bits to form the upper bits of the hwirq number.
> However, according to the header file, only '0' and '1' are actual
> valid values for the upper bits.
> CONFIG_NODES_SHIFT = 8 for PowerPC 64, which accounts for the range 0..0xff.
> arch/powerpc/platforms/cell/interrupt.h defines the values of
> IIC_IRQ_NODE_SHIFT = 8 and IIC_IRQ_NODE_MASK = 0x100.
> So, from the context, I assume the function is trying to set up a PM
> interrupt for each CPU in the Cell processor; and that there are 2 of
> them. for_each_node() knows nothing of this and dutifully tries to set
> up the irq for 256 processors; way beyond what is valid for the irq
> Also, it should be noted that the irq does actually get set up by
> irqdomain.c, but because everything above 0x1ff is larger than the
> lookup table, it complains. The new code complains loudly (as you
> discovered) if someone tries to use a hwirq larger than the map where
> the old code didn't.
> Looks to me like the fix is to change for_each_node() to something as
> simple as "for (i = 0; i < 2; i++)"
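To make the arithmetic above concrete, here is a small user-space sketch of
the hwirq calculation and the range check. It is not the kernel code. The
shift of 8, the 0x7e low byte and the 0x1ff table limit come from the
analysis above; PM_HWIRQ_LOW, MAP_SIZE, VALID_NODES and the three helper
functions are names made up for this illustration.

```c
#include <stdbool.h>

/* Values quoted in the analysis above. */
#define NODE_SHIFT   8      /* IIC_IRQ_NODE_SHIFT */
#define NODES_SHIFT  8      /* CONFIG_NODES_SHIFT on 64-bit PowerPC */

/* Hypothetical names for this sketch, not kernel identifiers. */
#define PM_HWIRQ_LOW 0x7eu  /* low byte of the PM hwirq number */
#define MAP_SIZE     0x200u /* lookup table covers 0x000..0x1ff */
#define VALID_NODES  2u     /* the node bit may only be 0 or 1 */

/* Build the hwirq number the way the message describes it:
 * the node number is shifted up to form the upper bits. */
static unsigned int pm_hwirq(unsigned int node)
{
    return PM_HWIRQ_LOW | (node << NODE_SHIFT);
}

/* True if the hwirq fits inside the lookup table. */
static bool hwirq_in_map(unsigned int hwirq)
{
    return hwirq < MAP_SIZE;
}

/* Count how many of the first nr_nodes nodes yield an out-of-range hwirq. */
static unsigned int count_out_of_range(unsigned int nr_nodes)
{
    unsigned int node, bad = 0;

    for (node = 0; node < nr_nodes; node++)
        if (!hwirq_in_map(pm_hwirq(node)))
            bad++;
    return bad;
}
```

Under these assumptions, walking all 1 << NODES_SHIFT possible nodes gives
254 hwirq numbers outside the table, while limiting the loop to VALID_NODES
gives none, which is the effect the proposed two-iteration loop is after.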
As for the next failure seen in that log, it would appear that the
MPIC ->map hook, mpic_host_map(), is failing on some mappings, but it
does appear that it has some legitimate reasons for doing so. The
difference now is that I changed irqdomain code to complain about it.
That is probably an overreach, or at least it shouldn't be yelling quite
so much about it.