2.6.37-git17 virtual IO boot failure

Nishanth Aravamudan nacc at us.ibm.com
Wed Jan 19 11:48:24 EST 2011


On 18.01.2011 [14:47:18 -0800], Nishanth Aravamudan wrote:
> On 18.01.2011 [12:31:52 +1100], Anton Blanchard wrote:
> > Hi,
> > 
> > I was testing 2.6.37-git17 on a POWER7 with virtual IO and hit this:
> > 
> > Trying to unpack rootfs image as initramfs...
> > Freeing initrd memory: 7446k freed
> > vio 30000000: Warning: IOMMU dma not supported: mask
> > 0xffffffffffffffff, table unavailable
> > vio 4000: Warning: IOMMU dma not supported: mask 0xffffffffffffffff,
> > table unavailable
> > vio 4001: Warning: IOMMU dma not supported: mask 0xffffffffffffffff,
> > table unavailable
> > vio 4002: Warning: IOMMU dma not supported: mask 0xffffffffffffffff,
> > table unavailable
> > vio 4004: Warning: IOMMU dma not supported: mask 0xffffffffffffffff,
> > table unavailable
> > audit: initializing netlink socket (disabled)
> > 
> > Haven't had a chance to look closer yet.
> 
> After debugging a bit, this would appear to be due to the second hunk of
> b3c73856ae47d43d0d181f9de1c1c6c0820c4515.
> 
> diff --git a/arch/powerpc/kernel/vio.c b/arch/powerpc/kernel/vio.c
> index b265405..1b695fd 100644
> --- a/arch/powerpc/kernel/vio.c
> +++ b/arch/powerpc/kernel/vio.c
> @@ -1257,6 +1257,10 @@ struct vio_dev *vio_register_device_node(struct device_node *of_node)
>         viodev->dev.parent = &vio_bus_device.dev;
>         viodev->dev.bus = &vio_bus_type;
>         viodev->dev.release = vio_dev_release;
> +        /* needed to ensure proper operation of coherent allocations
> +         * later, in case driver doesn't set it explicitly */
> +        dma_set_mask(&viodev->dev, DMA_BIT_MASK(64));
> +        dma_set_coherent_mask(&viodev->dev, DMA_BIT_MASK(64));
> 
>         /* register with generic device framework */
>         if (device_register(&viodev->dev)) {
> 
> Milton, Sonny, any thoughts?

A bit more detail after trying a few more kernels on the box that
originally showed the error:

1) This doesn't actually prevent booting, afaict. I think it "just"
disables DMA for those devices, which is bad, but technically not a
boot failure.

2) Reverting the above commit definitely prevents those messages.

3) I'm seeing a separate issue with 2.6.37-git17 (that's not present in
2.6.37):

sd 0:4:2:0: [sda] Aborting command: 2A
sd 0:4:2:0: Abort timed out. Resetting bus.

At which point the box locks up :)

So testing fixes is a bit of a challenge right now.
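
To illustrate point 1, here is a minimal userspace sketch of how I read
the failure path. It is a model, not the kernel code: struct model_dev,
model_dma_supported(), model_dma_set_mask(), MODEL_EIO and
MODEL_DMA_BIT_MASK() are hypothetical stand-ins, and the printed message
is only formatted to resemble the log above. It assumes that setting the
mask fails with an I/O error when no IOMMU table can be found for the
device, and it says nothing about why the table is unavailable at that
point.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are NOT the kernel definitions. */
#define MODEL_EIO 5
#define MODEL_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

struct iommu_table {
	uint64_t it_size;		/* placeholder field, unused by the model */
};

struct model_dev {
	const char *name;		/* e.g. "30000000" or "4000" */
	struct iommu_table *tbl;	/* NULL models "table unavailable" */
	uint64_t dma_mask;		/* stays 0 until a mask is accepted */
};

/* Return 1 if DMA with this mask is considered usable, 0 otherwise. */
static int model_dma_supported(const struct model_dev *dev, uint64_t mask)
{
	if (dev->tbl == NULL) {
		/* Assumed path behind the warnings in the boot log. */
		printf("vio %s: Warning: IOMMU dma not supported: "
		       "mask 0x%016" PRIx64 ", table unavailable\n",
		       dev->name, mask);
		return 0;
	}
	return 1;
}

/* Accept the mask only if the check passes; otherwise leave it unset. */
static int model_dma_set_mask(struct model_dev *dev, uint64_t mask)
{
	assert(dev != NULL);

	if (!model_dma_supported(dev, mask))
		return -MODEL_EIO;

	dev->dma_mask = mask;
	return 0;
}
```

In this model, calling model_dma_set_mask() on a device whose tbl is
NULL prints the warning and leaves dma_mask at 0, while the same call
after a table has been attached succeeds. If the real code behaves the
same way, that would be consistent with the box still booting but the
devices being left without a usable DMA mask.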

Ben, if you're ok with waiting to see if Milton or Sonny have any ideas,
I'd like to hold off on asking for a revert. If they do, I'll be able to
test and send out any proposed fix rapidly.

Thanks,
Nish

-- 
Nishanth Aravamudan <nacc at us.ibm.com>
IBM Linux Technology Center

