Serial RapidIO Maintenance read causes lock up

Anderson, Trevor tanderson at curtisswright.com
Thu Oct 7 05:08:02 EST 2010


I may have come across a similar problem, but I've never worked card-to-card without at least one switch in the way.
The problems I have encountered have been hang-ups during memory-mapped maintenance reads to devices that the switch reports as "up".

A workaround for the hang-up is to perform the maintenance read with a DMA transfer that bypasses the ATMU. This is only needed for the first read, the test read of a newly discovered device, not for every read. If the DMA fails, as it does in those unusual circumstances, it reports a bus transfer error - but it does not hang up. You can recover, and then avoid the connection or retry it later.
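The probe-first idea can be sketched roughly as follows. This is only a sketch under assumptions: `struct maint_ops`, `probe_then_read` and the `sim_*` back ends are hypothetical names made up for illustration, not functions from fsl_rio.c, and the back ends merely simulate a dead or live link.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical access hooks; these names do not come from fsl_rio.c. */
struct maint_ops {
    /* DMA-based read: 0 on success, negative errno on a bus transfer error. */
    int (*dma_read)(uint16_t destid, uint8_t hopcount,
                    uint32_t offset, uint32_t *val);
    /* Memory-mapped read through the ATMU window: may hang on a dead target. */
    int (*mmio_read)(uint16_t destid, uint8_t hopcount,
                     uint32_t offset, uint32_t *val);
};

/*
 * The first read of a newly discovered device goes through DMA, so a dead
 * link produces an error code instead of a hang. Once that test read has
 * succeeded, later reads use the memory-mapped path.
 */
static int probe_then_read(const struct maint_ops *ops, int *probed,
                           uint16_t destid, uint8_t hopcount,
                           uint32_t offset, uint32_t *val)
{
    int rc;

    if (*probed)
        return ops->mmio_read(destid, hopcount, offset, val);

    rc = ops->dma_read(destid, hopcount, offset, val);
    if (rc == 0)
        *probed = 1;    /* device answered; MMIO is assumed safe from now on */
    return rc;          /* on failure the caller can skip or retry later */
}

/* Simulated back ends, for illustration only. */
static int sim_dma_dead(uint16_t d, uint8_t h, uint32_t o, uint32_t *v)
{
    (void)d; (void)h; (void)o; (void)v;
    return -EIO;
}

static int sim_dma_ok(uint16_t d, uint8_t h, uint32_t o, uint32_t *v)
{
    (void)d; (void)h; (void)o;
    *v = 0x1111;
    return 0;
}

static int sim_mmio_ok(uint16_t d, uint8_t h, uint32_t o, uint32_t *v)
{
    (void)d; (void)h; (void)o;
    *v = 0x2222;
    return 0;
}
```

With `sim_dma_dead` plugged in, `probe_then_read` returns a negative value and leaves `*probed` at 0, so the caller never touches the memory-mapped window for that device.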

A real "fix" for your problem may be to ensure that both of your RapidIO interfaces are programmed to accept all incoming transactions (see "Accept All Configuration Register (AACR)" in the Freescale reference manual). No guarantees, though - it just seemed to clear up problems on my own rig.





From: linuxppc-dev-bounces+tanderson=curtisswright.com at lists.ozlabs.org [mailto:linuxppc-dev-bounces+tanderson=curtisswright.com at lists.ozlabs.org] On Behalf Of Bastiaan Nijkamp
Sent: Tuesday, October 05, 2010 7:28 AM
To: John Traill
Cc: Bounine, Alexandre; linuxppc-dev at lists.ozlabs.org
Subject: Re: Serial RapidIO Maintenance read causes lock up

Hi John,

1. Yes, they are both running the exact same kernel and both are configured in the same way, except that one is set as host and the other as an agent.

2. Accept All is set for both boards.

3. As I understand it, the agent cannot send anything before it is enumerated, so it should be safe to reset the agent first and the host right after. In any case, that is the sequence I am using. The full kernel log, up to the point where discovery times out after 30 seconds, is shown below:

Using SBC8548 machine description
Memory CAM mapping: 256 Mb, residual: 0Mb
Linux version 2.6.35.6 (dl704 at lxws006) (gcc version 4.4.1 (Wind River Linux Sourcery G++ 4.4-250) ) #3 Tue Oct 5 13:24:45 CEST 2010
bootconsole [udbg0] enabled
setup_arch: bootmem
sbc8548_setup_arch()
arch: exit
Zone PFN ranges:
  DMA      0x00000000 -> 0x00010000
  Normal   empty
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
    0: 0x00000000 -> 0x00010000
MMU: Allocated 1088 bytes of context maps for 255 contexts
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 65024
Kernel command line: root=/dev/nfs rw nfsroot=192.168.100.21:/thales/target/rfs/sbc8548_wrlinux4 ip=192.168.100.151:192.168.100.21:192.168.100.21:255.255.255.0:sbc8548_1:eth0:off console=ttyS0,115200
PID hash table entries: 1024 (order: 0, 4096 bytes)
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 256996k/262144k available (2644k kernel code, 5148k reserved, 108k data, 77k bss, 136k init)
Kernel virtual memory layout:
  * 0xfffdf000..0xfffff000  : fixmap
  * 0xfdffd000..0xfe000000  : early ioremap
  * 0xd1000000..0xfdffd000  : vmalloc & ioremap
Hierarchical RCU implementation.
 RCU-based detection of stalled CPUs is disabled.
 Verbose stalled-CPUs detection is disabled.
NR_IRQS:512 nr_irqs:512
mpic: Setting up MPIC " OpenPIC  " version 1.2 at e0040000, max 1 CPUs
mpic: ISU size: 80, shift: 7, mask: 7f
mpic: Initializing for 80 sources
clocksource: timebase mult[50cede6] shift[22] registered
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 512
NET: Registered protocol family 16

PCI: Probing PCI hardware
bio: create slab <bio-0> at 0
vgaarb: loaded
Switching to clocksource timebase
NET: Registered protocol family 2
IP route cache hash table entries: 2048 (order: 1, 8192 bytes)
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 8192 (order: 3, 32768 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
TCP reno registered
UDP hash table entries: 256 (order: 0, 4096 bytes)
UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
NET: Registered protocol family 1
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
Setting up RapidIO peer-to-peer network /soc8548 at e0000000/rapidio at c0000
fsl-of-rio e00c0000.rapidio: Of-device full name /soc8548 at e0000000/rapidio at c0000
fsl-of-rio e00c0000.rapidio: Regs: [mem 0xe00c0000-0xe00dffff]
fsl-of-rio e00c0000.rapidio: LAW start 0x00000000c0000000, size 0x0000000020000000.
fsl-of-rio e00c0000.rapidio: pwirq: 48, bellirq: 50, txirq: 53, rxirq 54
fsl-of-rio e00c0000.rapidio: DeviceID is 0xffffffff
fsl-of-rio e00c0000.rapidio: Configured as AGENT
fsl-of-rio e00c0000.rapidio: Overriding RIO_PORT setting to single lane 0
fsl-of-rio e00c0000.rapidio: RapidIO PHY type: serial
fsl-of-rio e00c0000.rapidio: Hardware port width: 4
fsl-of-rio e00c0000.rapidio: Training connection status: Single-lane 0
fsl-of-rio e00c0000.rapidio: RapidIO Common Transport System size: 256
fsl-of-rio e00c0000.rapidio: LAW start 0x00000000c0000000, RIO Maintainance Window Size 0x400000,New Main Start: 0xd1080000
RIO: discover master port 0, RIO0 mport

An interesting thing I found is that when the agent is reset while the host is locked up (i.e. it cannot be stopped, nor can I read its registers and memory through a JTAG interface), the host comes back online and just continues booting Linux with a RapidIO error. See the log below.

Setting up RapidIO peer-to-peer network /soc8548 at e0000000/rapidio at c0000
fsl-of-rio e00c0000.rapidio: Of-device full name /soc8548 at e0000000/rapidio at c0000
fsl-of-rio e00c0000.rapidio: Regs: [mem 0xe00c0000-0xe00dffff]
fsl-of-rio e00c0000.rapidio: LAW start 0x00000000c0000000, size 0x0000000020000000.
fsl-of-rio e00c0000.rapidio: pwirq: 48, bellirq: 50, txirq: 53, rxirq 54
fsl-of-rio e00c0000.rapidio: DeviceID is 0x0
fsl-of-rio e00c0000.rapidio: Configured as HOST
fsl-of-rio e00c0000.rapidio: Overriding RIO_PORT setting to single lane 0
fsl-of-rio e00c0000.rapidio: RapidIO PHY type: serial
fsl-of-rio e00c0000.rapidio: Hardware port width: 4
fsl-of-rio e00c0000.rapidio: Training connection status: Single-lane 0
fsl-of-rio e00c0000.rapidio: RapidIO Common Transport System size: 256
fsl-of-rio e00c0000.rapidio: LAW start 0x00000000c0000000, RIO Maintainance Window Size 0x400000,New Main Start: 0xd1080000
RIO: enumerate master port 0, RIO0 mport
fsl_rio_config_read: index 0 destid 255 hopcount 0 offset 00000068 len 4
fsl_rio_config_read: Passed IS_ALIGNED.
fsl_rio_config_read: Passed 'out_be32_1'
fsl_rio_config_read: Passed 'out_be32_2'
fsl_rio_config_read: len is 4
fsl_rio_config_read: triggering '__fsl_read_rio_config'
fsl_rio_config_read: going to request to read data at d1080068
RIO: cfg_read error -14 for ff:0:68
fsl_rio_config_read: index 0 destid 255 hopcount 0 offset 00000068 len 4
fsl_rio_config_read: Passed IS_ALIGNED.
fsl_rio_config_read: Passed 'out_be32_1'
fsl_rio_config_read: Passed 'out_be32_2'
fsl_rio_config_read: len is 4
fsl_rio_config_read: triggering '__fsl_read_rio_config'
fsl_rio_config_read: going to request to read data at d1080068
RIO: cfg_read error -14 for ff:0:68
fsl_rio_config_read: index 0 destid 255 hopcount 0 offset 00000068 len 4
fsl_rio_config_read: Passed IS_ALIGNED.
fsl_rio_config_read: Passed 'out_be32_1'
fsl_rio_config_read: Passed 'out_be32_2'
fsl_rio_config_read: len is 4
fsl_rio_config_read: triggering '__fsl_read_rio_config'
fsl_rio_config_read: going to request to read data at d1080068
RIO: cfg_read error -14 for ff:0:68
RIO: master port 0 device has lost enumeration to a remote host
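Two numbers in the log above can be decoded with a few lines of C. On Linux, return code -14 is -EFAULT, and offset 0x68 is the Host Base Device ID Lock CSR, so the address d1080068 is simply the remapped maintenance window start (0xd1080000) plus that offset. The helper names below are hypothetical and only illustrate the arithmetic; adding the offset directly to the window base is an assumption that holds for small offsets like this one.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Offset of the Host Base Device ID Lock CSR in RapidIO config space. */
#define HOST_DID_LOCK_CSR 0x68u

/*
 * Address touched by a maintenance read: window base plus register offset.
 * Assumption: the offset is small enough to fall inside the window.
 */
static uint32_t maint_read_addr(uint32_t window_base, uint32_t offset)
{
    return window_base + offset;
}

/* Translate a negative kernel-style return code into its errno text. */
static const char *rio_err_text(int rc)
{
    return strerror(-rc);
}
```

Calling `maint_read_addr(0xd1080000u, HOST_DID_LOCK_CSR)` reproduces the address from the log, and `rio_err_text(-14)` gives the C library's description of EFAULT.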

Regards,
Bastiaan
2010/10/5 John Traill <john.traill at freescale.com>
Bastiaan,

A few things to check.

1. Is the target board also set up for the small common transport system size, i.e. 256?

2. Make sure the target has "Accept All" set - in fsl_rio.c look for
      /* Set to receive any dist ID for serial RapidIO controller. */
       if (port->phy_type == RIO_PHY_SERIAL)
               out_be32((priv->regs_win + RIO_ISR_AACR), RIO_ISR_AACR_AA);

3. How do you synchronise reset between both systems? Both need to be reset to ensure the inbound/outbound ackIDs remain in sync. If you only reset one, the ackIDs can get out of sync. Also, what is the kernel log on the agent system?
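Why a one-sided reset is a problem can be illustrated with a toy model of the two link partners. This is a simplified sketch, not driver code: the structure, the function names and the 5-bit ackID width are assumptions made for illustration, and real hardware has recovery mechanisms that are not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

#define ACKID_MASK 0x1fu    /* assumption: 5-bit ackID on a serial link */

/* One end of a point-to-point link (toy model). */
struct link_end {
    uint8_t outbound_ackid; /* ackID of the next packet this end sends */
    uint8_t inbound_ackid;  /* ackID this end expects to receive next */
};

/* A reset returns both counters of one end to zero. */
static void link_reset(struct link_end *e)
{
    e->outbound_ackid = 0;
    e->inbound_ackid = 0;
}

/* Send one packet from tx to rx; returns true if rx accepts it. */
static bool link_send(struct link_end *tx, struct link_end *rx)
{
    if (tx->outbound_ackid != rx->inbound_ackid)
        return false;       /* unexpected ackID: packet is not accepted */

    tx->outbound_ackid = (uint8_t)((tx->outbound_ackid + 1u) & ACKID_MASK);
    rx->inbound_ackid = (uint8_t)((rx->inbound_ackid + 1u) & ACKID_MASK);
    return true;
}
```

In this model, resetting only the agent after a few packets have been exchanged leaves the host's outbound ackID ahead of what the agent expects, so the next packet is rejected. Resetting both ends brings the counters back in step.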

Cheers.



On 05/10/10 09:56, Bastiaan Nijkamp wrote:

Hi Alex,

Thanks for your advice. We are trying to make a board-to-board
connection without any additional hardware (eg. a switch). The boards
use a 50-pin, right-angle MEC8-125-02-L-D-RA1 connector from SAMTEC and
are connected through an EEDP-016-12.00-RA1-RA2-2 cross cable from SAMTEC.
I hope this information is sufficient since there is not much one can
find about it on Google. In addition, you can see a picture of the board
including the connector in the datasheet located at
http://www.windriver.com/products/product-notes/SBC8548E-product-note.pdf.
It is the connector on the left side of the PCI-EX slot.

We have tried your suggestion, but the situation does not change other
than the lane mode being set to single lane 0: it still locks up when
trying to generate a maintenance transaction. I still think it is memory
related, since the lock-up occurs when accessing the maintenance window,
although all memory-related settings seem to be all right.

The kernel output is as follows:

Setting up RapidIO peer-to-peer network /soc8548 at e0000000/rapidio at c0000
fsl-of-rio e00c0000.rapidio: Of-device full name /soc8548 at e0000000/rapidio at c0000
fsl-of-rio e00c0000.rapidio: Regs: [mem 0xe00c0000-0xe00dffff]
fsl-of-rio e00c0000.rapidio: LAW start 0x00000000c0000000, size 0x0000000010000000.
fsl-of-rio e00c0000.rapidio: pwirq: 48, bellirq: 50, txirq: 53, rxirq 54
fsl-of-rio e00c0000.rapidio: DeviceID is 0x0
fsl-of-rio e00c0000.rapidio: Configured as HOST
fsl-of-rio e00c0000.rapidio: Overriding RIO_PORT setting to single lane 0
fsl-of-rio e00c0000.rapidio: RapidIO PHY type: serial
fsl-of-rio e00c0000.rapidio: Hardware port width: 4
fsl-of-rio e00c0000.rapidio: Training connection status: Single-lane 0
fsl-of-rio e00c0000.rapidio: RapidIO Common Transport System size: 256
fsl-of-rio e00c0000.rapidio: LAW start 0x00000000c0000000, RIO Maintainance Window Size 0x400000,New Main Start: 0xd1080000
RIO: enumerate master port 0, RIO0 mport
fsl_rio_config_read: index 0 destid 255 hopcount 0 offset 00000068 len 4
fsl_rio_config_read: Passed IS_ALIGNED.
fsl_rio_config_read: Passed 'out_be32_1'
fsl_rio_config_read: Passed 'out_be32_2'
fsl_rio_config_read: len is 4
fsl_rio_config_read: triggering '__fsl_read_rio_config'
fsl_rio_config_read: going to request to read data at d108006

Regards,
Bastiaan

2010/10/4 Bounine, Alexandre <Alexandre.Bounine at idt.com>


   Hi Bastiaan,

   Are you trying board-to-board connection?
   I am not familiar with the WRS SBC8548 board - which type of connector
   does it use for SRIO?

   Assuming that all configuration is correct,
   I would recommend first to try setting up x1 link mode at the lowest
   link speed.
   The x4 mode may present challenges in some cases.

   For a quick test you may just add a port width override into fsl_rio.c
   as shown below (ugly but sometimes it helps ;) ):

   @@ -1461,10 +1461,16 @@ int fsl_rio_setup(struct platform_device *dev)
           rio_register_mport(port);

           priv->regs_win = ioremap(regs.start, regs.end - regs.start + 1);
           rio_regs_win = priv->regs_win;

   +dev_info(&dev->dev, "Overriding RIO_PORT setting to single lane 0\n");
   +out_be32(priv->regs_win + 0x15C, in_be32(priv->regs_win + 0x15C) | 0x800000);
   +out_be32(priv->regs_win + 0x15C, in_be32(priv->regs_win + 0x15C) | 0x2000000);
   +out_be32(priv->regs_win + 0x15C, in_be32(priv->regs_win + 0x15C) & ~0x800000);
   +msleep(100);
   +
           /* Probe the master port phy type */
           ccsr = in_be32(priv->regs_win + RIO_CCSR);
           port->phy_type = (ccsr & 1) ? RIO_PHY_SERIAL : RIO_PHY_PARALLEL;
           dev_info(&dev->dev, "RapidIO PHY type: %s\n",
                           (port->phy_type == RIO_PHY_PARALLEL) ? "parallel" :
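The three register writes in the patch above follow a disable, override, re-enable sequence on the register at offset 0x15C. The sketch below replays that sequence against a fake register array so it can be run on a host PC. The macro names are illustrative only, the reading of 0x800000 as a port-disable bit and 0x2000000 as a "force single lane, lane 0" value of a 3-bit width-override field is my interpretation of the values, and `fake_regs` stands in for the ioremap'ed window. The `msleep(100)` from the patch is left out.

```c
#include <stdint.h>

/* Illustrative names; the patch uses the raw values. */
#define PORT_CTL_CSR       0x15Cu       /* register offset written by the patch */
#define PORT_CTL_DISABLE   0x00800000u  /* assumed: port disable bit */
#define PORT_CTL_OPW_MASK  0x07000000u  /* assumed: port width override field */
#define PORT_CTL_OPW_LANE0 0x02000000u  /* assumed: force single lane, lane 0 */

/* Fake register file standing in for the memory-mapped register window. */
static uint32_t fake_regs[0x200 / 4];

static uint32_t reg_read(uint32_t off)
{
    return fake_regs[off / 4];
}

static void reg_write(uint32_t off, uint32_t val)
{
    fake_regs[off / 4] = val;
}

/* Replay the patch's sequence: disable port, set override, re-enable port. */
static void force_single_lane0(void)
{
    uint32_t v = reg_read(PORT_CTL_CSR);

    v |= PORT_CTL_DISABLE;                              /* 1. disable the port */
    reg_write(PORT_CTL_CSR, v);

    v = (v & ~PORT_CTL_OPW_MASK) | PORT_CTL_OPW_LANE0;  /* 2. select lane 0 */
    reg_write(PORT_CTL_CSR, v);

    v &= ~PORT_CTL_DISABLE;                             /* 3. re-enable the port */
    reg_write(PORT_CTL_CSR, v);
}
```

Unlike the patch, which only ORs the value in, this sketch clears the whole override field before setting it, so a previously programmed override cannot combine with the new one. That is a design choice of the sketch, not something the patch does.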


   Let me know what happens.
   Please keep me in the CC: list next time when posting RapidIO questions
   to the linuxppc-dev or kernel mailing lists.

   Regards,

   Alex.




--
John Traill
Systems Engineer
Network and Computing Systems Group

Freescale Semiconductor UK LTD
Colvilles Road
East Kilbride
Glasgow G75 0TG, Scotland

Tel: +44 (0) 1355 355494
Fax: +44 (0) 1355 261790

E-mail: john.traill at freescale.com

Registration Number: SC262720
VAT Number: GB831329053


