[PATCH v3] can: grcan: Add device driver for GRCAN and GRHCAN cores

Wolfgang Grandegger wg at grandegger.com
Wed Oct 31 23:51:02 EST 2012


On 10/30/2012 05:24 PM, Andreas Larsson wrote:
> On 10/30/2012 11:07 AM, Wolfgang Grandegger wrote:
>>> +    /* AHB bus error interrupts (not CAN bus errors) - shut down the
>>> +     * device.
>>> +     */
>>> +    if (sources & (GRCAN_IRQ_TXAHBERR | GRCAN_IRQ_RXAHBERR)) {
>>> +        if (sources & GRCAN_IRQ_TXAHBERR) {
>>> +            netdev_err(dev, "got AHB bus error on tx\n");
>>> +            stats->tx_errors++;
>>> +        } else {
>>> +            netdev_err(dev, "got AHB bus error on rx\n");
>>> +            stats->rx_errors++;
>>> +        }
>>> +        netdev_err(dev, "halting device\n");
>>> +
>>> +        /* Prevent anything to be enabled again and halt device */
>>> +        SPIN_LOCK(&priv->lock);
>>> +        priv->closing = true;
>>> +        netif_stop_queue(dev);
>>> +        grcan_stop(dev);
>>> +        SPIN_UNLOCK(&priv->lock);
>>
>> Hm, does that really happen? How can the user/app realized the problem
>> and recover?
> 
> My understanding of it is that if you get amba bus errors something is
> seriously wrong and nothing can be done at driver level to recover.
> Shutting down the device is to prevent the driver from spamming console
> messages. I used to have a sysfs indication of such errors. Now dmesg is
> the way to find out about the problem. The user can always bring the
> interface down and up again and try again after such an error.

Well, sounds like a fatal error. Did you ever saw it? If that happens
frequently, or even once, the device/system seems not really usable.

>> Furthermore, why is a spin_clock enough here? THe interrupt may run a
>> thread.
> 
> It should be enough because the function is only called
> directly from the interrupt handler, right? Or did I miss something?

The interrupt handler may run as thread, e.g. if the -rt patch has been
applied. Therefore it's safer to use the irqsave/restore variants of the
spin_lock functions. At least that's my understanding. See also
"Documentation/spinlock.txt".

>>> +    priv->can.ctrlmode_supported  =
>>> +        CAN_CTRLMODE_LISTENONLY | CAN_CTRLMODE_ONE_SHOT;
>>
>> What about triple-sampling?
> 
> That is not supported by the hardware.
> 
> 
>> I curious how the device behaves on bus errors and state changes. Could
>> you please show the output of "candump -e any,0:0,#FFFFFFFF" while
>> sending a CAN message with no other node on the bus (not connected) and
>> with going bus off (by short-circuiting CAN high and low).
> 
> 
> Here is the output (with long sequences of similar error frames
> where only one counter is increasing cut out) from the the upcoming v4.
> let me know if you see any problems with this.

How did your trigger that sequence? Short-circuit or not cable
connected? The latter, I assume, as bus-off is not reached.

> 
>   can0  20000006  [8] 00 00 00 00 00 00 10 00   ERRORFRAME
>         lost-arbitration{at bit 0}
>         controller-problem{}
>         error-counter-tx-rx{{16}{0}}

This is most probably an ack slot bus error similar to. You seem not to
handle bus errors and it's not a controller problem as the state did not
change.

http://lxr.linux.no/#linux+v3.6.4/drivers/net/can/flexcan.c#L353

>   can0  20000004  [8] 00 20 00 00 00 00 88 00   ERRORFRAME
>         controller-problem{tx-error-passive}
>         error-counter-tx-rx{{136}{0}}

OK, a state change to error passive. The device seems not to report
changes to error warning.

>   can0  20000006  [8] 00 20 00 00 00 00 90 00   ERRORFRAME
>         lost-arbitration{at bit 0}
>         controller-problem{tx-error-passive}
>         error-counter-tx-rx{{144}{0}}

State changes should be reported only once. But it did not change. This
is then a bus error (CAN_ERR_PROT | CAN_ERR_BUSERROR). See also above.

>   [...]
>   can0  20000006  [8] 00 20 00 00 00 00 F0 00   ERRORFRAME
>         lost-arbitration{at bit 0}
>         controller-problem{tx-error-passive}
>         error-counter-tx-rx{{240}{0}}
>   can0  20000006  [8] 00 20 00 00 00 00 F8 00   ERRORFRAME
>         lost-arbitration{at bit 0}
>         controller-problem{tx-error-passive}
>         error-counter-tx-rx{{248}{0}}
>   can0  20000042  [8] 00 00 00 00 00 00 80 00   ERRORFRAME
>         lost-arbitration{at bit 0}
>         bus-off
>         error-counter-tx-rx{{128}{0}}
>   can0  20000006  [8] 00 00 00 00 00 00 18 00   ERRORFRAME
>         lost-arbitration{at bit 0}
>         controller-problem{}
>         error-counter-tx-rx{{24}{0}}
>   can0  20000004  [8] 00 10 00 00 00 00 4F 80   ERRORFRAME
>         controller-problem{rx-error-passive}
>         error-counter-tx-rx{{79}{128}}
>   [...]
>   can0  20000006  [8] 00 10 00 00 00 00 77 80   ERRORFRAME
>         lost-arbitration{at bit 0}
>         controller-problem{rx-error-passive}
>         error-counter-tx-rx{{119}{128}}
>   can0  20000006  [8] 00 30 00 00 00 00 7F 80   ERRORFRAME
>         lost-arbitration{at bit 0}
>         controller-problem{rx-error-passive,tx-error-passive}
>         error-counter-tx-rx{{127}{128}}
>   can0  20000006  [8] 00 30 00 00 00 00 87 80   ERRORFRAME
>         lost-arbitration{at bit 0}
>         controller-problem{rx-error-passive,tx-error-passive}
>         error-counter-tx-rx{{135}{128}}
>   [...]
>   can0  20000006  [8] 00 30 00 00 00 00 F7 80   ERRORFRAME
>         lost-arbitration{at bit 0}
>         controller-problem{rx-error-passive,tx-error-passive}
>         error-counter-tx-rx{{247}{128}}
>   can0  20000006  [8] 00 30 00 00 00 00 FF 80   ERRORFRAME
>         lost-arbitration{at bit 0}
>         controller-problem{rx-error-passive,tx-error-passive}
>         error-counter-tx-rx{{255}{128}}
>   can0  20000042  [8] 00 00 00 00 00 00 80 00   ERRORFRAME
>         lost-arbitration{at bit 0}
>         bus-off
>         error-counter-tx-rx{{128}{0}}
>   can0  20000006  [8] 00 00 00 00 00 00 18 00   ERRORFRAME
>         lost-arbitration{at bit 0}
>         controller-problem{}
>         error-counter-tx-rx{{24}{0}}
>   can0  20000004  [8] 00 10 00 00 00 00 4F 80   ERRORFRAME
>         controller-problem{rx-error-passive}
>         error-counter-tx-rx{{79}{128}}
>   can0  20000006  [8] 00 10 00 00 00 00 57 80   ERRORFRAME
>         lost-arbitration{at bit 0}
>         controller-problem{rx-error-passive}
>         error-counter-tx-rx{{87}{128}}
>   [...]

Also a sequence to bus-off including the recovery would be nice.

Wolfgang.



More information about the devicetree-discuss mailing list