Strange tg3 regression with UMP fw. link reporting

Matt Carlson mcarlson at broadcom.com
Sat Aug 9 04:43:59 EST 2008


On Fri, Aug 08, 2008 at 07:18:31PM +1000, Benjamin Herrenschmidt wrote:
> On Fri, 2008-08-08 at 10:58 +0200, Segher Boessenkool wrote:
> > > I don't know yet for sure what happens, but a quick look at the commit
> > > seems to show that the driver synchronously spin-waits for up to 2.5ms
> > 
> > That's what the comment says, but the code says 2.5 _seconds_:
> > 
> > +       /* Wait for up to 2.5 milliseconds */
> > +       for (i = 0; i < 250000; i++) {
> > +               if (!(tr32(GRC_RX_CPU_EVENT) & GRC_RX_CPU_DRIVER_EVENT))
> > +                       break;
> > +               udelay(10);
> > +       }
> > 
> > (not that milliseconds wouldn't be bad already...)
> 
> Right, indeed. I think we have a good candidate for the problem :-) I'll
> verify that on monday. Now, that leads to two questions:
> 
>  - What such a synchronous and potentially horribly slow code is
> going in a locked section or a timer interrupts ? Ie, the link
> watch should probably move to a workqueue if that is to remain,
> or the code turned into a state machine that periodically check for
> events, or whatever is more sane than the above.
> 
>  - The code should at least display some error and do something sane in
> case of timeout such as disabling the new UMP feature instead of
> repeatedly looping ...
> 
>  - If this is indeed our problem (timing out in the code above), why is
> our firmware not emitting the requested event -> maybe the PowerStations
> need a tg3 firmware update.
> 
> Matt, what's your take on this ?

Segher is right.  The code should be 2.5 milliseconds but is actually
much longer.  This fix is actually already in my patch queue and needs
to be sent upstream.

We really shouldn't be displaying any error messages in the event of a
timeout though.  Earlier versions of the UMP firmware did not support
the link update interface.  The best thing the driver can do for all
cases is give the firmware a chance to service the event but continue
as if the event were serviced if it did not get an explicit ACK.





More information about the Linuxppc-dev mailing list