[v10, 7/7] mmc: sdhci-of-esdhc: fix host version for T4240-R1.0-R2.0

Kiran Kumar kiran1677 at gmail.com
Tue Apr 4 00:16:01 AEST 2017


Sorry for posting in this large email thread.

I have most probably hit an erratum on the Freescale T4240 for
PPC_DISABLE_THREADS. I'm using a Rev2 T4240. Does this erratum need to be
handled or not? Any quick help is appreciated!

My issue: I'm running line-rate traffic to the T4240 (10G of traffic on each
10G-capacity port). After a few hours of running traffic, my CPU/LMP
gets stuck and goes in for a hard reset. When I searched through the code and
some open forums, I saw the erratum listed above. I'm not sure whether that
erratum applies to me or not. I have pasted a snapshot of the issue occurring
on my system.

============================================

During the hang, with soft lockup detection enabled, I get prints from
smp_many -> showing ‘every processor is waiting for processor 22’; I need to
know what happened to processor 22. The processor number keeps changing every
time I run the traffic.

I feel the issue is that processor 22 either went into a deadlock state with
interrupts disabled or went into a deep idle sleep state.

If it's a deep idle state -> an NMI should have recovered it.

But if it's a deadlock with interrupts disabled, then I need to know
the root cause of the deadlock.

root at A@0-1-1:~#
[ 1602.261011] Current::17 waiting::22 flag::4353
[ 1602.720488] Current::14 waiting::22 flag::3585
[ 1602.756404] Current::15 waiting::22 flag::3841
[ 1604.903248] Current::16 waiting::22 flag::4097
[ 1619.400917] Current::14 waiting::22 flag::3585
[ 1619.517870] Current::17 waiting::22 flag::4353
[ 1619.749893] Current::15 waiting::22 flag::3841
[ 1619.798453] Current::4 waiting::22 flag::1025
[ 1622.177171] Current::16 waiting::22 flag::4097
[ 1622.449412] INFO: rcu_preempt detected stalls on CPUs/tasks: { 22} (detected by 4, t=21008 jiffies, g=101651, c=101650, q=4713)
[ 1622.460951] Task dump for CPU 22:
[ 1622.464275] swapper/22      R  running task        0     0      1 0x00000800
[ 1622.471355] Call Trace:
[ 1622.473820] [c0000001f92b78d0] [000000000000009e] 0x9e (unreliable)
[ 1622.480119] [c0000001f92b7960] [c0000001f92b7aa0] 0xc0000001f92b7aa0
[ 1622.486497] [c0000001f92b79e0] [c000000000a90275] 0xc000000000a90275
[ 1622.492878] [c0000001f92b7a60] [c000000000006f64] .do_IRQ+0x184/0x370
[ 1622.499344] [c0000001f92b7b10] [c00000000001b93c] exc_0x500_common+0xfc/0x100
[ 1622.506539] --- Exception: 501 at 0xc000000000a3e200
[ 1622.506539]     LR = .__check_irq_replay+0x68/0x110
[ 1622.516388] [c0000001f92b7e00] [c0000000000bc590] .cpu_startup_entry+0x1d0/0x350 (unreliable)
[ 1622.524950] [c0000001f92b7ed0] [c000000000a00460] .start_secondary+0x3ec/0x3f4
[ 1622.532197] [c0000001f92b7f90] [c00000000000036c] .start_secondary_prolog+0x10/0x14
[ 1635.237829] Current::3 waiting::5 flag::769
[ 1636.587746] Current::14 waiting::22 flag::3585
[ 1637.082117] Current::17 waiting::22 flag::4353
[ 1637.224920] Current::4 waiting::22 flag::1025
[ 1637.268789] Current::15 waiting::22 flag::3841
[ 1640.085792] Current::16 waiting::22 flag::4097
[ 1651.093367] Current::4 waiting::22 flag::1025

=============================================================

Regards,

Kiran

On Thu, May 5, 2016 at 4:40 PM, Arnd Bergmann <arnd at arndb.de> wrote:

> On Thursday 05 May 2016 09:41:32 Yangbo Lu wrote:
> > > -----Original Message-----
> > > From: Arnd Bergmann [mailto:arnd at arndb.de]
> > > Sent: Thursday, May 05, 2016 4:32 PM
> > > To: linuxppc-dev at lists.ozlabs.org
> > > Cc: Yangbo Lu; linux-mmc at vger.kernel.org; devicetree at vger.kernel.org;
> > > linux-arm-kernel at lists.infradead.org; linux-kernel at vger.kernel.org;
> > > linux-clk at vger.kernel.org; linux-i2c at vger.kernel.org;
> > > iommu at lists.linux-foundation.org; netdev at vger.kernel.org; Mark Rutland;
> > > ulf.hansson at linaro.org; Russell King; Bhupesh Sharma; Joerg Roedel;
> > > Santosh Shilimkar; Yang-Leo Li; Scott Wood; Rob Herring; Claudiu Manoil;
> > > Kumar Gala; Xiaobo Xie; Qiang Zhao
> > > Subject: Re: [v10, 7/7] mmc: sdhci-of-esdhc: fix host version for
> > > T4240-R1.0-R2.0
> > >
> > > On Thursday 05 May 2016 11:12:30 Yangbo Lu wrote:
> > > >
> > > > +       fsl_guts_init();
> > > > +       svr = fsl_guts_get_svr();
> > > > +       if (svr) {
> > > > +               esdhc->soc_ver = SVR_SOC_VER(svr);
> > > > +               esdhc->soc_rev = SVR_REV(svr);
> > > > +       } else {
> > > > +               dev_err(&pdev->dev, "Failed to get SVR value!\n");
> > > > +       }
> > > > +
> > > >
> > >
> > >
> > > Sorry for jumping in again after not participating in the discussion
> > > for the past few versions.
> > >
> > > What happened to my suggestion of making this a platform-independent
> > > interface to avoid the link time dependency?
> > >
> > > Specifically, why not add an exported function to drivers/base/soc.c
> > > that uses glob_match() for comparing a string in the device driver to
> > > the ID of the SoC that is set by whatever SoC identifying driver the
> > > platform has?
> >
> > [Lu Yangbo-B47093] I think this has been discussed in v6.
> > You can find Scott's comments about this in below link.
> > https://patchwork.kernel.org/patch/8544501/
>
> Ah, thanks for bearing with me and digging this out again. Let me follow
> up on Scott's older replies here then:
>
> > >> IIRC, it is the same IP block as i.MX and Arnd's point is this won't
> > >> even compile on !PPC. It is things like this that prevent sharing the
> > >> driver.
> >
> > The whole point of using the MMIO SVR instead of the PPC SPR is so that
> > it will work on ARM...  The guts driver should build on any platform as
> > long as OF is enabled, and if it doesn't find a node to bind to it will
> > return 0 for SVR, and the eSDHC driver will continue (after printing an
> > error that should be removed) without the ability to test for errata
> > based on SVR.
>
> It feels like a bad design to have to come up with a different
> method for each SoC type here when they all do the same thing
> and want to identify some variant of the chip to do device
> specific quirks.
>
> As far as I'm concerned, every driver in drivers/soc that needs to
> export a symbol to be used by a device driver is an indication that
> we don't have the right set of abstractions yet. There are cases
> that are not worth abstracting because the functionality is rather
> obscure and only a couple of drivers for one particular chip
> ever need it.
>
> Finding out the version of the SoC does not look like this case.
>
> > > I think the first four patches take care of building for ARM,
> > > but the problem remains if you want to enable COMPILE_TEST as
> > > we need for certain automated checking.
> >
> > What specific problem is there with COMPILE_TEST?
>
> COMPILE_TEST is solvable here and the way it is implemented in this
> case (selecting FSL_GUTS from the driver) indeed looks like it works
> correctly, but it's still awkward that this means building the
> SoC specific ID stuff into the vmlinux binary for any driver that
> uses something like that for a particular SoC.
>
> > >> Dealing with Si revs is a common problem. We should have a
> > >> common solution. There is soc_device for this purpose.
> > >
> > > Exactly. The last time this came up, I think we agreed to implement a
> > > helper using glob_match() on the soc_device strings. Unfortunately
> > > this hasn't happened then, but I'd still prefer that over yet another
> > > vendor-specific way of dealing with the generic issue.
> >
> > soc_device would require encoding the SVR as a string and then decoding
> > the string, which is more complicated and error prone than having
> > platform-specific code test a platform-specific number.
>
> You already need to encode it as a string to register the soc_device,
> and the driver just needs to pass a glob string, so the only part that
> is missing is the generic function that takes the string from the
> driver and passes that to glob_match for the soc_device.
>
> > And when would it get registered on arm64, which doesn't have
> > platform code?
>
> Whenever the soc driver is loaded, as is the case now. The match
> function can return -EPROBE_DEFER if no SoC device is registered
> yet.
>
>         Arnd
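
For illustration, the generic helper Arnd describes above (a driver passes a
glob string, the helper compares it against the registered SoC strings, and
-EPROBE_DEFER is returned when no SoC device is registered yet) could look
roughly like the userspace sketch below. Everything named here is a
hypothetical stand-in and not the kernel's API: struct soc_info,
registered_soc, simple_glob_match() (a minimal substitute for glob_match()),
soc_match_glob(), and the "soc_id,revision" string layout are assumptions
made for the example. EPROBE_DEFER is defined locally because it is not
exposed by the userspace <errno.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Kernel-internal errno value; defined here because userspace lacks it. */
#define EPROBE_DEFER 517

/* Hypothetical stand-in for the strings a SoC-identifying driver registers. */
struct soc_info {
	const char *family;
	const char *soc_id;
	const char *revision;
};

/* NULL until some SoC driver has registered (assumption for this sketch). */
static const struct soc_info *registered_soc;

/* Minimal glob matcher: '*' matches any run, '?' matches one character. */
static bool simple_glob_match(const char *pat, const char *str)
{
	const char *star = NULL;  /* last '*' seen in the pattern */
	const char *retry = NULL; /* where in str that '*' started matching */

	while (*str) {
		if (*pat == '*') {
			star = pat++;
			retry = str;
		} else if (*pat == '?' || *pat == *str) {
			pat++;
			str++;
		} else if (star) {
			/* Backtrack: let the last '*' swallow one more char. */
			pat = star + 1;
			str = ++retry;
		} else {
			return false;
		}
	}
	while (*pat == '*')
		pat++;
	return *pat == '\0';
}

/*
 * Match a driver-supplied glob against "soc_id,revision".
 * Returns 1 on match, 0 on mismatch, -EPROBE_DEFER if nothing is
 * registered yet so the caller can retry its probe later.
 */
static int soc_match_glob(const char *pattern)
{
	char id[64];

	assert(pattern != NULL);
	if (!registered_soc)
		return -EPROBE_DEFER;

	snprintf(id, sizeof(id), "%s,%s",
		 registered_soc->soc_id, registered_soc->revision);
	return simple_glob_match(pattern, id) ? 1 : 0;
}
```

Under these assumptions, a driver that needs a quirk for both T4240
revisions would call soc_match_glob("T4240,?.0") in its probe path and
propagate a negative return value, instead of decoding the SVR itself.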