NXP P50XX/e5500: SMP doesn't work anymore with the latest Git kernel

Christian Zigotzky chzigotzky at xenosoft.de
Thu Nov 1 00:38:33 AEDT 2018


Hi Michael,

Many thanks for this good explanation. I will try to learn more about 
bisecting. Sometimes time is the problem. I usually work in first-level 
Linux support for our AmigaOne machines, which means my main work is end 
user support. Therefore I still have to learn more about second- and 
third-level Linux support.

Cheers,
Christian

On 31 October 2018 at 2:20PM, Michael Ellerman wrote:
> Christian Zigotzky <chzigotzky at xenosoft.de> writes:
>
>> Little progress ...
>>
>> I reverted the following two OF files of the commit 'Merge tag
>> devicetree-for-4.20' and SMP works! The problematic code is somewhere in
>> these two files.
>>
>> a/include/linux/of.h
>> a/drivers/of/base.c
> Hi Christian,
>
> Trying to debug things by reverting like this can work, but it's quite
> error prone and is usually only used *after* a bisect has identified the
> suspect code, or if a bisect can't work for some reason.
>
> I know you said you'd had trouble bisecting in the past, but this one
> should be a good one to practice on.
>
> You already identified that the merge of the devicetree changes was the
> problem, ie.
>
>    b27186abb37b Merge tag 'devicetree-for-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
>
>
> So you do:
>    $ git show b27186abb37b
>    commit b27186abb37b7bd19e0ca434f4f425c807dbd708
>    Merge: 0ef7791e2bfb d061864b89c3
>    Author: Linus Torvalds <torvalds at linux-foundation.org>
>    Date:   Fri Oct 26 12:09:58 2018 -0700
>    
>        Merge tag 'devicetree-for-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
>
>
> And that shows you the two commits that were merged, 0ef7791e2bfb and
> d061864b89c3. If you look at them you see:
>
>    $ git log -1 --oneline 0ef7791e2bfb
>    0ef7791e2bfb Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal
>     
>    $ git log -1 --oneline d061864b89c3
>    d061864b89c3 ARM: dt: relicense two DT binding IRQ headers
>
> You can see that the first one is the previous commit on Linus' branch,
> ie. an unrelated merge. The 2nd commit is the commit that was on top of
> robh's tree, ie. that's the start of the interesting commits for us.
>
> You can also get to that 2nd commit using b27186abb37b^2.
>
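
To make the ^1/^2 parent notation concrete, here is a small sketch in a
throwaway repository. It does not touch the kernel tree; the branch name
"topic", the commit messages and the identity settings are all made up for
the demonstration. It builds a merge the same way (a mainline commit plus a
topic branch) and then resolves both parents of the merge commit.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Demo"
git config user.email "demo@example.invalid"

# Common starting point for both lines of history.
git commit -q --allow-empty -m "base"
main_branch=$(git symbolic-ref --short HEAD)

# The branch that will be merged in (plays the role of robh's tree).
git checkout -q -b topic
git commit -q --allow-empty -m "topic work"
topic_tip=$(git rev-parse HEAD)

# The branch that does the merging (plays the role of Linus' branch).
git checkout -q "$main_branch"
git commit -q --allow-empty -m "mainline work"
main_tip=$(git rev-parse HEAD)

git merge -q --no-ff -m "Merge branch 'topic'" topic

# ^1 is the commit the merging branch was on, ^2 is the tip that was merged.
first_parent=$(git rev-parse HEAD^1)
second_parent=$(git rev-parse HEAD^2)
echo "first parent:  $first_parent"
echo "second parent: $second_parent"
```

In this sketch the first parent is the "mainline work" commit and the second
parent is the tip of "topic", which mirrors 0ef7791e2bfb and d061864b89c3 in
the real merge.
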
> If you look at what came in via Rob's branch with:
>
>    $ git log --oneline d061864b89c3
>    or
>    $ git log --oneline b27186abb37b^2
>
> You see there's quite a few commits, and in particular there's another
> merge:
>
>    389d0a8a7af8 Merge branch 'dt/cpu-type-rework' into dt/next
>
> If we log the 2nd parent of that, we see:
>
>   $ git log --oneline 389d0a8a7af8^2
>   4c29e5934f6c microblaze: get cpu node with of_get_cpu_node
>   a691240e36e3 fbdev: fsl-diu: get cpu node with of_get_cpu_node
>   651d44f9679c of: use for_each_of_cpu_node iterator
>   a9a455e854cd iommu: fsl_pamu: use for_each_of_cpu_node iterator
>   37dc218bed44 edac: cpc925: use for_each_of_cpu_node iterator
>   76ec23b127cd clk: mvebu: use for_each_of_cpu_node iterator
>   7de8f4aa2f35 x86: DT: use for_each_of_cpu_node iterator
>   8cabf5bc1049 SH: use for_each_of_cpu_node iterator
>   38959a091e4a powerpc: 8xx: get cpu node with of_get_cpu_node
>   84dbc69a2ff3 powerpc: 4xx: get cpu node with of_get_cpu_node
>   a94fe366340a powerpc: use for_each_of_cpu_node iterator
>   5e5abae858b5 openrisc: use for_each_of_cpu_node iterator
>   1f0fe1f67cef nios2: get cpu node with of_get_cpu_node
>   5a931a3c80b5 c6x: use for_each_of_cpu_node iterator
>   de76e70a8d4e arm64: use for_each_of_cpu_node iterator
>   5af5d40c4015 ARM: shmobile: use for_each_of_cpu_node iterator
>   07d44f1f82b7 ARM: topology: remove unneeded check for /cpus node
>   d4866f751edf ARM: use for_each_of_cpu_node iterator
>   6487c15f1cc9 of: Support matching cpu nodes with no 'reg' property
>   f1f207e43b8a of: Add cpu node iterator for_each_of_cpu_node()
>   f6707fd6241e of: make PowerMac cache node search conditional on CONFIG_PPC_PMAC
>   6d0a70a284be vsprintf: print OF node name using full_name
>   a613b26a5013 of: Convert to using %pOFn instead of device_node.name
>   6901378c799d of/unittest: add printf tests for node name
>   b610e2ff4622 of/unittest: remove use of node name pointer in overlay high level test
>   57361846b52b (tag: v4.19-rc2) Linux 4.19-rc2
>
>
> So if we think the suspect commit is in there, we would confirm that by
> checking out v4.19-rc2 and testing that it works, and then checking out
> 4c29e5934f6c and testing that it's broken.
>
> Assuming the former worked and the latter was broken, we do:
>
>   $ git bisect good v4.19-rc2
>   $ git bisect bad 4c29e5934f6c
>
> And then just follow the prompts.
>
> One thing to watch out for is hitting an unrelated bug, that can
> sometimes derail your bisection.
>
> In this case the bug we're looking for is that CPU 1 isn't onlined
> properly. But if the system doesn't boot entirely for example then you
> shouldn't mark the commit as bad, instead it's better to skip it. Then
> git will choose a different commit for you to test.
>
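
The good/bad/skip decision above can be sketched as a small shell helper
that looks at a captured boot log. This is only an illustration under some
assumptions: the function names are made up, the log is assumed to be saved
to a file, and the patterns are simply the "smp:" messages quoted further
down in this thread. A log that never reaches SMP bring-up is treated as an
unrelated breakage and skipped.

```shell
# Hypothetical helper: classify one captured boot log for this bisection.
# Prints good, bad or skip; the patterns come from the logs in this thread.
classify_boot_log() {
    log_file=$1
    if grep -q 'smp: failed starting cpu' "$log_file"; then
        echo bad      # the bug being hunted: a secondary CPU did not start
    elif grep -q 'smp: Brought up' "$log_file"; then
        echo good     # SMP bring-up finished without a failure message
    else
        echo skip     # never reached SMP bring-up: likely an unrelated bug
    fi
}

# Map the verdict to the exit codes "git bisect run" expects:
# 0 = good, 125 = skip, any other code from 1 to 127 = bad.
bisect_exit_code() {
    case "$(classify_boot_log "$1")" in
        good) return 0 ;;
        skip) return 125 ;;
        *)    return 1 ;;
    esac
}
```

A test script that builds the kernel, boots it (for example in QEMU),
saves the console output and ends with bisect_exit_code could then be
handed to "git bisect run"; that wrapper script is not shown here and would
have to be written for the local setup.
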
> Anyway hope that helps.
>
> cheers
>
>> On 29 October 2018 at 6:00PM, Christian Zigotzky wrote:
>>> Hello,
>>>
>>> I figured out that the problem is in the OF source code of the commit:
>>> Merge tag devicetree-for-4.20. [1]
>>>
>>> I reverted the following OF files and SMP works!
>>>
>>> drivers/of/base.c
>>> drivers/of/device.c
>>> drivers/of/of_mdio.c
>>> drivers/of/of_numa.c
>>> drivers/of/of_private.h
>>> drivers/of/overlay.c
>>> drivers/of/platform.c
>>> drivers/of/unittest-data/overlay_15.dts
>>> drivers/of/unittest-data/tests-overlay.dtsi
>>> drivers/of/unittest.c
>>> include/linux/of.h
>>>
>>> Cheers,
>>> Christian
>>>
>>> [1]
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b27186abb37b7bd19e0ca434f4f425c807dbd708
>>>
>>>
>>> On 29 October 2018 at 10:56AM, Christian Zigotzky wrote:
>>>> Hello,
>>>>
>>>> I have figured out that the commit 'devicetree-for-4.20' [1] is
>>>> responsible for the SMP problem. I was able to revert this commit
>>>> with 'git revert b27186abb37b7bd19e0ca434f4f425c807dbd708 -m 1' today.
>>>>
>>>> [master ec81438] Revert "Merge tag 'devicetree-for-4.20' of
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux"
>>>> 138 files changed, 931 insertions(+), 1538 deletions(-)
>>>> rename Documentation/devicetree/bindings/arm/{atmel-sysregs.txt =>
>>>> atmel-at91.txt} (67%)
>>>> delete mode 100644
>>>> Documentation/devicetree/bindings/arm/freescale/fsl,layerscape-dcfg.txt
>>>> delete mode 100644
>>>> Documentation/devicetree/bindings/arm/freescale/fsl,layerscape-scfg.txt
>>>> rename Documentation/devicetree/bindings/arm/{zte,sysctrl.txt =>
>>>> zte.txt} (62%)
>>>> delete mode 100644 Documentation/devicetree/bindings/misc/lwn-bk4.txt
>>>> create mode 100644 arch/c6x/boot/dts/linked_dtb.S
>>>> delete mode 100644 arch/nios2/boot/dts/Makefile
>>>> create mode 100644 arch/nios2/boot/linked_dtb.S
>>>> delete mode 100644 arch/powerpc/boot/dts/Makefile
>>>> delete mode 100644 arch/powerpc/boot/dts/fsl/Makefile
>>>> delete mode 100644 scripts/dtc/yamltree.c
>>>>
>>>> It solves the SMP problem! SMP works again on my P5020 board and on
>>>> virtual e5500 QEMU machines.
>>>>
>>>> QEMU command: ./qemu-system-ppc64 -M ppce500 -cpu e5500 -m 2048
>>>> -kernel /home/christian/Downloads/uImage-4.20-alpha5 -drive
>>>> format=raw,file=/home/christian/Dokumente/ubuntu_MATE_16.04.3_LTS_PowerPC_QEMU/ubuntu_MATE_16.04_PowerPC.img,index=0,if=virtio
>>>> -nic user,model=e1000 -append "rw root=/dev/vda3" -device virtio-vga
>>>> -device virtio-mouse-pci -device virtio-keyboard-pci -soundhw es1370
>>>> -smp 4
>>>>
>>>> Screenshot:
>>>> https://plus.google.com/u/0/photos/photo/115515624056477014971/6617705776207990082
>>>>
>>>> Do we need a new dtb file or is it a bug?
>>>>
>>>> Thanks,
>>>> Christian
>>>>
>>>> [1]
>>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b27186abb37b7bd19e0ca434f4f425c807dbd708
>>>>
>>>>
>>>> On 28 October 2018 at 5:35PM, Christian Zigotzky wrote:
>>>>> Hello,
>>>>>
>>>>> SMP doesn't work anymore with the latest Git kernel (28/10/18
>>>>> 11:12AM GMT) on my P5020 board and on virtual e5500 QEMU machines.
>>>>>
>>>>> Board with P5020 dual core CPU:
>>>>>
>>>>> [    0.000000] -----------------------------------------------------
>>>>> [    0.000000] phys_mem_size     = 0x200000000
>>>>> [    0.000000] dcache_bsize      = 0x40
>>>>> [    0.000000] icache_bsize      = 0x40
>>>>> [    0.000000] cpu_features      = 0x00000003008003b4
>>>>> [    0.000000]   possible        = 0x00000003009003b4
>>>>> [    0.000000]   always          = 0x00000003008003b4
>>>>> [    0.000000] cpu_user_features = 0xcc008000 0x08000000
>>>>> [    0.000000] mmu_features      = 0x000a0010
>>>>> [    0.000000] firmware_features = 0x0000000000000000
>>>>> [    0.000000] -----------------------------------------------------
>>>>> [    0.000000] CoreNet Generic board
>>>>>
>>>>>      ...
>>>>>
>>>>> [    0.002161] smp: Bringing up secondary CPUs ...
>>>>> [    0.002339] No cpu-release-addr for cpu 1
>>>>> [    0.002347] smp: failed starting cpu 1 (rc -2)
>>>>> [    0.002401] smp: Brought up 1 node, 1 CPU
>>>>>
>>>>> Virtual e5500 quad core QEMU machine:
>>>>>
>>>>> [    0.026394] smp: Bringing up secondary CPUs ...
>>>>> [    0.027831] No cpu-release-addr for cpu 1
>>>>> [    0.027989] smp: failed starting cpu 1 (rc -2)
>>>>> [    0.030143] No cpu-release-addr for cpu 2
>>>>> [    0.030304] smp: failed starting cpu 2 (rc -2)
>>>>> [    0.032400] No cpu-release-addr for cpu 3
>>>>> [    0.032533] smp: failed starting cpu 3 (rc -2)
>>>>> [    0.033117] smp: Brought up 1 node, 1 CPU
>>>>>
>>>>> QEMU command: ./qemu-system-ppc64 -M ppce500 -cpu e5500 -m 2048
>>>>> -kernel
>>>>> /home/christian/Downloads/vmlinux-4.20-alpha4-AmigaOne_X1000_X5000/X5000_and_QEMU_e5500/uImage-4.20
>>>>> -drive
>>>>> format=raw,file=/home/christian/Downloads/MATE_PowerPC_Remix_2017_0.9.img,index=0,if=virtio
>>>>> -nic user,model=e1000 -append "rw root=/dev/vda" -device virtio-vga
>>>>> -device virtio-mouse-pci -device virtio-keyboard-pci -usb -soundhw
>>>>> es1370 -smp 4
>>>>>
>>>>> .config:
>>>>>
>>>>> ...
>>>>> CONFIG_SMP=y
>>>>> CONFIG_NR_CPUS=4
>>>>> ...
>>>>>
>>>>> Please test the latest Git kernel on your NXP P50XX boards.
>>>>>
>>>>> Thanks,
>>>>> Christian
>>>>>
>>>>
>>>


