PPC476 hangs during tlb flush after calling /init in crash kernel with linux 5.4+

Christophe Leroy christophe.leroy at csgroup.eu
Wed Apr 28 16:08:17 AEST 2021



On 28/04/2021 at 00:42, Eddie James wrote:
> On Tue, 2021-04-27 at 19:26 +0200, Christophe Leroy wrote:
>> Hi Eddie,
>>
>> On 27/04/2021 at 19:03, Eddie James wrote:
>>> Hi all,
>>>
>>> I'm having a problem in simulation and on hardware where my PPC476
>>> processor stops executing instructions after calling /init. In my case
>>> this is a bash script. The code descends to flush the TLB, and
>>> somewhere in the loop in _tlbil_pid, the PC goes to
>>> InstructionTLBError47x but does not go any further. This only occurs in
>>> the crash kernel environment, which is using the same kernel,
>>> initramfs, and init script as the main kernel, which executed fine. I
>>> do not see this problem with linux 4.19 or 3.10. I do see it with 5.4
>>> and 5.10. I see a fair amount of refactoring in the PPC memory
>>> management area between 4.19 and 5.4. Can anyone point me in a
>>> direction to debug this further? My stack trace is below as I can run
>>> gdb in simulation.
>>
>> Can you bisect to pinpoint the culprit commit?
> 
> Hi, thanks for your prompt reply.
> 
> Good idea! I have bisected to:
> 
> commit 9e849f231c3c72d4c3c1b07c9cd19ae789da0420 (b8-bad, refs/bisect/bad)
> Author: Christophe Leroy <christophe.leroy at c-s.fr>
> Date:   Thu Feb 21 19:08:40 2019 +0000
> 
>      powerpc/mm/32s: use generic mmu_mapin_ram() for all blocks.
>      
>      Now that mmu_mapin_ram() is able to handle other blocks
>      than the one starting at 0, the WII can use it for all
>      its blocks.
>      
>      Signed-off-by: Christophe Leroy <christophe.leroy at c-s.fr>
>      Signed-off-by: Michael Ellerman <mpe at ellerman.id.au>
> 
> I also confirmed that reverting this commit resolves the issue in 5.4+.
> 
> Now, I don't understand why this is problematic or what is really
> happening... Reverting is probably not the desired solution.
> 

Can you provide the 'dmesg' output or a dump of the logs printed by the kernel at boot time?

The difference with this commit is that when there are several memblocks, all of them get mapped.
Maybe your target doesn't like that.
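
To make the difference concrete, here is a rough user-space sketch in C. It is not the kernel code: struct mem_block, map_first_block() and map_all_blocks() are hypothetical names invented for illustration, "mapping" is reduced to adding up sizes, and the "before" behaviour (only the first block is handled) is an assumption made for the sake of contrast.

```c
#include <stddef.h>

/* Hypothetical stand-in for one memory block (not the kernel's memblock type). */
struct mem_block {
    unsigned long base;
    unsigned long size;
};

/* Assumed "before" behaviour: handle only the first block, ignore the rest. */
size_t map_first_block(const struct mem_block *blocks, size_t n,
                       unsigned long *mapped_bytes)
{
    *mapped_bytes = 0;
    if (n == 0)
        return 0;
    *mapped_bytes = blocks[0].size;
    return 1;
}

/* "After" behaviour as described: walk every block and account for each one. */
size_t map_all_blocks(const struct mem_block *blocks, size_t n,
                      unsigned long *mapped_bytes)
{
    size_t i;

    *mapped_bytes = 0;
    for (i = 0; i < n; i++)
        *mapped_bytes += blocks[i].size;
    return n;
}
```

With two blocks, the first function reports one block mapped and the second reports two, so any memory that only appears in a second block is touched by the newer behaviour alone.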

You are talking about simulation; are you using QEMU? If so, can you provide details so that I can
try to reproduce the issue?

Thanks
Christophe

