RESEND: Re: Problem booting a PowerBook G4 Aluminum after commit cd08f109 with CONFIG_VMAP_STACK=y

Christophe Leroy christophe.leroy at c-s.fr
Fri Feb 14 22:02:40 AEDT 2020



On 14/02/2020 at 07:24, Christophe Leroy wrote:
> Larry,
> 
> On 14/02/2020 at 00:09, Larry Finger wrote:
>> Christophe,
>>
>> With this patch, it gets further. Sometime after the boot process 
>> tries to start process init, it crashes with the unable to read data 
>> at 0x000157a0 with a faulting address of 0xc001683c. The screenshot is 
>> attached and the gzipped vmlinux is at 
>> http://www.lwfinger.com/download/vmlinux2.gz. The patches that were 
>> applied for this kernel are also attached,
>>
> 
> 
> Did you try with the patch at https://patchwork.ozlabs.org/patch/1237387/ ?
> 
> I see the problem happens in kprobe_handler(). Can you try without 
> CONFIG_KPROBES?
> 

In fact, you hit two bugs. The first one is caused by CONFIG_VMAP_STACK. 
The second one has always existed (at least since the kernel source tree 
has been in git).

The first bug is in enter_rtas(), which tries to read data on the stack 
using the linear physical address translation. That cannot work with a 
VMAP stack: data MMU translation must be re-enabled before accessing data 
on the stack.
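
To illustrate why the linear translation breaks down, here is a minimal 
userspace sketch, not kernel code. PAGE_OFFSET is the usual ppc32 kernel 
base; LINEAR_SIZE, the helper names and the sample addresses are 
assumptions made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_OFFSET 0xc0000000u /* usual ppc32 kernel base (assumed here) */
#define LINEAR_SIZE 0x30000000u /* hypothetical size of the linear mapping */

/* True when vaddr lies in the linear mapping, where phys = virt - PAGE_OFFSET. */
static bool in_linear_map(uint32_t vaddr)
{
	return vaddr >= PAGE_OFFSET && vaddr - PAGE_OFFSET < LINEAR_SIZE;
}

/*
 * What a real-mode stack access implicitly assumes: subtracting PAGE_OFFSET
 * yields the physical address. Returns false when that assumption does not
 * hold, as for a stack allocated in vmalloc space.
 */
static bool linear_virt_to_phys(uint32_t vaddr, uint32_t *paddr)
{
	if (!in_linear_map(vaddr))
		return false;
	*paddr = vaddr - PAGE_OFFSET;
	return true;
}
```

With these assumed values, an address such as 0xc0016800 translates to 
0x16800, while a vmalloc-area address such as 0xf1000000 has no linear 
translation at all, so it can only be reached with data translation on.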

The second bug is in kprobe_handler(), which does:

	if (*addr != BREAKPOINT_INSTRUCTION)

addr is the address where the 'trap' happened. When a trap happens with 
the MMU disabled, addr contains the physical address of the trap. 
kprobe_handler() then tries to read the instruction through that physical 
address while the MMU is enabled, so you get a bad access, either because 
the address is not mapped or because access to userspace is not allowed.
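
One possible guard, sketched here as standalone C rather than as an actual 
kernel patch, is to check the MSR saved at trap time and refuse to 
dereference the address when either translation was off. MSR_IR, MSR_DR 
and BREAKPOINT_INSTRUCTION carry their powerpc values; struct trap_regs 
and both helper functions are hypothetical stand-ins for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_IR 0x20u /* instruction address translation enabled */
#define MSR_DR 0x10u /* data address translation enabled */
#define BREAKPOINT_INSTRUCTION 0x7fe00008u /* powerpc 'trap' */

/* Hypothetical stand-in for the few pt_regs fields used here. */
struct trap_regs {
	uint32_t msr;        /* MSR value saved when the trap was taken */
	const uint32_t *nip; /* address of the trapping instruction */
};

/* A trap taken with either translation off reports a physical address. */
static bool trap_in_real_mode(const struct trap_regs *regs)
{
	return !(regs->msr & MSR_IR) || !(regs->msr & MSR_DR);
}

/* True only when the address can be read safely and holds a breakpoint. */
static bool is_kprobe_breakpoint(const struct trap_regs *regs)
{
	if (trap_in_real_mode(regs))
		return false; /* never dereference a physical address */
	return *regs->nip == BREAKPOINT_INSTRUCTION;
}
```

The ordering is the point of the sketch: the MSR test comes before the 
read, so a trap taken in real mode is rejected without touching memory.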


Because of the first bug, you get a 'machine check'. As 
current->thread.rtas_sp has not been cleared yet, the machine check 
handler jumps to 'machine_check_in_rtas'.

machine_check_in_rtas does a trap, which in turn triggers the second bug.


Once the first bug is fixed, the second one should not show up.

Can you test the patch at https://patchwork.ozlabs.org/patch/1237929/, 
which fixes the first bug?

Christophe

