Debian SID kernel doesn't boot on PowerBook 3400c

Stan Johnson userm57 at yahoo.com
Wed Aug 4 08:20:08 AEST 2021


On 8/3/21 4:08 AM, Christophe Leroy wrote:
> 
> 
> On 02/08/2021 at 19:32, Stan Johnson wrote:
>> On 8/2/21 8:41 AM, Christophe Leroy wrote:
>>>
>>> ...
>>>>>
>>>>> Can you try again without CONFIG_VMAP_STACK ?
>>>>>
>>>>> Thanks
>>>>> Christophe
>>>>> ...
>>>>
>>>>
>>>> With CONFIG_VMAP_STACK=y, 5.11.0-rc5-pmac-00034-g684da7628d9 hangs at
>>>> boot on the PB 3400c.
>>>>
>>>> Without CONFIG_VMAP_STACK, 5.11.0-rc5-pmac-00034-g684da7628d9 boots as
>>>> expected.
>>>>
>>>> I didn't re-build the Debian SID kernel, though I confirmed that the
>>>> Debian config file for 5.10.0-8-powerpc includes CONFIG_VMAP_STACK=y.
>>>> It's not clear whether removing CONFIG_VMAP_STACK would be appropriate
>>>> for other powerpc systems.
>>>>
>>>> Please let me know why removing CONFIG_VMAP_STACK fixed the problem on
>>>> the PB 3400c. Should CONFIG_HAVE_ARCH_VMAP_STACK also be removed?
>>>>
>>>
>>> When CONFIG_HAVE_ARCH_VMAP_STACK is selected by the architecture,
>>> CONFIG_VMAP_STACK is selected by default.
>>>
>>> The point is that your config has CONFIG_ADB_PMU.
>>>
>>> A bug with VMAP stack was detected during the 5.9 release cycle for
>>> platforms selecting CONFIG_ADB_PMU. Because fixing the bug was a heavy
>>> change, we preferred at that time to disable VMAP stack, so VMAP stack
>>> was deselected for CONFIG_ADB_PMU by commit
>>> 4a133eb351ccc275683ad49305d0b04dde903733.
>>>
>>> Then as a second step, the proper fix was implemented and then VMAP
>>> stack was enabled again by the commit you bisected.
>>>
>>> Taking into account that the problem disappears for you when you
>>> manually deselect VMAP stacks, it means the problem is not the fix
>>> itself, but the fact that VMAP stacks are now enabled by default.
>>>
>>> We need to understand why VMAP stack doesn't work on your platform
>>> and, more than that, why it doesn't boot at all with VMAP stack.
>>>
>>> Could you send me the dmesg output of your system when it properly
>>> boots?
>>>
>>> Did you check with kernel 5.13?
>>>
>>> Thanks
>>> Christophe
>>>
>>
>> Christophe,
>>
>> Thanks for your response. It looks like I never tested v5.13 (I was
>> originally just reporting that the default Debian SID kernel,
>> 5.10.0-8-powerpc, hangs at boot on the PB 3400c).
>>
>> So I rebuilt the stock v5.13 from kernel.org using Finn's
>> dot-config-powermac-5.13, which got changed slightly at compilation (see
>> dot-config-v5.13-pmac, attached). It has CONFIG_VMAP_STACK and
>> CONFIG_ADB_PMU set, and it booted, but there were multiple memory
>> errors. So it looks like the hang-at-boot problem was fixed sometime
>> after v5.11, but there are now memory errors (similar to Wallstreet).
>>
>> With CONFIG_VMAP_STACK not set (CONFIG_ADB_PMU is still set), the
>> .config file turns into the attached dot-config-v5.13-pmac_NO_VMAP. And
>> there were still memory errors (dmesg output attached).
>>
>> The memory errors may be a completely unrelated issue, since they occur
>> regardless of the CONFIG_VMAP_STACK setting.
>>
>> To help rule out a hardware issue, I confirmed that memory errors don't
>> occur with v5.8.2 (dmesg output attached).
>>
>> A useful git bisect might be possible if CONFIG_VMAP_STACK is disabled
>> for each build. I would need to determine where the memory errors
>> started (v5.9, v5.10, v5.11, or v5.12). There is the complication that
>> (at least) several v5.10 kernels won't compile if SMP is set, so I might
>> need to disable that everywhere as well, assuming the SMP fix didn't
>> cause the memory errors.
>>
> 
> Thanks a lot for the information.
> 
> Looks like the memory errors are linked to KUAP (Kernel Userspace Access
> Protection). Based on the places the problems happen, I don't think
> there are any invalid accesses, so there must be something wrong in the
> KUAP logic, probably linked to some interrupts happening in kernel mode
> while the KUAP window is open. And because KUAP is not selected by
> default on book3s/32 until 5.14, probably nobody ever tested it in a
> real environment before you.
> 
> I think the issue may be linked to commit
> https://github.com/linuxppc/linux/commit/c16728835 which happened
> between 5.12 and 5.13. Would be nice if you could confirm that 5.12
> doesn't have the problem (at the same time maybe you can see if 5.12
> also boots OK with CONFIG_VMAP_STACK).

On the PB 3400c:
1) v5.12 with CONFIG_VMAP_STACK -- hangs at boot; see attached config.
2) v5.12 without CONFIG_VMAP_STACK -- did not hang at boot, but hung at
"Run /sbin/init as init process" (I tested it twice; there were no
errors logged); see attached config and serial console log.
3) v5.11 with CONFIG_VMAP_STACK -- hangs at boot, no output at serial
console (see attached config).
4) v5.11 without CONFIG_VMAP_STACK -- no errors (confirms earlier
result); see attached config and dmesg output.
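
For anyone repeating these tests, the config check can be scripted. The
following is only a sketch in Python, not something taken from the
attached files; the function names are made up for illustration, and it
assumes the usual .config line formats ("CONFIG_FOO=y" and
"# CONFIG_FOO is not set"). An option that is missing from the file
entirely is treated as not set.

```python
import lzma


def parse_kconfig(text):
    """Map option name -> value; options marked 'is not set' map to None."""
    options = {}
    suffix = " is not set"
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # "# CONFIG_FOO is not set" -> CONFIG_FOO explicitly disabled
            options[line[2:-len(suffix)]] = None
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def read_config(path):
    """Read a plain or xz-compressed config file as text."""
    opener = lzma.open if path.endswith(".xz") else open
    with opener(path, "rt") as f:
        return f.read()


def vmap_summary(text):
    """Report whether the options discussed in this thread are built in."""
    opts = parse_kconfig(text)
    names = ("CONFIG_HAVE_ARCH_VMAP_STACK", "CONFIG_VMAP_STACK", "CONFIG_ADB_PMU")
    return {name: opts.get(name) == "y" for name in names}
```

Calling vmap_summary(read_config(path)) on one of the attached .xz
config files should then show at a glance which of the three options
that build had enabled.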

The PB 3400c has a 240 MHz 603e and 144M of memory. It's a text-only
system running Debian SID with sysvinit (X Windows and systemd would run
too slowly here).

Please note that the issue on the PB 3400c appears to be different from
the memory problem on the Wallstreet (I think Finn sent you some dmesg
outputs for the Wallstreet problem).

On the Wallstreet, with CONFIG_VMAP_STACK, X logins start to fail and
memory errors start happening somewhere around mem=464M (I've never seen
problems at <= 464M, but it's possible the memory errors are just more
likely to start happening somewhere >384M). With CONFIG_VMAP_STACK off,
X logins work on the Wallstreet and there are no memory errors with any
memory size <= 512M.

On the PB 3400c, there was the hang-at-boot problem that went away in
v5.11 with CONFIG_VMAP_STACK off. However, it appears a new issue may
have been introduced between v5.11 and v5.12 that causes v5.12 to hang
at "Run /sbin/init as init process" when CONFIG_VMAP_STACK is off
(further complicating any possible bisect).
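
To make the bisect complication concrete, here is a rough Python sketch
that tabulates the PB 3400c results reported in this thread and
classifies each release for a bisect that hunts one symptom. The
"good"/"bad"/"skip" labels mirror the git bisect subcommands; the
function name and the rule that any other failure means "skip" are my
own assumptions, not an established procedure.

```python
# Results reported in this thread for the PB 3400c.
# Key: (kernel version, CONFIG_VMAP_STACK set?)
RESULTS = {
    ("v5.11", True): "hangs at boot",
    ("v5.11", False): "ok",
    ("v5.12", True): "hangs at boot",
    ("v5.12", False): "hangs at init",
    ("v5.13", True): "memory errors",
    ("v5.13", False): "memory errors",
}


def bisect_marks(results, vmap, target):
    """Classify each version for a bisect hunting `target` at a fixed VMAP setting."""
    marks = {}
    for (version, has_vmap), outcome in results.items():
        if has_vmap != vmap:
            continue
        if outcome == "ok":
            marks[version] = "good"
        elif outcome == target:
            marks[version] = "bad"
        else:
            # Fails for a different reason, so the target symptom
            # cannot be judged on this build.
            marks[version] = "skip"
    return marks


print(bisect_marks(RESULTS, vmap=False, target="memory errors"))
```

With CONFIG_VMAP_STACK off, v5.11 comes out as "good", v5.13 as "bad",
and v5.12 as "skip", which is the gap a bisect for the memory errors
would have to work around.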

> 
> Note that the error detected in the other thread which is being
> discussed with Finn might also be an issue to be checked while we are here.
> 

I'm not sure of the issue you are referencing. If it's the Wallstreet
issue, I believe we were waiting to hear back from you regarding the
memory errors that crop up with CONFIG_VMAP_STACK=y and mem >464M. Finn,
if that is not correct, please let me know.

thanks

-Stan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: config-v5.12-pmac_VMAP.xz
Type: application/octet-stream
Size: 17824 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20210803/db137705/attachment-0006.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: config-v5.12-pmac_NO_VMAP.xz
Type: application/octet-stream
Size: 17832 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20210803/db137705/attachment-0007.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: serial_log_v5.12-NO_VMAP.txt.xz
Type: application/octet-stream
Size: 4328 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20210803/db137705/attachment-0008.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: config-v5.11-pmac_VMAP.xz
Type: application/octet-stream
Size: 17804 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20210803/db137705/attachment-0009.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: config-v5.11-pmac_NO_VMAP.xz
Type: application/octet-stream
Size: 17828 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20210803/db137705/attachment-0010.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dmesg-5.11.0-pmac-NO_VMAP.txt.xz
Type: application/octet-stream
Size: 5736 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20210803/db137705/attachment-0011.obj>

