[Bug 213837] "Kernel panic - not syncing: corrupted stack end detected inside scheduler" at building via distcc on a G5

bugzilla-daemon at bugzilla.kernel.org
Fri Sep 24 02:29:32 AEST 2021


https://bugzilla.kernel.org/show_bug.cgi?id=213837

--- Comment #9 from Erhard F. (erhard_f at mailbox.org) ---
Created attachment 298933
  --> https://bugzilla.kernel.org/attachment.cgi?id=298933&action=edit
System.map (5.15-rc2 + patch, PowerMac G5 11,2)

(In reply to mpe from comment #8)
> So it looks like you have actually overrun your stack, rather than
> something else clobbering your stack.
> 
> Can you attach your System.map for that exact kernel? We might be able
> to work out what functions we were in when we overran.
> 
> You could also try changing CONFIG_THREAD_SHIFT to 15, that might keep
> the system running a bit longer and give us some other clues.
> 
> cheers
Hm, interesting...

To trigger this bug I build llvm-12 on the G5 via distcc (the remote side is a
16-core Opteron) with MAKEOPTS="-j10 -l3". As the G5 has 16 GiB RAM, the build
runs in a zstd-compressed ext2 filesystem on zram (/sbin/zram-init -d1 -s2
-azstd -text2 -orelatime -m1777 -Lvar_tmp_dir 49152 /var/tmp). Most of the time
the bug is triggered very shortly after the actual build starts via meson. At
that point the build directory /var/tmp/portage occupies about 800 MiB.

Also, sometimes I don't get a proper stack trace via netconsole, only this:
BUG: unable to handle kernel data access on write at 0xc000000037c82040
BUG: unable to handle kernel data access on write at 0xc000000037c80000
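
A rough sketch of the arithmetic behind CONFIG_THREAD_SHIFT, applied to the two
addresses above. It assumes THREAD_SIZE = 1 << CONFIG_THREAD_SHIFT, that kernel
stacks are aligned to THREAD_SIZE, and that the current setting is 14; none of
this is confirmed by the logs here, and the helper names are made up for
illustration. It does not show that these addresses are stack addresses.

```python
# Assumption: THREAD_SIZE = 1 << CONFIG_THREAD_SHIFT, stacks THREAD_SIZE-aligned.

def thread_size(shift):
    """Stack size in bytes for a given thread shift."""
    return 1 << shift

def aligned_base(addr, shift):
    """Round addr down to the start of its THREAD_SIZE-aligned region."""
    return addr & ~(thread_size(shift) - 1)

# The two faulting addresses reported via netconsole.
FAULT_ADDRS = (0xc000000037c82040, 0xc000000037c80000)

for shift in (14, 15):  # 14 is assumed to be the current value, 15 the suggested one
    size = thread_size(shift)
    for addr in FAULT_ADDRS:
        base = aligned_base(addr, shift)
        print(f"shift={shift} size={size // 1024} KiB "
              f"addr={addr:#x} base={base:#x} offset={addr - base:#x}")
```

Under these assumptions a shift of 14 gives 16 KiB and a shift of 15 gives
32 KiB, and for both values the second address is exactly the aligned base of
the region containing the first.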

Please find the relevant System.map attached. I'll do another kernel build with
CONFIG_THREAD_SHIFT=15 and see if anything changes.
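
For reference, a minimal sketch of how a code address from a trace can be
resolved against a System.map, assuming the usual three-column "address type
name" layout. The sample entries and function names below are hypothetical and
are not taken from the attached file.

```python
import bisect

def parse_system_map(text):
    """Return a sorted list of (address, name) from System.map-style text."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip anything that is not "address type name"
        addr, _sym_type, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols

def lookup(symbols, addr):
    """Return (name, offset) of the nearest symbol at or below addr, or None."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    base, name = symbols[i]
    return name, addr - base

# Hypothetical sample data, for illustration only.
SAMPLE = """\
c000000000001000 T hypothetical_func_a
c000000000001200 t hypothetical_func_b
c000000000001480 T hypothetical_func_c
"""

symbols = parse_system_map(SAMPLE)
print(lookup(symbols, 0xc000000000001234))
```

With the real file, the text would come from something like
open("System.map").read() instead of SAMPLE.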

Thanks for investigating this!
